Static code analyzer for any repository — classify codebase, extract HTTP routes, detect tech stack, map dependency graph. Multi-language (Python, JS, Java, Go, Ruby, PHP). Zero dependencies. Optional LLM enrichment.

app-classifier — instantly understand any codebase

A Python library and CLI tool that reads a source repository and reports what it is, what it does, what stack it runs on, and which files depend on which. Point app-classifier at a GitHub repo, a local folder, or any unfamiliar codebase and get back structured answers in under a second, with no LLM required.

Built for developers, code reviewers, AI coding assistants, security auditors, and engineering teams onboarding into legacy systems.

from app_classifier import classify_smart
result = classify_smart("./my-repo")
print(result.app_category, result.app_category_confidence)
print(result.functional_description)

Example output:

e-commerce 0.95
ShopMax is an e-commerce application. Primary functionality: online shopping, internal admin.
The app routes traffic across 12 HTTP endpoints, including authentication and checkout.

What it does

Classifies any repository into one of 32 application categories — e-commerce, blog, admin panel, REST API, AI / LLM application, FinTech, healthcare, marketplace, CMS, DevOps tooling, developer tooling / CLI, crypto / Web3, streaming, ML pipeline, mobile, IoT, and 16 more.
Extracts HTTP routes from Flask, FastAPI, Django, Express, Spring, Struts, and other web frameworks.
Identifies the tech stack — language, runtime version, framework, databases, caches, message queues, deployment target.
Maps the codebase — file-level dependency graph and Python function-call graph with cycle-safe impact analysis ("if I change this file, what breaks?").
Detects data models — JPA, SQLAlchemy, Django ORM, Mongoose, Doctrine, Entity Framework.
Analyzes hosting requirements — Dockerfile, ports, env vars, web-server CVEs.
Multi-language coverage — Python, JavaScript, TypeScript, Java, Go, Ruby, PHP, Rust, .NET, Elixir, Dart, Kotlin, Swift.

Why people use it

Codebase onboarding — Inherit a 200-file repo and need to know what it does in 30 seconds.
AI coding assistant context — Pre-brief Cursor, Claude Code, Continue, GitHub Copilot, or Aider before they touch an unknown repo. Cleaner prompts produce better code.
Code review automation — Auto-summarize unfamiliar PRs and surface the blast radius of a change.
Tech stack detection — Build an internal inventory of every service in your org.
Security scanning — Identify deployment surface area, dependency graph entry points, and high-risk handlers.
Documentation generation — Bootstrap a README for a project that doesn't have one.
OSS discovery — Programmatically search for and categorize repositories matching a pattern.

Why it wins

Concern	app-classifier	Manual review	ChatGPT paste	GitHub Copilot Chat
Deterministic baseline	✅ pattern-based fingerprints	—	❌	❌
Works offline	✅ no network needed	✅	❌	❌
Zero runtime deps	✅ stdlib only	✅	n/a	n/a
Multi-language	✅ Python, JS/TS, Java, Go, Ruby, PHP, others	⚠️	⚠️	⚠️
Structured output	✅ dataclasses, JSON-serializable	❌	⚠️ free text	⚠️ free text
Programmatic API	✅ `classify()` / `classify_smart()` / `classify_agentic()` / `map_code()`	❌	❌	❌
LLM-pluggable	✅ 5 adapters (OpenAI / Anthropic / OpenRouter / Ollama / OpenAI-compat)	—	bound to one	bound to one
Auditable	✅ confidence scores + step-by-step trail	❌	❌	❌

If you've ever inherited a 200-file repo and spent an afternoon working out "wait, what does this thing actually do?" — that's the problem this solves.

Installation

pip install app-classifier

Zero runtime dependencies. The LLM adapters use stdlib urllib. If you'd rather use the official SDK clients:

pip install "app-classifier[openai]"      # adds the openai SDK
pip install "app-classifier[anthropic]"   # adds the anthropic SDK
pip install "app-classifier[all]"         # both SDKs (quotes keep zsh from expanding the brackets)

Requires Python 3.10+.

Quick start

1. Pure rule-based (no network, no deps, no setup)

from app_classifier import classify
result = classify("./my-repo")
print(result.app_category, result.app_category_confidence)
print(result.functional_description)
print([r.path for r in result.routes[:5]])

2. Smart — rule-based first, LLM-escalated when uncertain

import os
os.environ["ANTHROPIC_API_KEY"] = "sk-ant-..."  # placeholder; in practice, export the key in your shell
from app_classifier import classify_smart
result = classify_smart("./my-repo")  # auto-detects provider from env
print(result.app_category, result.app_category_confidence)

High-confidence repos (≥0.75) return immediately with no LLM call. Only repos scoring below that threshold are escalated to the LLM tool-loop (classify_agentic).
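The rule-first, escalate-when-unsure flow described above can be sketched roughly as follows. This is an illustration only, not the library's implementation: `Verdict`, `classify_with_escalation`, `fake_rules`, and `fake_llm` are hypothetical names, and only the 0.75 threshold comes from this README.

```python
from dataclasses import dataclass
from typing import Callable

CONFIDENCE_THRESHOLD = 0.75  # threshold quoted in this README


@dataclass
class Verdict:
    """Hypothetical stand-in for the library's result object."""
    app_category: str
    app_category_confidence: float
    source: str  # "rules" or "llm"


def classify_with_escalation(
    repo: str,
    rule_based: Callable[[str], Verdict],
    llm_based: Callable[[str], Verdict],
    threshold: float = CONFIDENCE_THRESHOLD,
) -> Verdict:
    """Run the cheap deterministic pass first; escalate only when unsure."""
    verdict = rule_based(repo)
    if verdict.app_category_confidence >= threshold:
        return verdict  # confident enough, so no LLM call is made
    return llm_based(repo)


# Stub classifiers so the sketch runs without a repository or an API key.
def fake_rules(repo: str) -> Verdict:
    confidence = 0.95 if "shop" in repo else 0.40
    return Verdict("e-commerce", confidence, "rules")


def fake_llm(repo: str) -> Verdict:
    return Verdict("developer tooling / CLI", 0.80, "llm")


print(classify_with_escalation("./shop-repo", fake_rules, fake_llm).source)
print(classify_with_escalation("./mystery-repo", fake_rules, fake_llm).source)
```

The first call stays on the rule-based path; the second falls below the threshold and is handed to the stubbed LLM path.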

3. Code mapping for impact analysis (no LLM)

from app_classifier import map_code
cm = map_code("./my-repo")
print("Entry points:", cm.entry_points)
print("If I change src/auth.py:", cm.impact_of("src/auth.py"))

impact_of() runs a breadth-first search (BFS) over the reverse dependency graph with cycle detection. It works at the file level across Python, JS/TS, Java, Go, Ruby, and PHP; Python additionally gets function-level resolution.
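For readers who want to see the idea behind this kind of impact analysis, here is a minimal, self-contained sketch of BFS over a reverse dependency graph. It is not the library's code: the `deps` dictionary, the file names, and the helper `build_reverse_graph` are made up for illustration, and a visited set is assumed as the cycle guard.

```python
from collections import deque


def build_reverse_graph(deps: dict[str, list[str]]) -> dict[str, set[str]]:
    """Invert 'file -> files it imports' into 'file -> files that import it'."""
    reverse: dict[str, set[str]] = {}
    for source, targets in deps.items():
        reverse.setdefault(source, set())
        for target in targets:
            reverse.setdefault(target, set()).add(source)
    return reverse


def impact_of(target: str, deps: dict[str, list[str]]) -> list[str]:
    """Return every file that directly or transitively depends on target."""
    reverse = build_reverse_graph(deps)
    seen = {target}  # visited set: this is what keeps cycles from looping forever
    queue = deque([target])
    impacted: list[str] = []
    while queue:
        current = queue.popleft()
        for dependant in sorted(reverse.get(current, ())):
            if dependant not in seen:
                seen.add(dependant)
                impacted.append(dependant)
                queue.append(dependant)
    return impacted


# Hypothetical repo: views.py and auth.py import each other (a cycle).
deps = {
    "app.py": ["views.py"],
    "views.py": ["auth.py"],
    "auth.py": ["views.py", "db.py"],
    "db.py": [],
}
print(impact_of("auth.py", deps))  # ['views.py', 'app.py']
print(impact_of("db.py", deps))    # ['auth.py', 'views.py', 'app.py']
```

Results come back in BFS order, so direct dependants appear before transitive ones.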

Configuration

Option A: environment variables (simplest)

export APP_CLASSIFIER_LLM_PROVIDER=anthropic     # picks the provider
export ANTHROPIC_API_KEY=sk-ant-...               # per-provider key

classify_smart() picks these variables up automatically. The provider name must be one of: openai, anthropic, openrouter, ollama, openai_compat.
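One plausible way such environment-based resolution could work is sketched below. The precedence (explicit variable first, then the first API key found) and the variable names `OPENAI_API_KEY` and `OPENROUTER_API_KEY` are assumptions; only `APP_CLASSIFIER_LLM_PROVIDER`, `ANTHROPIC_API_KEY`, and the five provider names appear in this README. `resolve_provider_name` is a hypothetical helper, not part of the package.

```python
import os
from typing import Mapping, Optional

VALID_PROVIDERS = ("openai", "anthropic", "openrouter", "ollama", "openai_compat")

# Assumed key variable names; only ANTHROPIC_API_KEY is shown in this README.
KEY_VARS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
    "openrouter": "OPENROUTER_API_KEY",
}


def resolve_provider_name(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return a provider name from the environment, or None if nothing is set."""
    explicit = env.get("APP_CLASSIFIER_LLM_PROVIDER", "").strip().lower()
    if explicit:
        if explicit not in VALID_PROVIDERS:
            raise ValueError(f"unknown provider: {explicit!r}")
        return explicit
    # No explicit choice: fall back to the first provider whose key is present.
    for name, key_var in KEY_VARS.items():
        if env.get(key_var):
            return name
    return None


print(resolve_provider_name({"APP_CLASSIFIER_LLM_PROVIDER": "anthropic"}))
print(resolve_provider_name({"OPENAI_API_KEY": "sk-example"}))
print(resolve_provider_name({}))
```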

Option B: `~/.app-classifier/providers.json`

{
  "default": "anthropic",
  "providers": {
    "anthropic": {
      "type": "anthropic",
      "api_key": "${ANTHROPIC_API_KEY}",
      "model": "claude-haiku-4-5-20251001"
    },
    "local": {
      "type": "openai_compat",
      "base_url": "http://localhost:1234/v1",
      "model": "llama-3.2-3b"
    }
  }
}

${VAR} placeholders are interpolated from the environment at load time.
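As a rough illustration of load-time interpolation, the sketch below walks a parsed JSON config and substitutes `${VAR}` placeholders. The helper `interpolate` is hypothetical, and leaving unknown variables untouched is an assumption; the library may instead raise an error or substitute an empty string.

```python
import json
import os
import re
from typing import Any, Mapping

_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def interpolate(value: Any, env: Mapping[str, str] = os.environ) -> Any:
    """Recursively replace ${VAR} in strings; unknown variables are left as-is."""
    if isinstance(value, str):
        return _PLACEHOLDER.sub(lambda m: env.get(m.group(1), m.group(0)), value)
    if isinstance(value, dict):
        return {key: interpolate(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [interpolate(item, env) for item in value]
    return value  # numbers, booleans, and None pass through unchanged


raw = '{"providers": {"anthropic": {"api_key": "${ANTHROPIC_API_KEY}"}}}'
config = interpolate(json.loads(raw), {"ANTHROPIC_API_KEY": "sk-ant-example"})
print(config["providers"]["anthropic"]["api_key"])  # sk-ant-example
```

Keeping the key out of the JSON file this way means providers.json can be shared or committed without leaking credentials.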

Option C: explicit instantiation

from app_classifier import classify_smart, OpenAIProvider
provider = OpenAIProvider(api_key="sk-...", model="gpt-4o-mini")
result = classify_smart("./my-repo", llm_provider=provider)

See the per-provider guide for full quickstarts (Groq, LM Studio, Ollama, vLLM, llama.cpp, Together, Fireworks).

API reference

Symbol	Module	Purpose
`classify(repo)`	`app_classifier`	Rule-based, deterministic, no network
`classify_smart(repo)`	`app_classifier`	Rule-first; LLM-escalated for low-confidence
`classify_smart_async(repo)`	`app_classifier`	Async variant; returns full audit trail
`classify_agentic(repo, llm_provider)`	`app_classifier`	LLM tool-loop (low level — `classify_smart` wraps this)
`map_code(repo)`	`app_classifier`	Build a `CodeMap` for impact analysis
`CodeMap.impact_of(target)`	`app_classifier`	BFS over reverse-dep graph
`OpenAIProvider(api_key, model, base_url=None)`	`app_classifier`	OpenAI + any OpenAI-compatible endpoint
`AnthropicProvider(api_key, model)`	`app_classifier`	Anthropic Messages API
`OpenRouterProvider(api_key, model)`	`app_classifier`	OpenRouter (auto-routed across providers)
`OllamaProvider(host, model)`	`app_classifier`	Local Ollama serving
`OpenAICompatProvider(base_url, model, api_key=None)`	`app_classifier`	LM Studio / vLLM / llama.cpp / Groq / Together
`load_provider(name=None)`	`app_classifier`	Resolve provider from env + config
`analyze_hosting_requirements(repo)`	`app_classifier`	Runtime / DB / port / env-var detection

Full dataclass shapes: AppDescription, RouteEntry, DataModel, CodeMap, FileNode, FunctionNode, AgentClassificationResult, AgentStep, SubappClassification, HostingReport, Signal.
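Because results are described as JSON-serializable dataclasses, something like the following should apply. The two classes here are simplified stand-ins rather than the real shapes: only `app_category`, `app_category_confidence`, `routes`, and `RouteEntry.path` appear in the examples in this README, and the `method` field is an assumption.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class RouteEntry:  # stand-in; only `path` is confirmed by the quick start
    path: str
    method: str = "GET"  # assumed field, for illustration


@dataclass
class AppDescription:  # stand-in with the attributes used in the quick start
    app_category: str
    app_category_confidence: float
    routes: list[RouteEntry] = field(default_factory=list)


result = AppDescription("e-commerce", 0.95, [RouteEntry("/checkout", "POST")])

# asdict() recurses into nested dataclasses, so the whole tree becomes plain
# dicts and lists that json.dumps can handle directly.
payload = json.dumps(asdict(result), indent=2)
print(payload)
```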

CLI

app-classifier ./my-repo                   # human-readable summary
app-classifier ./my-repo --json            # JSON for piping
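If you would rather drive the CLI from a script than import the library, a thin wrapper along these lines may help. `classify_via_cli` and `fake_runner` are hypothetical names; the sketch assumes `--json` prints a single JSON object to stdout and that failures produce a non-zero exit code. The fake runner stands in for the real command so the example runs anywhere.

```python
import json
import subprocess
from typing import Any, Callable


def classify_via_cli(repo: str, runner: Callable[..., Any] = subprocess.run) -> dict:
    """Invoke `app-classifier <repo> --json` and parse its stdout."""
    completed = runner(
        ["app-classifier", repo, "--json"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(completed.stdout)


# Stand-in for the real command, so this sketch runs without the CLI installed.
def fake_runner(cmd, **kwargs):
    return subprocess.CompletedProcess(cmd, 0, stdout='{"app_category": "blog"}', stderr="")


print(classify_via_cli("./my-repo", runner=fake_runner))
```

Drop the `runner=` argument to call the installed CLI for real.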

What it can't do (yet)

Function-call graphs for Java / JS / Go (file-level dependencies only for now; targeted for v0.6.0+)
Streaming completion (provider layer always waits for the full response)
Cost / token caps per call
Tree-sitter-backed parsing (everything is stdlib + regex right now — fast, but imperfect)
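To make the "fast, but imperfect" trade-off concrete, here is a toy regex-based extractor for Flask-style route decorators. It is a sketch of the general approach, not the patterns the library actually ships; `ROUTE_RE`, `METHODS_RE`, and `extract_flask_routes` are illustrative names.

```python
import re

# Matches decorators such as @app.route("/login", methods=["GET", "POST"])
ROUTE_RE = re.compile(
    r"""@\w+\.route\(\s*       # decorator on any object, e.g. app or a blueprint
        (["'])(?P<path>.+?)\1  # quoted URL rule
        (?P<rest>[^)]*)\)      # remaining arguments up to the closing paren
    """,
    re.VERBOSE,
)
METHODS_RE = re.compile(r"methods\s*=\s*\[([^\]]*)\]")


def extract_flask_routes(source: str) -> list[tuple[str, list[str]]]:
    """Return (path, methods) pairs found by pattern matching alone."""
    routes = []
    for match in ROUTE_RE.finditer(source):
        methods_match = METHODS_RE.search(match.group("rest"))
        if methods_match:
            methods = re.findall(r"[A-Za-z]+", methods_match.group(1))
        else:
            methods = ["GET"]  # Flask's default when methods= is omitted
        routes.append((match.group("path"), [m.upper() for m in methods]))
    return routes


sample = '''
@app.route("/")
def index(): ...

@app.route('/login', methods=["GET", "POST"])
def login(): ...

# @app.route("/disabled")   <- commented out, but the regex still matches it
'''
print(extract_flask_routes(sample))
```

The commented-out `/disabled` route shows up in the result, which is exactly the kind of false positive a real parser would avoid.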

See the CHANGELOG for what landed in each release. The code-mapping guide has impact-analysis recipes.

License

MIT. Contributions welcome.

Release history

Version	Release date
0.5.3 (this version)	May 25, 2026
0.5.2	May 24, 2026
0.5.1	May 22, 2026
0.5.0	May 22, 2026
0.2.0	May 21, 2026
0.1.0	May 21, 2026

Download files

Source Distribution

app_classifier-0.5.3.tar.gz (69.1 kB)

Uploaded May 25, 2026 Source

Built Distribution

app_classifier-0.5.3-py3-none-any.whl (60.4 kB)

Uploaded May 25, 2026 Python 3

File details

Details for the file app_classifier-0.5.3.tar.gz.

File metadata

Download URL: app_classifier-0.5.3.tar.gz
Upload date: May 25, 2026
Size: 69.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for app_classifier-0.5.3.tar.gz
Algorithm	Hash digest
SHA256	`d1333c9237e2cab5daac85b8322d15f60e190d161125ca51b487f52b438cbc1c`
MD5	`49a4ebad7ba66a555059cfb2b4634b75`
BLAKE2b-256	`5c76cab362f5470dcf150a3516bfa443582c2f190d0a1edec5243d842a0fa26a`

File details

Details for the file app_classifier-0.5.3-py3-none-any.whl.

File metadata

Download URL: app_classifier-0.5.3-py3-none-any.whl
Upload date: May 25, 2026
Size: 60.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for app_classifier-0.5.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`37a0d0ca4384627b9ef956f1842b45a8033f81b778517b0c385bc9c3e419f4a5`
MD5	`4d0783a56449e4825e60f934c30260c5`
BLAKE2b-256	`4753aa8e15e29e083d23108f09bd6aec273c5c639e66a0098872d72e402308ab`
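
A downloaded file can be checked against the SHA256 digests listed above with the standard library alone. The helpers `sha256_of` and `verify` are illustrative names, and the demo hashes a small temporary file rather than the real distribution so that it runs anywhere.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 16) -> str:
    """Hash a file in chunks so large archives do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()


# Demo on a throwaway file; for a real check, pass the downloaded .whl or
# .tar.gz path together with the SHA256 value from the table.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo.bin"
    demo.write_bytes(b"hello")
    print(verify(demo, hashlib.sha256(b"hello").hexdigest()))  # True
    print(verify(demo, "0" * 64))                              # False
```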
