Point it at a repo and get back "this is an e-commerce app that does X": pattern-based inference of an application's functional category from its routes, data models, and README.

app-classifier

Point it at a repo, get back what it is and how it works — deterministically, with optional LLM lift.

from app_classifier import classify_smart
result = classify_smart("./my-repo")
print(result.app_category, result.app_category_confidence)
print(result.functional_description)

Output:

e-commerce 0.95
ShopMax is an e-commerce application. Primary functionality: online shopping, internal admin.
The app routes traffic across 12 HTTP endpoints, including authentication and checkout.

Why it wins

Concern app-classifier Manual review ChatGPT paste GitHub Copilot Chat
Deterministic baseline ✅ pattern-based fingerprints
Works offline ✅ no network needed
Zero runtime deps ✅ stdlib only n/a n/a
Multi-language ✅ Python, JS/TS, Java, Go, Ruby, PHP, others ⚠️ ⚠️ ⚠️
Structured output ✅ dataclasses, JSON-serializable ⚠️ free text ⚠️ free text
Programmatic API classify() / classify_smart() / classify_agentic() / map_code()
LLM-pluggable ✅ 5 adapters (OpenAI / Anthropic / OpenRouter / Ollama / OpenAI-compat) bound to one bound to one
Auditable ✅ confidence scores + step-by-step trail

If you've ever inherited a 200-file repo and spent an afternoon working out "wait, what does this thing actually do?" — that's the problem this solves.


Installation

pip install app-classifier

Zero runtime dependencies. The LLM adapters use stdlib urllib. If you'd rather use the official SDK clients:

pip install app-classifier[openai]        # adds openai SDK
pip install app-classifier[anthropic]     # adds anthropic SDK
pip install app-classifier[all]           # both

Requires Python 3.10+.


Quick start

1. Pure rule-based (no network, no deps, no setup)

from app_classifier import classify
result = classify("./my-repo")
print(result.app_category, result.app_category_confidence)
print(result.functional_description)
print([r.path for r in result.routes[:5]])
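
To give a feel for what "pattern-based" can mean here, the following is a minimal sketch of fingerprint scoring over route paths. It is not app-classifier's actual rule set: the FINGERPRINTS table, the score_routes() helper, and the way confidence is computed (share of routes that match) are all assumptions made for illustration.

```python
import re

# Hypothetical fingerprints: category -> regex patterns matched against route paths.
# These are illustrative only and are not app-classifier's real rules.
FINGERPRINTS = {
    "e-commerce": [r"/cart", r"/checkout", r"/products?"],
    "blog": [r"/posts?", r"/comments?", r"/tags?"],
}


def score_routes(routes):
    """Return (best_category, confidence) from the share of routes each category matches."""
    if not routes:
        return None, 0.0
    scores = {}
    for category, patterns in FINGERPRINTS.items():
        hits = sum(1 for route in routes if any(re.search(p, route) for p in patterns))
        scores[category] = hits / len(routes)
    best = max(scores, key=scores.get)
    return best, scores[best]


category, confidence = score_routes(["/cart", "/checkout", "/products/42", "/login"])
print(category, confidence)  # e-commerce 0.75
```

Because the scoring uses only regex matches and arithmetic, the same routes always produce the same category and confidence, which is the property a deterministic baseline needs.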

2. Smart — rule-based first, LLM-escalated when uncertain

import os
os.environ["ANTHROPIC_API_KEY"] = "sk-ant-..."
from app_classifier import classify_smart
result = classify_smart("./my-repo")  # auto-detects provider from env
print(result.app_category, result.app_category_confidence)

Repos that the rule-based pass classifies with confidence ≥0.75 return immediately with no LLM call. Only lower-confidence, ambiguous repos are escalated to the agentic LLM loop.
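
The escalation rule can be sketched roughly as follows. This is an illustration of the rule-first idea, not the library's implementation: the Result dataclass and classify_with_escalation() are hypothetical names, and only the 0.75 threshold and the two result attributes come from this README.

```python
from dataclasses import dataclass
from typing import Callable

CONFIDENCE_THRESHOLD = 0.75  # threshold stated in this README


@dataclass
class Result:
    # Hypothetical stand-in; field names mirror the attributes used in the examples.
    app_category: str
    app_category_confidence: float


def classify_with_escalation(
    repo: str,
    rule_based: Callable[[str], Result],
    llm_based: Callable[[str], Result],
) -> Result:
    """Run the cheap rule-based pass first; call the LLM path only if it is unsure."""
    baseline = rule_based(repo)
    if baseline.app_category_confidence >= CONFIDENCE_THRESHOLD:
        return baseline  # confident enough: no LLM call is made
    return llm_based(repo)


# Stub classifiers standing in for the real rule-based and LLM passes.
confident = classify_with_escalation(
    "./my-repo",
    rule_based=lambda repo: Result("e-commerce", 0.95),
    llm_based=lambda repo: Result("unused", 1.0),
)
print(confident.app_category)  # e-commerce
```

Passing the two classifiers in as callables keeps the sketch testable without a network connection or an API key.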

3. Code mapping for impact analysis (no LLM)

from app_classifier import map_code
cm = map_code("./my-repo")
print("Entry points:", cm.entry_points)
print("If I change src/auth.py:", cm.impact_of("src/auth.py"))

impact_of() runs a breadth-first search (BFS) over the reverse dependency graph, with cycle detection. It works at the file level across Python, JS/TS, Java, Go, Ruby, and PHP; Python also gets function-level resolution.
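
For readers who want the idea without the library, here is a minimal standalone sketch of BFS over a reverse dependency graph. It is an assumption-laden illustration, not CodeMap's code: the free function impact_of() below and the depends_on input shape (file mapped to the files it imports) are hypothetical, and cycles are handled with a plain visited set.

```python
from collections import deque


def impact_of(target, depends_on):
    """Return every file that directly or transitively depends on `target`.

    `depends_on` maps a file to the files it imports (hypothetical input shape).
    """
    # Invert the edges: for each file, record which files import it.
    dependents = {}
    for source, imports in depends_on.items():
        for imported in imports:
            dependents.setdefault(imported, set()).add(source)

    seen = {target}  # the visited set stops the walk from looping on import cycles
    queue = deque([target])
    impacted = []
    while queue:
        current = queue.popleft()
        for dependent in sorted(dependents.get(current, ())):
            if dependent not in seen:
                seen.add(dependent)
                impacted.append(dependent)
                queue.append(dependent)
    return impacted


graph = {
    "src/api.py": ["src/auth.py"],
    "src/auth.py": ["src/db.py"],
    "src/db.py": ["src/auth.py"],  # import cycle between auth and db
    "src/cli.py": ["src/api.py"],
}
print(impact_of("src/auth.py", graph))
# ['src/api.py', 'src/db.py', 'src/cli.py']
```

Direct dependents come first and transitive ones later, so the result reads in order of distance from the changed file.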


Configuration

Option A: environment variables (simplest)

export APP_CLASSIFIER_LLM_PROVIDER=anthropic     # picks the provider
export ANTHROPIC_API_KEY=sk-ant-...               # per-provider key

classify_smart() detects these variables automatically. The provider name must be one of: openai, anthropic, openrouter, ollama, openai_compat.

Option B: ~/.app-classifier/providers.json

{
  "default": "anthropic",
  "providers": {
    "anthropic": {
      "type": "anthropic",
      "api_key": "${ANTHROPIC_API_KEY}",
      "model": "claude-haiku-4-5-20251001"
    },
    "local": {
      "type": "openai_compat",
      "base_url": "http://localhost:1234/v1",
      "model": "llama-3.2-3b"
    }
  }
}

${VAR} placeholders are interpolated from the environment at load time.
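
A rough sketch of how ${VAR} interpolation over a parsed config might work is shown below. It is not the library's loader: interpolate() and PLACEHOLDER are hypothetical names, and leaving unknown variables untouched is an assumption rather than documented behavior.

```python
import json
import re

# Matches ${NAME} where NAME looks like an environment variable name.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def interpolate(value, env):
    """Recursively replace ${VAR} in every string found in nested dicts and lists."""
    if isinstance(value, str):
        # Unknown variables are left as-is (an assumption, not documented behavior).
        return PLACEHOLDER.sub(lambda m: env.get(m.group(1), m.group(0)), value)
    if isinstance(value, dict):
        return {key: interpolate(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [interpolate(item, env) for item in value]
    return value


raw = '{"providers": {"anthropic": {"api_key": "${ANTHROPIC_API_KEY}"}}}'
config = interpolate(json.loads(raw), {"ANTHROPIC_API_KEY": "sk-ant-example"})
print(config["providers"]["anthropic"]["api_key"])  # sk-ant-example
```

In real use the second argument would be os.environ; a plain dict is used here so the example runs without touching the environment.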

Option C: explicit instantiation

from app_classifier import classify_smart, OpenAIProvider
provider = OpenAIProvider(api_key="sk-...", model="gpt-4o-mini")
result = classify_smart("./my-repo", llm_provider=provider)

See the per-provider guide for full quickstarts (Groq, LM Studio, Ollama, vLLM, llama.cpp, Together, Fireworks).


API reference

Symbol | Module | Purpose
classify(repo) | app_classifier | Rule-based, deterministic, no network
classify_smart(repo) | app_classifier | Rule-first; LLM-escalated for low-confidence
classify_smart_async(repo) | app_classifier | Async variant; returns full audit trail
classify_agentic(repo, llm_provider) | app_classifier | LLM tool-loop (low level; classify_smart wraps this)
map_code(repo) | app_classifier | Build a CodeMap for impact analysis
CodeMap.impact_of(target) | app_classifier | BFS over reverse-dep graph
OpenAIProvider(api_key, model, base_url=None) | app_classifier | OpenAI + any OpenAI-compatible endpoint
AnthropicProvider(api_key, model) | app_classifier | Anthropic Messages API
OpenRouterProvider(api_key, model) | app_classifier | OpenRouter (auto-routed across providers)
OllamaProvider(host, model) | app_classifier | Local Ollama serving
OpenAICompatProvider(base_url, model, api_key=None) | app_classifier | LM Studio / vLLM / llama.cpp / Groq / Together
load_provider(name=None) | app_classifier | Resolve provider from env + config
analyze_hosting_requirements(repo) | app_classifier | Runtime / DB / port / env-var detection

Full dataclass shapes: AppDescription, RouteEntry, DataModel, CodeMap, FileNode, FunctionNode, AgentClassificationResult, AgentStep, SubappClassification, HostingReport, Signal.


CLI

app-classifier ./my-repo                   # human-readable summary
app-classifier ./my-repo --json            # JSON for piping

What it can't do (yet)

  • Function-call graphs for Java / JS / Go (file-level dependencies only for now; v0.6.0+)
  • Streaming completion (provider layer always waits for the full response)
  • Cost / token caps per call
  • Tree-sitter-backed parsing (everything is stdlib + regex right now — fast, but imperfect)

See the CHANGELOG for what landed in each release. The code-mapping guide has impact-analysis recipes.


License

MIT. Contributions welcome.
