Static analysis tool that maps Python codebases into navigable flows

CARTOGRAPH

Scan any Python codebase. See every flow. Pipe to Claude.

Cartograph is a static analysis tool that maps codebases into navigable flows — entry points, call chains, conditional branches, cross-file dependencies. It discovers entry points from graph topology (not hardcoded decorators), builds a type-aware call graph, and outputs structured context that reduces LLM token usage by 100-280x.

pip install cartograph-code

carto scan ./your-project             # scan once, cached after
carto entries                         # list all entry points
carto trace "checkout"                # trace a call tree
carto context | claude                # pipe to any LLM
carto context "deploy" | claude       # scoped flow context

Real Output — Parsing Open Source Projects

Sentry (Django + Celery — 40K stars, custom framework abstractions)

$ cartograph summary ./sentry/src

Modules:        4,415
Functions:      30,926
Entry points:   788 (52 via detectors + 736 topology-discovered)
Resolved calls: 37,659

Sentry uses custom decorators (@instrumented_task, @cell_silo_endpoint) that no static analyzer knows about. Cartograph's topology-based discovery finds them anyway — 788 entry points without a single line of Sentry-specific code.

Polar (FastAPI — billing/subscriptions platform)

$ cartograph summary ./polar/server

Modules:        914
Functions:      6,350
Entry points:   600
Resolved calls: 6,327

$ cartograph trace ./polar/server "get_claim_info" --depth 2

Found: polar.customer_seat.endpoints.get_claim_info
File: polar/customer_seat/endpoints.py:323
Outgoing calls: 7 (7 cross-file, 0 async)

get_claim_info polar/customer_seat/endpoints.py:323
├── → SeatService.get_seat_by_token polar/customer_seat/service.py
│   ├── → CustomerSeatRepository.get_by_invitation_token polar/customer_seat/repository.py
│   ├── → CustomerSeatRepository.get_eager_options polar/customer_seat/repository.py
│   ├── ├─ if not seat or seat.is_revoked() or seat.is_claimed()
│   └── ├─ if seat.invitation_token_expires_at and seat.invitation_token_e...
├── → SeatService.check_seat_feature_enabled polar/customer_seat/service.py
│   ├── → OrganizationRepository.get_by_id polar/organization/repository.py
│   ├── → FeatureNotEnabled
│   ├── ├─ if not organization
│   │   └── → FeatureNotEnabled
│   └── ├─ if not organization.feature_settings.get('seat_based_pricing_en...
│       └── → FeatureNotEnabled
├── → OrganizationRepository.get_by_id polar/organization/repository.py
├── → SeatClaimInfo, → ResourceNotFound (×3, conditional)
├── ├─ if not seat → ResourceNotFound
├── ├─ else → ResourceNotFound (no subscription/order)
├── ├─ if not organization → ResourceNotFound
└── ├─ if seat.email / elif seat.member / elif seat.customer

Reachable: 6 functions across 5 files

Dagster (orchestration framework — 12K stars, zero framework detectors)

$ cartograph summary ./dagster/python_modules/dagster/dagster

Modules:        790
Functions:      11,533
Entry points:   255 (all topology-discovered: @public, @schedule_cli.command, @job_cli.command...)
Resolved calls: 6,919

Zero Dagster-specific code in Cartograph. Every entry point found via graph topology.

Prefect (orchestration framework — 20K stars)

$ cartograph summary ./prefect/src/prefect

Modules:        690
Functions:      6,280
Entry points:   396 (183 FastAPI routes + 213 topology-discovered)
Resolved calls: 2,821

How Entry Point Discovery Works

Cartograph discovers entry points two ways:

Framework detectors: recognize @app.get, @shared_task, @receiver, etc. and produce rich labels ("GET /api/users", "Celery task: send_email").
Topology discovery: after the call graph is built, find functions that have zero incoming edges, at least one outgoing call, and a decorator. These are functions the framework calls but no project code calls. This works on any framework without configuration.

Framework detectors are optional enrichment. Topology does the heavy lifting.
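
The topology rule can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Cartograph's actual implementation: `FunctionNode`, `discover_entry_points`, and the sample function names are hypothetical, and the graph is reduced to a flat list of (caller, callee) pairs.

```python
from dataclasses import dataclass, field


@dataclass
class FunctionNode:
    """One function in the call graph (hypothetical structure)."""
    qname: str
    decorators: list = field(default_factory=list)


def discover_entry_points(nodes, edges):
    """Return functions with no callers, at least one callee, and a decorator.

    nodes: iterable of FunctionNode
    edges: iterable of (caller_qname, callee_qname) pairs
    """
    edges = list(edges)
    has_incoming = {callee for _, callee in edges}
    has_outgoing = {caller for caller, _ in edges}
    return sorted(
        n.qname
        for n in nodes
        if n.qname not in has_incoming
        and n.qname in has_outgoing
        and n.decorators
    )


nodes = [
    FunctionNode("tasks.send_email", ["instrumented_task"]),
    FunctionNode("mail.render"),
    FunctionNode("mail.deliver"),
    FunctionNode("utils.unused_helper"),  # no callers, but no calls or decorator
]
edges = [
    ("tasks.send_email", "mail.render"),
    ("tasks.send_email", "mail.deliver"),
]
print(discover_entry_points(nodes, edges))  # ['tasks.send_email']
```

Note that the rule never inspects what the decorator is called, which is why a custom decorator needs no dedicated detector in this sketch.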

Symbols

→ = function call | ⚡ = async boundary (Celery) | ├─ = conditional branch | ↻ = cycle detected

The Problem

You open a new codebase. 200 files. Where do you start?

The file tree tells you nothing. Which functions matter? In what order do they execute? What triggers them?

You Cmd+Click through function calls. You grep. You read 10 files to understand 1 flow. Three days later you have a partial mental model that's already outdated.

Code structure ≠ code story. Files and classes are the architecture of the code. Flows and stories are the architecture of the system. CARTOGRAPH bridges the gap.

What It Does

Scans any codebase and discovers entry points — via framework detectors (FastAPI, Flask, Django, Celery) AND framework-agnostic topology discovery (works on any framework without configuration)
Builds a global call graph with cross-file import resolution, parameter type inference, factory classmethod tracking, MRO walking, and return type inference
Traces code flows from any function as a DAG with branch detection and async boundary marking
Renders interactive DAGs in the browser — click, expand, search, zoom across your entire codebase
Exports to JSON for downstream consumption (VS Code extension, CI pipelines)
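
As a rough illustration of flow tracing with a depth limit and cycle marking, the sketch below walks a call graph stored as a plain dict. The `trace` function, the graph shape, and the sample function names are assumptions made for this example; Cartograph's real tracer also handles branches and async boundaries, which are omitted here.

```python
def trace(graph, root, max_depth=3):
    """Return indented lines for the call tree rooted at `root`.

    graph: dict mapping a function name to the list of functions it calls.
    A callee already on the current path is marked with ↻ and not expanded.
    """
    lines = [root]

    def walk(name, depth, path):
        if depth >= max_depth:
            return
        for callee in graph.get(name, []):
            indent = "    " * (depth + 1)
            if callee in path:
                lines.append(f"{indent}↻ {callee}")
                continue
            lines.append(f"{indent}→ {callee}")
            walk(callee, depth + 1, path | {callee})

    walk(root, 0, {root})
    return lines


graph = {
    "checkout": ["validate_cart", "charge"],
    "charge": ["retry_charge"],
    "retry_charge": ["charge"],  # cycle back to charge
}
print("\n".join(trace(graph, "checkout")))
```

Tracking the current path (rather than a global visited set) means a function reached through two different routes is still shown under both, and only a genuine loop is flagged.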

Quick Start

# Install
pip install cartograph-code

# Scan any Python project (first time parses everything, then cached)
carto scan /path/to/your/project

# Everything below uses the cache — instant, no path needed
carto entries                          # list all entry points
carto entries --type api_route         # filter by type
carto search "checkout"                # find functions by name
carto trace "CheckoutService.create"   # trace call tree
carto trace "send_webhook" --depth 5   # control depth
carto callers "UserService.create"     # who calls this?
carto summary                          # stats overview

# Pipe to any LLM — no API keys needed
carto context | claude "what does this codebase do"
carto context "deploy" | claude "explain the deploy flow"
carto context "checkout" | gh copilot explain

# Or use built-in LLM (needs API key or local Ollama)
export CARTOGRAPH_LLM_PROVIDER=ollama
carto explain                          # explain whole codebase
carto explain "checkout"               # explain specific flow
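
The same pipe can be driven from a Python script. The sketch below assumes only the `carto context [scope]` invocation shown above; the helper names `context_command` and `pipe_context` are hypothetical, and the consumer is whatever CLI you already pipe to.

```python
import subprocess
from typing import List, Optional


def context_command(scope: Optional[str] = None) -> List[str]:
    """Build the `carto context` invocation, optionally scoped to a flow."""
    cmd = ["carto", "context"]
    if scope:
        cmd.append(scope)
    return cmd


def pipe_context(consumer: List[str], scope: Optional[str] = None) -> str:
    """Run `carto context [scope]` and feed its stdout to `consumer`'s stdin."""
    context = subprocess.run(
        context_command(scope), capture_output=True, text=True, check=True
    ).stdout
    result = subprocess.run(
        consumer, input=context, capture_output=True, text=True, check=True
    )
    return result.stdout


# Usage, equivalent to: carto context "deploy" | claude "explain the deploy flow"
#   print(pipe_context(["claude", "explain the deploy flow"], scope="deploy"))
```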

Web Viewer

Launch an interactive browser-based DAG explorer for any Python project:

# Launch the web viewer
carto serve /path/to/your/project --port 3333

# Open in browser: http://127.0.0.1:3333

What happens:

Parses all .py files in the target project (takes 1-3s for ~3000 functions)
Builds a global call graph with cross-file resolution
Starts a local web server
Open the browser — click entry points in the sidebar to render interactive DAGs

Features:

Three-panel layout: sidebar (entry points) | DAG canvas (D3 + dagre) | detail panel (on click)
Color-coded nodes: blue = API routes, purple = Celery tasks, amber = signal handlers
Dashed edges = async dispatch, thick edges = cross-file calls
Click node for details (file, line, decorators, callers/callees)
Double-click node to re-root the graph at that function
[+] button on leaf nodes to expand deeper
Depth slider (1-8) to control how deep the trace goes
Search across all functions with / keyboard shortcut
Zoom and pan with mouse/trackpad

# Examples
carto serve ./polar/server --port 3333         # 600 entry points
carto serve ./sentry/src --port 3333           # 788 entry points
carto serve ./your-project --port 4000 --host 0.0.0.0

Pipe to Any LLM

Cartograph outputs structured context that any LLM can consume. No API keys, no provider lock-in — pipe to whatever you already use.

# Codebase-level: "what is this project?"
carto context | claude "what does this codebase do"

# Scoped: "explain this specific flow"
carto context "deploy" | claude "explain the deploy flow step by step"

# Works with any LLM CLI
carto context "checkout" | gh copilot explain
carto context | llm "summarize the architecture"

Token reduction: Prefect's raw codebase is ~9M tokens. carto context outputs ~8K tokens, roughly a 1,000x reduction, while preserving every entry point, domain grouping, top callers, and package structure. The LLM gets the map, not the territory.
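
For reference, the ratio follows directly from the two approximate figures quoted above. Both inputs are rounded, so the result should be read as an order of magnitude rather than a measurement.

```python
raw_tokens = 9_000_000   # approximate size of Prefect's raw codebase
context_tokens = 8_000   # approximate size of `carto context` output

reduction = raw_tokens / context_tokens
print(f"{reduction:,.0f}x")  # 1,125x
```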

Built-in LLM support (optional — for carto explain without piping):

Provider	Env Vars	Default Model
Claude	`ANTHROPIC_API_KEY`	claude-sonnet-4-20250514
OpenAI	`OPENAI_API_KEY`	gpt-4o-mini
Ollama	`OLLAMA_HOST` (optional)	llama3.2

Architecture

CARTOGRAPH is built as six decoupled layers:

┌────────────────┐
│  User Interface │  CLI (today) → VS Code Extension → Web UI
└───────┬────────┘
        ▼
┌────────────────┐
│   Core API      │  init / trace / summary / query / diff
└───────┬────────┘
        ▼
┌────────┬──────────┬──────────┐
│ Parse  │  Graph   │   LLM    │
│ Layer  │  Layer   │  Layer   │
└────────┴──────────┴──────────┘

Layer	Job	Scales by
Parse	Extract functions, calls, decorators from source	Add language parsers as plugins (Python today, Java/Go/JS next)
Graph	Build call graph, resolve cross-file calls, construct flow DAGs	Universal — language-agnostic
LLM	Narrate flows from graph + source code	Pluggable providers — Claude, OpenAI, Ollama (or none)
Cache	Incremental re-analysis on file changes	File-hash based invalidation
Render	DAG → visual output	CLI / VS Code / Web / Mermaid / JSON
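
File-hash based invalidation, as listed for the Cache layer, could look roughly like the sketch below. The choice of SHA-256, the cache shape (relative path to digest), and the function names are all assumptions for illustration; the source does not specify how Cartograph stores its cache.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of a file's bytes, used as its cache key."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def changed_files(root: Path, cached: dict) -> list:
    """Return .py files under `root` whose digest differs from `cached`.

    cached maps a relative path string to the digest stored on the last scan.
    """
    stale = []
    for path in sorted(root.rglob("*.py")):
        key = path.relative_to(root).as_posix()
        if cached.get(key) != file_digest(path):
            stale.append(key)
    return stale


# Demo: scan, cache the digests, edit one file, rescan.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.py").write_text("x = 1\n")
    (root / "b.py").write_text("y = 2\n")
    cache = {p.name: file_digest(p) for p in root.glob("*.py")}

    unchanged = changed_files(root, cache)  # nothing to re-parse
    (root / "b.py").write_text("y = 3\n")
    stale = changed_files(root, cache)      # only b.py needs re-parsing

print(unchanged, stale)  # [] ['b.py']
```

Hashing content rather than comparing modification times keeps the result stable across checkouts and file copies, at the cost of reading every file once per scan.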

Key design decision: The graph layer never changes when you add a new language or framework. Language parsers and framework detectors are plugins. Adding Java means writing languages/java/adapter.py + languages/java/frameworks/spring_boot.py — the graph engine, serializer, and CLI stay untouched.
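
One way to picture this separation is a small plugin interface that the graph layer depends on without knowing any language. Everything named here (`CallEdge`, `LanguageAdapter`, `GraphEngine`, `ToyAdapter`, the `.toy` format) is hypothetical and exists only to illustrate the design; it is not Cartograph's API.

```python
import os
from dataclasses import dataclass
from typing import Iterable, Protocol


@dataclass(frozen=True)
class CallEdge:
    caller: str
    callee: str


class LanguageAdapter(Protocol):
    """Hypothetical plugin interface: turn source text into call edges."""
    extensions: tuple

    def extract_calls(self, module: str, source: str) -> Iterable[CallEdge]: ...


class GraphEngine:
    """Language-agnostic: it only ever sees CallEdge objects."""

    def __init__(self):
        self.adapters = {}
        self.edges = set()

    def register(self, adapter: LanguageAdapter) -> None:
        for ext in adapter.extensions:
            self.adapters[ext] = adapter

    def add_source(self, filename: str, source: str) -> None:
        module, ext = os.path.splitext(filename)
        self.edges.update(self.adapters[ext].extract_calls(module, source))


class ToyAdapter:
    """Stand-in parser: each line 'a -> b' means a calls b."""
    extensions = (".toy",)

    def extract_calls(self, module, source):
        for line in source.splitlines():
            caller, sep, callee = line.partition("->")
            if sep:
                yield CallEdge(f"{module}.{caller.strip()}", callee.strip())


engine = GraphEngine()
engine.register(ToyAdapter())
engine.add_source("billing.toy", "checkout -> charge\ncharge -> notify")
print(sorted((e.caller, e.callee) for e in engine.edges))
```

Adding a second language in this sketch means registering another adapter; `GraphEngine` itself does not change.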

Full HLD: docs/hld.md | Parser HLD: docs/parser-hld.md

Currently Supported

Feature	Status
Engine
Cross-file call resolution via import analysis	✅
Type-inferred method resolution (`x = Foo(); x.bar()`)	✅
Factory classmethod resolution (`x = Foo.create(); x.bar()`)	✅
Parameter type resolution (`def f(x: Foo): x.bar()`)	✅
Return type resolution (`x = get_foo(); x.bar()` where `get_foo() -> Foo`)	✅
`self.method()` resolution with MRO/inheritance walking	✅
Topology-based entry point discovery (framework-agnostic)	✅
Conditional branch detection	✅
Cycle detection	✅
Language: Python
Python AST parsing	✅
Django Ninja route detection	✅
Celery task + async dispatch (`.delay()`, `chain`, `chord`, `group`)	✅
Django signal handler detection	✅
Django ORM operation annotation	✅
FastAPI route detection (`@app.get`, `@router.post`, `@app.websocket`)	✅
Flask route detection (`@app.route`, `@bp.get`, `@app.errorhandler`)	✅
Interfaces
Interactive web viewer (`cartograph serve`)	✅
CLI with Rich tree output	✅
JSON export	✅
122 unit tests + integration tests	✅
LLM Narration
`cartograph explain` — AI-powered flow narration	✅
Claude, OpenAI, Ollama provider support	✅
Web viewer `/api/narrate/{qname}` endpoint	✅
Planned
Diff mode ("what flows changed in this PR?")	📋 Phase 3
Tree-sitter migration (multi-language foundation)	📋 Phase 4
Java + Spring Boot	📋 Phase 4
Go + goroutine boundary detection	📋 Phase 4
TypeScript + Express/Nest	📋 Phase 4
VS Code extension	📋 Phase 4
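
To make the "type-inferred method resolution" row concrete, here is a minimal sketch using Python's standard `ast` module. It handles only the `x = Foo(); x.bar()` case from the table and uses a single flat variable table per module, which is a deliberate simplification; `resolve_method_calls` is a hypothetical name, and Cartograph's own resolver is presumably scope-aware.

```python
import ast

SOURCE = """
class Foo:
    def bar(self):
        pass

def main():
    x = Foo()
    x.bar()
"""


def resolve_method_calls(source):
    """Resolve `var.method()` to `Class.method` when `var = Class()` was seen."""
    tree = ast.parse(source)
    classes = {n.name for n in ast.walk(tree) if isinstance(n, ast.ClassDef)}

    # Pass 1: record `x = Foo()` as x -> Foo for classes defined in this module.
    var_types = {}
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Assign)
            and isinstance(node.value, ast.Call)
            and isinstance(node.value.func, ast.Name)
            and node.value.func.id in classes
        ):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    var_types[target.id] = node.value.func.id

    # Pass 2: rewrite `x.bar()` to `Foo.bar` when x has a known type.
    resolved = []
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and isinstance(node.func.value, ast.Name)
            and node.func.value.id in var_types
        ):
            resolved.append(f"{var_types[node.func.value.id]}.{node.func.attr}")
    return resolved


print(resolve_method_calls(SOURCE))  # ['Foo.bar']
```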

Tested Against

Project	Framework	Modules	Functions	Entry Points	Resolved Edges
Sentry	Django + Celery (custom)	4,415	30,926	788	37,659
Polar	FastAPI	914	6,350	600	6,327
Prefect	FastAPI + custom	690	6,280	396	2,821
Dagster	Custom framework	790	11,533	255	6,919
paperless-ngx	Django + Celery	135	1,559	26	1,099

Sentry and Dagster use entirely custom decorator patterns — no Cartograph-specific detectors exist for them. Entry points discovered via graph topology.

122 unit tests passing.

Roadmap

Phase 1 (complete): Python parser + call graph + type inference + CLI + web viewer
Phase 2 (complete): LLM narration + FastAPI/Flask detectors + principled type resolution + topology-based entry point discovery
Phase 3: Diff mode ("what flows changed in this PR?") + blast radius analysis
Phase 4: Multi-language via Tree-sitter (Java/Spring Boot, Go, TypeScript) + VS Code extension
Phase 5: CI integration, multi-repo linking, team features

Why Not Just Ask an LLM?

You can ask Claude Code or Cursor to "trace the equipment pipeline." It will read some files, pattern-match, and give you a plausible answer. But:

LLMs sample. Cartograph enumerates. An LLM reads files until it runs out of context. Cartograph parses every file, resolves every import, builds the complete graph. 30K functions in 3 seconds — exhaustive, not best-effort.
LLMs hallucinate edges. Cartograph proves them. An LLM might say A calls B when it actually calls C. Cartograph resolves calls through the import chain — if it says A→B, that edge exists in the source.
LLMs need context. Cartograph provides it. Instead of feeding 9M tokens of raw code to an LLM, pipe 8K tokens of structured context: carto context | claude. The LLM gets the map — every entry point, every domain, every flow — in one page.
LLMs forget. Cartograph is deterministic. Same codebase, same graph, every time.

Cartograph builds the map. The LLM narrates it. They're complementary — carto context | claude gives your LLM grounded, exhaustive structural knowledge instead of best-effort file sampling.
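
The idea of resolving an edge through the import chain can be illustrated with the standard `ast` module. This is a sketch under stated assumptions: `import_table` and `resolve_call` are hypothetical helpers, the module names in the sample are made up, and relative imports are skipped for brevity.

```python
import ast


def import_table(source):
    """Map each locally bound name to the qualified name it was imported as."""
    table = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # `import a.b` binds `a`; `import a.b as c` binds `c` to a.b
                local = alias.asname or alias.name.split(".")[0]
                table[local] = alias.name if alias.asname else local
        elif isinstance(node, ast.ImportFrom) and node.level == 0:
            for alias in node.names:
                table[alias.asname or alias.name] = f"{node.module}.{alias.name}"
    return table


def resolve_call(name, table):
    """Resolve a dotted call like `svc.create` against the import table."""
    head, _, rest = name.partition(".")
    if head not in table:
        return None  # not backed by an import, so no edge is emitted
    return table[head] + ("." + rest if rest else "")


SOURCE = """
from shop.billing import charge
import json as j
"""
table = import_table(SOURCE)
print(resolve_call("charge", table))        # shop.billing.charge
print(resolve_call("j.dumps", table))       # json.dumps
print(resolve_call("mystery.call", table))  # None
```

Returning `None` instead of guessing is the point of the sketch: an edge appears only when the import chain supports it.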

Comparison

Tool	What it does	Where Cartograph differs
VS Code Call Hierarchy	Shows callers/callees of one function	No cross-file DAG, no async detection, no branches
Sourcegraph	Code search and navigation	Finds code, doesn't map flows
Claude Code / Cursor	LLM reads files and explains	Probabilistic, partial. Cartograph is exhaustive and deterministic — then feeds the LLM
GraphRAG / text-to-graph	Compresses text into graph for LLM context	Cartograph parses actual code structure, not text. Edges are proven, not inferred

License

MIT

LLMs guess. Cartograph proves.
