Local-first, MCP-native code-intelligence: dead code, holes, orientation, impact — ranked by what's live, every answer carrying confidence.
stitchgraph
Local-first code intelligence for humans and LLM agents. Point stitchgraph at a codebase and ask it plain questions — what's dead? what breaks if I change this? how does a request flow end to end? which tests should I run? It indexes 12 languages into a single SQLite graph on your machine, answers through three identical surfaces (Python library, CLI, MCP server), and attaches a confidence, a provenance, and a reason to double-check to every answer, so you always know how much to trust it.
Two design commitments make it different:
- It never guesses confidently. Every result rides a universal envelope (`confidence` / `provenance` / `needs_review` / `urgency`), and the cardinal rule (live code is never confidently flagged dead) biases every liveness decision toward precision. When stitchgraph isn't sure, it says so and tells you why.
- It measures what code does, not just what it says. Beyond the static graph, the behavioural toolkit decomposes a per-test coverage matrix (POD/SVD) into your suite's runtime behavioural modes: how many independent behaviours you actually test, which 6% of tests cover everything, which functions co-run with no static link. These are answers no amount of reading source can produce.
Everything runs offline against a plain SQLite file. stitchgraph is read-only on your code — it writes only to its own index, never executes your project, and every finding is advisory: ranked options for a human or agent to act on.
Contents
- Install
- Five-minute quickstart
- The operations
- The behavioural toolkit (runtime analysis)
- For LLM agents (MCP)
- Trust model
- Languages
- Scale
- Develop
Install
```bash
pip install stitchgraph            # library only (stdlib core, Python analysis)
pip install 'stitchgraph[cli]'     # + the `stitchgraph` command
pip install 'stitchgraph[mcp]'     # + the `stitchgraph-mcp` server for agents
pip install 'stitchgraph[all]'     # everything below
```
| Extra | Unlocks |
|---|---|
| `cli` | The `stitchgraph` command (Typer) |
| `mcp` | The `stitchgraph-mcp` server for LLM agents (FastMCP) |
| `treesitter` | The other 11 languages, with bundled offline grammars (CI/air-gap safe) |
| `treesitter-download` | Same, but fetches the newest grammars on first use |
| `precise` | jedi type-grade Python resolution (`reindex --precise`) |
| `resolve` | SQL statement resolution (sqlglot); powers full-stack traces into tables |
| `algebra` | GraphBLAS-accelerated whole-graph sweeps (pure-Python fallback built in) |
| `spectral` | scipy sparse solvers; uncaps `find_subsystems` / `find_modes` on large repos |
Run stitchgraph doctor (add --strict in CI) to check which grammars load.
Five-minute quickstart
Index once, then ask questions. The index is a single SQLite file; re-run reindex
after large changes (or leave stitchgraph watch . running).
```bash
cd your-project
stitchgraph reindex . --db stitchgraph.db                  # build the graph (12 languages, one pass)
stitchgraph orient --db stitchgraph.db                     # new here? counts, entry points, top hubs
stitchgraph find-stale --db stitchgraph.db                 # likely-dead code, precision-biased
stitchgraph scan --db stitchgraph.db                       # ranked issues: stubs, holes, cycles, god objects
stitchgraph impact-of UserService --db stitchgraph.db      # blast radius + which tests to run
stitchgraph trace-path loadUsers users --db stitchgraph.db # full-stack: JS → route → SQL table
stitchgraph report --db stitchgraph.db                     # one Markdown report of all of the above
```
Every command takes --json for the raw envelope (machine-readable, full payload —
text output truncates long lists). Exit codes: 0 clean, 1 RED findings exist,
2 operational failure (missing/unopenable --db) — safe to gate CI on.
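As a rough illustration of gating CI on those exit codes, the sketch below maps a return code to a decision label. Only the three code meanings come from the text above; the label strings and the helper names `classify_exit` and `run_gate` are hypothetical.

```python
import subprocess
import sys

# Exit-code meanings as documented above; the label strings are our own.
EXIT_MEANING = {0: "clean", 1: "red-findings", 2: "operational-failure"}


def classify_exit(code: int) -> str:
    """Map a stitchgraph exit code to a CI decision label."""
    return EXIT_MEANING.get(code, "unknown")


def run_gate(argv: list[str]) -> str:
    """Run a command and classify its exit code (output is not parsed)."""
    completed = subprocess.run(argv, capture_output=True, text=True)
    return classify_exit(completed.returncode)


# Stand-in command so the sketch runs without stitchgraph installed.
# In CI the argv would be something like:
#   ["stitchgraph", "scan", "--db", "stitchgraph.db"]
demo_label = run_gate([sys.executable, "-c", "raise SystemExit(1)"])
```

A pipeline would typically fail the build on `red-findings` and treat `operational-failure` as an infrastructure error rather than a code finding.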
The same operations, as a library:
```python
import stitchgraph as sg

with sg.Store("stitchgraph.db") as store:
    sg.reindex(store, ".")
    print(sg.find_stale(store))                # every result is a Result envelope:
    print(sg.impact_of(store, "UserService"))  # .ok .result .confidence .needs_review
```
The operations
Thirty-one operations, one question each. All advisory, all read-only, all carrying the envelope.
| Ask | Operation(s) |
|---|---|
| Where is X, who calls it, what does it call? | find_symbol, get_callers, get_callees |
| I'm new here — orient me | orient, summarize_subsystem, find_subsystems |
| What's dead? What's referenced but missing? | find_stale, find_holes |
| Sweep the repo for issues, ranked | scan |
| What breaks if I change this? | impact_of |
| How does a request flow end to end? | trace_path (HTML form → route → handler → ORM → SQL table) |
| What's dangerous to touch? | risk (git churn × centrality), find_chokepoints (cut vertices × blast radius) |
| Where's the code that does X / clones of this? | find_similar — by tokens, or mode="structure" for body-shape clone detection (renamed/reordered clones a text diff misses) |
| How do two builds differ? | graph_diff — call-level deltas plus body-shape changes (catches a data-flow bug that leaves the call graph identical) |
| Drill into one function | get_matrix(layer="call" \| "statement" \| "expression"): call graph → program-dependence graph → value-flow graph |
| Ground liveness in reality | ingest_trace (coverage.py JSON / LCOV / Go coverprofile) |
| Rebuild the index | reindex (admin; --precise adds jedi) |
…plus the eleven behavioural operations below.
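To make "blast radius" concrete, here is a minimal sketch of the general idea behind an impact query: walk the call graph backwards from the changed symbol and collect everything that can reach it. It is not stitchgraph's implementation; the toy graph, the `test_` naming convention, and the function `blast_radius` are all hypothetical.

```python
from collections import defaultdict, deque


def blast_radius(calls: dict[str, list[str]], changed: str) -> set[str]:
    """Return every function that can transitively reach `changed`."""
    # Invert caller -> callees into callee -> callers.
    callers: dict[str, set[str]] = defaultdict(set)
    for caller, callees in calls.items():
        for callee in callees:
            callers[callee].add(caller)

    seen: set[str] = set()
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for caller in callers[node]:
            if caller not in seen and caller != changed:
                seen.add(caller)
                queue.append(caller)
    return seen


# Hypothetical toy call graph: caller -> callees.
CALLS = {
    "test_signup": ["signup_handler"],
    "signup_handler": ["UserService.create"],
    "UserService.create": ["db_insert"],
    "report_job": ["db_insert"],
}

affected = blast_radius(CALLS, "UserService.create")
tests_to_run = sorted(name for name in affected if name.startswith("test_"))
```

Here `report_job` is outside the radius because it never reaches `UserService.create`, which is why only `test_signup` is selected.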
The behavioural toolkit (runtime analysis)
The static graph describes structure. The behavioural toolkit measures what your test suite actually executes, and answers questions that cannot be answered by reading code — this is the part of stitchgraph that tells you things you don't already know.
It consumes one inert artifact: a per-test coverage matrix (which test executed which function). stitchgraph never runs your code — it generates a sandboxed capture kit and you run it in your own jail:
```bash
stitchgraph scaffold-coverage --db stitchgraph.db   # writes Docker/shell/CI recipes
# run the generated kit (it runs YOUR tests in YOUR sandbox) → coverage_modes.json
stitchgraph find-modes --coverage coverage_modes.json --db stitchgraph.db
```
| Ask | Operation |
|---|---|
| How many independent behaviours does my suite exercise? What are they? | find_modes — POD/SVD of the coverage matrix: behavioural modes, intrinsic dimensionality, a minimal covering test set |
| Which tests should CI run for this change / this PR? | select_tests (runtime evidence fused with the static blast radius; accepts comma-separated changesets) |
| What code moves together with X? | co_change |
| What co-runs but has no static link? (hidden coupling) | find_coupling |
| Which live functions does no test execute? | find_gaps (fuses coverage with reachability: live-untested vs dead) |
| What order surfaces failures fastest? | test_order (greedy new-coverage-first; the prefix is a minimal cover) |
| Which tests are coverage-identical? | redundant_tests (review aid — parametrized tests share profiles legitimately; never auto-delete) |
| What's the always-on core? | find_core |
| Which tests do something nothing else does? | find_outlier_tests |
| Which files change often AND carry many behaviours? | runtime_risk (churn × behavioural centrality) |
| What gained/lost test exposure between two snapshots? | coverage_drift |
Dogfood example (this repo, research/14): 2,349 tests turn out to exercise 27
independent behaviours; 64 tests cover every executed function; the one
untested-dead function find_gaps reports is exactly the one find_stale flags
statically — and find_coupling located a real config↔envelope side-channel blind.
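Two of the questions above (`test_order` and `redundant_tests`) can be illustrated on a toy per-test coverage matrix. The sketch below is an assumption-laden stand-in, not stitchgraph's code: the coverage data is invented, the helper names are hypothetical, and a greedy cover is a standard approximation that is not guaranteed to be the smallest possible cover.

```python
from collections import defaultdict


def greedy_order(coverage: dict[str, set[str]]) -> tuple[list[str], int]:
    """Order tests by most new coverage first.

    Returns (order, cover_len): the first `cover_len` tests of `order`
    already execute every function that any test executes.
    """
    remaining = dict(coverage)
    covered: set[str] = set()
    order: list[str] = []
    cover_len = 0
    while remaining:
        # Iterating in sorted order makes ties break on test name.
        best = max(sorted(remaining), key=lambda t: len(remaining[t] - covered))
        if remaining[best] - covered:
            cover_len = len(order) + 1
        covered |= remaining.pop(best)
        order.append(best)
    return order, cover_len


def identical_groups(coverage: dict[str, set[str]]) -> list[list[str]]:
    """Group tests whose coverage profiles are exactly equal."""
    groups: dict[frozenset[str], list[str]] = defaultdict(list)
    for test, funcs in coverage.items():
        groups[frozenset(funcs)].append(test)
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


# Hypothetical coverage matrix: test -> functions it executed.
COVERAGE = {
    "test_login": {"auth", "session"},
    "test_login_again": {"auth", "session"},
    "test_checkout": {"cart", "pay", "session"},
    "test_cart": {"cart"},
}

order, cover_len = greedy_order(COVERAGE)
duplicates = identical_groups(COVERAGE)
```

As the table notes for `redundant_tests`, identical profiles are a review aid only: parametrized tests can share a profile legitimately.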
For LLM agents (MCP)
stitchgraph is MCP-native: every operation above is an MCP tool with the same name
and the same JSON envelope. Launch the server pointed at a built index (build it
first with reindex — the server refuses to answer from a missing or never-indexed
database rather than confidently reporting an empty codebase):
```bash
pip install 'stitchgraph[mcp,treesitter]'
stitchgraph reindex /path/to/project --db /path/to/stitchgraph.db
stitchgraph-mcp --db /path/to/stitchgraph.db   # or env STITCHGRAPH_DB
```
Claude Desktop / Claude Code configuration:
```json
{
  "mcpServers": {
    "stitchgraph": {
      "command": "stitchgraph-mcp",
      "args": ["--db", "/absolute/path/to/stitchgraph.db"]
    }
  }
}
```
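If you would rather generate that configuration than hand-edit it, a small sketch along these lines produces the same JSON shape with the database path resolved to an absolute path. The helper name `mcp_config` is hypothetical, and where the client expects the file to live is not covered here.

```python
import json
from pathlib import Path


def mcp_config(db_path: str) -> str:
    """Render the MCP server entry shown above for a given index file."""
    db = Path(db_path).resolve()  # the example above uses an absolute path
    config = {
        "mcpServers": {
            "stitchgraph": {
                "command": "stitchgraph-mcp",
                "args": ["--db", str(db)],
            }
        }
    }
    return json.dumps(config, indent=2)


rendered = mcp_config("stitchgraph.db")
```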
Rules of engagement for agents
The full rule file — written to be dropped into an agent's context — is
AGENTS.md. The essentials:
- Query the graph before grepping. `orient` first on unfamiliar code; `impact_of <name>` before editing anything; `get_callers` / `get_callees` instead of text search; `trace_path` for end-to-end flows.
- Respect the envelope. `needs_review: true` means "unreached by my analysis", not "proven dead": verify dynamic dispatch, plugins, and framework callbacks before acting. `confidence` and `provenance` (`extracted` > `inferred` > `ambiguous`) tell you whether a result is a fact or a ranked guess.
- Never delete on `find_stale` alone. It is precision-biased and advisory by design; treat results as candidates to verify.
- Use `scan` for triage, ordered by `urgency` (🔴 fix now / 🟠 look closer / 🟢 cleanup); a finding capped 🟢 with `needs_review` rests on name-ambiguous edges and is likely a resolution artifact.
- Prefer `select_tests` over "run everything" when a coverage artifact exists: it returns the tests that actually executed the changed symbols.
- Refusals are honest: a bare-name collision, a too-broad `get_matrix` scope, or a missing index returns an explanation and a suggested next call, not a guess.
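One way an agent might apply these rules to a single envelope is sketched below. The field names (`ok`, `needs_review`, `provenance`, `confidence`) come from the envelope described in this document; the plain-dict representation, the `triage` function, the label strings, and the 0.8 threshold are assumptions made for illustration.

```python
def triage(envelope: dict, min_confidence: float = 0.8) -> str:
    """Decide how an agent should treat one result envelope.

    The strongest outcome is "candidate": results are advisory, so even a
    clean envelope is a candidate for action, never an instruction.
    """
    if not envelope.get("ok", False):
        return "refused"       # read the explanation and suggested next call
    if envelope.get("needs_review", True):
        return "verify-first"  # unreached by analysis is not proven dead
    if envelope.get("provenance") != "extracted":
        return "verify-first"  # inferred or ambiguous: a ranked guess
    if envelope.get("confidence", 0.0) < min_confidence:
        return "verify-first"
    return "candidate"


# Hypothetical envelopes, shaped like the fields described above.
solid = {"ok": True, "needs_review": False, "provenance": "extracted", "confidence": 0.95}
shaky = {"ok": True, "needs_review": True, "provenance": "ambiguous", "confidence": 0.4}
refusal = {"ok": False}
labels = [triage(solid), triage(shaky), triage(refusal)]
```

Missing fields default to the cautious branch, so a malformed envelope is never treated as a confident answer.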
Trust model
- The envelope. Every answer carries `ok`, `result`, `confidence` (0–1), `provenance` (`extracted` = read from syntax; `inferred` = heuristic; `ambiguous` = several candidates), `needs_review` plus human-readable reasons, and for findings an `urgency`. Provenance caps urgency: a heuristic link can never shout RED.
- The cardinal rule. Live code is never confidently flagged dead. Dozens of per-language liveness signals (exports, framework callbacks, dunders/magic methods, FFI/linker attributes, test conventions…) root the graph; ambiguity widens edges rather than dropping them. The deliberate trade-offs are documented, decision by decision, in LIMITATIONS.md.
- Read-only, local, private. No code leaves your machine; nothing executes; the only file written is the index (plus explicitly requested reports/kits).
- Verified. ~2,300 tests including differential oracles (streaming index == in-memory, incremental == full reindex, GraphBLAS == pure Python), per-language completeness batteries, and ground-truthing against ~47 real projects (Linux kernel core, WordPress, Magento, NestJS…) with zero crashes. Hostile inputs degrade to a smaller index, never a wrong confident answer.
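The phrase "ambiguity widens edges rather than dropping them" can be illustrated with a toy reachability pass: when a call by bare name could resolve to several definitions, all of them are treated as reached. This is a sketch of the principle only, with invented names and a hypothetical `reachable` helper; stitchgraph's actual liveness signals are far richer.

```python
from collections import defaultdict


def reachable(definitions: set[str], calls: dict[str, list[str]], roots: set[str]) -> set[str]:
    """Return definitions reachable from `roots`.

    `calls` maps a qualified caller to the bare names it calls. A bare name
    that matches several definitions marks every one of them as live.
    """
    by_bare: dict[str, set[str]] = defaultdict(set)
    for qualified in definitions:
        by_bare[qualified.rsplit(".", 1)[-1]].add(qualified)

    live: set[str] = set()
    stack = [root for root in roots if root in definitions]
    while stack:
        node = stack.pop()
        if node in live:
            continue
        live.add(node)
        for bare in calls.get(node, []):
            stack.extend(by_bare.get(bare, ()))  # widen: follow every candidate
    return live


# Hypothetical project: two `save` functions, one ambiguous call site.
DEFINITIONS = {"app.main", "a.save", "b.save", "old.cleanup"}
CALLS = {"app.main": ["save"]}

live = reachable(DEFINITIONS, CALLS, roots={"app.main"})
stale_candidates = sorted(DEFINITIONS - live)
```

Both `save` definitions stay live even though only one is really called, which costs some recall on dead code but avoids confidently flagging live code as dead.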
Languages
| Depth | Languages |
|---|---|
| Deep (stdlib `ast`; optional jedi `--precise`) | Python 3.11+ |
| Full graph via tree-sitter (definitions, calls, imports/inheritance, tests, body matrix) | JavaScript, TypeScript/TSX, Go, Rust, C, C++, C#, Java, Ruby, PHP, Bash |
| Cross-language seams | Flask/FastAPI/Django/Express/Spring routes, HTML forms, JS fetch, events, SQL (sqlglot), SQLAlchemy/Django ORM — all converging in one graph, so trace_path crosses language boundaries |
Per-language support matrix: docs/LANGUAGES.md.
Scale
reindex streams the graph to SQLite in constant memory (auto-enabled for large
on-disk trees): a 4,300-file Magento module indexes in 269 MB peak instead of
3.2 GB, byte-identical output, pinned by a differential oracle. Query sweeps stream
their adjacency too — a 16M-edge graph (Home Assistant scale) is queried in ~2 GB.
Details: docs/V2_STREAMING_DESIGN.md.
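The general technique behind constant-memory indexing is to consume edges from a generator and flush them to SQLite in fixed-size batches, so Python never holds the whole graph. The sketch below shows that pattern with the standard `sqlite3` module; the `edge` table, its columns, and the helper names are hypothetical and do not reflect stitchgraph's real schema.

```python
import sqlite3
from typing import Iterable, Iterator


def edges_from_source(n: int) -> Iterator[tuple[str, str]]:
    """Stand-in for a parser that yields call edges one at a time."""
    for i in range(n):
        yield (f"f{i}", f"f{i + 1}")


def stream_edges(conn: sqlite3.Connection, edges: Iterable[tuple[str, str]],
                 batch_size: int = 1000) -> int:
    """Insert edges in batches; at most `batch_size` edges are buffered."""
    conn.execute("CREATE TABLE IF NOT EXISTS edge (src TEXT, dst TEXT)")
    total = 0
    batch: list[tuple[str, str]] = []
    for edge in edges:
        batch.append(edge)
        if len(batch) >= batch_size:
            conn.executemany("INSERT INTO edge VALUES (?, ?)", batch)
            total += len(batch)
            batch.clear()
    if batch:  # flush the final partial batch
        conn.executemany("INSERT INTO edge VALUES (?, ?)", batch)
        total += len(batch)
    conn.commit()
    return total


conn = sqlite3.connect(":memory:")
written = stream_edges(conn, edges_from_source(2500), batch_size=1000)
row_count = conn.execute("SELECT COUNT(*) FROM edge").fetchone()[0]
```

An in-memory database is used only so the sketch is self-contained; the same code works against a file path.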
Develop
```bash
pip install -e '.[all,dev]'
PYTHONPATH=src python -m pytest -q
```
CI runs the suite on Python 3.11/3.12 plus a no-extras job that guards the
stdlib-only core. Design: docs/design.md · capability map:
docs/OVERVIEW.md · status/roadmap: docs/STATUS.md
· release history: CHANGELOG.md and docs/RELEASE_NOTES_v*.md ·
review process: CONTRIBUTING.md, REVIEW_HISTORY.md.
MIT licensed.