Local-first, MCP-native code-intelligence: dead code, holes, orientation, impact — ranked by what's live, every answer carrying confidence.

stitchgraph

Local-first code intelligence for humans and LLM agents. Point stitchgraph at a codebase and ask it plain questions — what's dead? what breaks if I change this? how does a request flow end to end? which tests should I run? It indexes 12 languages into a single SQLite graph on your machine, answers through three identical surfaces (Python library, CLI, MCP server), and attaches a confidence, a provenance, and a reason to double-check to every answer, so you always know how much to trust it.

Two design commitments make it different:

It never guesses confidently. Every result rides a universal envelope (confidence / provenance / needs_review / urgency), and the cardinal rule — live code is never confidently flagged dead — biases every liveness decision toward precision. When stitchgraph isn't sure, it says so and tells you why.
It measures what code does, not just what it says. Beyond the static graph, the behavioural toolkit decomposes a per-test coverage matrix (POD/SVD) into your suite's runtime behavioural modes — how many independent behaviours you actually test, which 6% of tests cover everything, which functions co-run with no static link. These are answers no amount of reading source can produce.

Everything runs offline against a plain SQLite file. stitchgraph is read-only on your code — it writes only to its own index, never executes your project, and every finding is advisory: ranked options for a human or agent to act on.

Install
Five-minute quickstart
The operations
The behavioural toolkit (runtime analysis)
For LLM agents (MCP)
Trust model
Languages
Scale
Develop

Install

pip install stitchgraph              # library only (stdlib core, Python analysis)
pip install 'stitchgraph[cli]'       # + the `stitchgraph` command
pip install 'stitchgraph[mcp]'       # + the `stitchgraph-mcp` server for agents
pip install 'stitchgraph[all]'       # everything below

Extra	Unlocks
`cli`	The `stitchgraph` command (Typer)
`mcp`	The `stitchgraph-mcp` server for LLM agents (FastMCP)
`treesitter`	The other 11 languages, with bundled offline grammars (CI/air-gap safe)
`treesitter-download`	Same, but fetches the newest grammars on first use
`precise`	jedi type-grade Python resolution (`reindex --precise`)
`resolve`	SQL statement resolution (sqlglot) — powers full-stack traces into tables
`algebra`	GraphBLAS-accelerated whole-graph sweeps (pure-Python fallback built in)
`spectral`	scipy sparse solvers — uncaps `find_subsystems` / `find_modes` on large repos

Run stitchgraph doctor (add --strict in CI) to check which grammars load.

Five-minute quickstart

Index once, then ask questions. The index is a single SQLite file; re-run reindex after large changes (or leave stitchgraph watch . running).

cd your-project
stitchgraph reindex . --db stitchgraph.db     # build the graph (12 languages, one pass)

stitchgraph orient --db stitchgraph.db        # new here? counts, entry points, top hubs
stitchgraph find-stale --db stitchgraph.db    # likely-dead code, precision-biased
stitchgraph scan --db stitchgraph.db          # ranked issues: stubs, holes, cycles, god objects
stitchgraph impact-of UserService --db stitchgraph.db   # blast radius + which tests to run
stitchgraph trace-path loadUsers users --db stitchgraph.db  # full-stack: JS → route → SQL table
stitchgraph report --db stitchgraph.db        # one Markdown report of all of the above

Every command takes --json for the raw envelope (machine-readable, full payload — text output truncates long lists). Exit codes: 0 clean, 1 RED findings exist, 2 operational failure (missing/unopenable --db) — safe to gate CI on.
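
The exit-code contract above can be wrapped for a CI script. This is a minimal sketch that assumes only what is stated above: the `scan` command, the `--db` and `--json` flags, and the three exit codes. The helper names `classify_exit` and `run_scan` are hypothetical and not part of stitchgraph.

```python
import subprocess

# Exit codes as documented above: 0 clean, 1 RED findings, 2 operational failure.
EXIT_MEANINGS = {0: "clean", 1: "red-findings", 2: "operational-failure"}


def classify_exit(code: int) -> str:
    """Map a stitchgraph exit code to a label; anything else is 'unknown'."""
    return EXIT_MEANINGS.get(code, "unknown")


def run_scan(db_path: str, runner=subprocess.run) -> tuple[str, str]:
    """Run `stitchgraph scan --json` and return (label, raw stdout).

    `runner` is injectable so the function can be exercised without the CLI.
    """
    proc = runner(
        ["stitchgraph", "scan", "--db", db_path, "--json"],
        capture_output=True,
        text=True,
    )
    return classify_exit(proc.returncode), proc.stdout
```

A CI job would typically fail the build on `red-findings` and treat `operational-failure` as a setup error rather than a code-quality signal.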

The same operations, as a library:

import stitchgraph as sg

with sg.Store("stitchgraph.db") as store:
    sg.reindex(store, ".")
    print(sg.find_stale(store))       # every result is a Result envelope:
    print(sg.impact_of(store, "UserService"))  # .ok .result .confidence .needs_review

The operations

Thirty-one operations, one question each. All advisory, all read-only, all carrying the envelope.

Ask	Operation(s)
Where is X, who calls it, what does it call?	`find_symbol`, `get_callers`, `get_callees`
I'm new here — orient me	`orient`, `summarize_subsystem`, `find_subsystems`
What's dead? What's referenced but missing?	`find_stale`, `find_holes`
Sweep the repo for issues, ranked	`scan`
What breaks if I change this?	`impact_of`
How does a request flow end to end?	`trace_path` (HTML form → route → handler → ORM → SQL table)
What's dangerous to touch?	`risk` (git churn × centrality), `find_chokepoints` (cut vertices × blast radius)
Where's the code that does X / clones of this?	`find_similar` — by tokens, or `mode="structure"` for body-shape clone detection (renamed/reordered clones a text diff misses)
How do two builds differ?	`graph_diff` — call-level deltas plus body-shape changes (catches a data-flow bug that leaves the call graph identical)
Drill into one function	`get_matrix(layer="call" | "statement" | "expression")` (call graph → program-dependence graph → value-flow graph)
Ground liveness in reality	`ingest_trace` (coverage.py JSON / LCOV / Go coverprofile)
Rebuild the index	`reindex` (admin; `--precise` adds jedi)

…plus the eleven behavioural operations below.
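
To make the question behind `impact_of` concrete: a blast radius over a call graph can be read as reverse reachability, that is, everything that directly or transitively calls the changed symbol. The sketch below runs on a toy edge list. The function name `blast_radius` and the graph are hypothetical, and this is not stitchgraph's implementation, which also ranks results and attaches the envelope.

```python
from collections import defaultdict, deque


def blast_radius(calls: list[tuple[str, str]], changed: str) -> set[str]:
    """Return every symbol that directly or transitively calls `changed`."""
    callers = defaultdict(set)  # callee -> set of direct callers
    for caller, callee in calls:
        callers[callee].add(caller)

    seen: set[str] = set()
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for caller in callers[node]:
            if caller not in seen and caller != changed:
                seen.add(caller)
                queue.append(caller)
    return seen


# Toy call graph as (caller, callee) pairs.
edges = [
    ("handler", "UserService"),
    ("route", "handler"),
    ("test_users", "route"),
    ("Mailer", "smtp_send"),
]
affected = blast_radius(edges, "UserService")
```

Here `affected` contains `handler`, `route`, and `test_users`; `Mailer` is untouched because nothing on its path reaches `UserService`.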

The behavioural toolkit (runtime analysis)

The static graph describes structure. The behavioural toolkit measures what your test suite actually executes, and answers questions that cannot be answered by reading code — this is the part of stitchgraph that tells you things you don't already know.

It consumes one inert artifact: a per-test coverage matrix (which test executed which function). stitchgraph never runs your code — it generates a sandboxed capture kit and you run it in your own jail:

stitchgraph scaffold-coverage --db stitchgraph.db     # writes Docker/shell/CI recipes
# run the generated kit (it runs YOUR tests in YOUR sandbox) → coverage_modes.json

stitchgraph find-modes --coverage coverage_modes.json --db stitchgraph.db

Ask	Operation
How many independent behaviours does my suite exercise? What are they?	`find_modes` — POD/SVD of the coverage matrix: behavioural modes, intrinsic dimensionality, a minimal covering test set
Which tests should CI run for this change / this PR?	`select_tests` (runtime evidence fused with the static blast radius; accepts comma-separated changesets)
What code moves together with X?	`co_change`
What co-runs but has no static link? (hidden coupling)	`find_coupling`
Which live functions does no test execute?	`find_gaps` (fuses coverage with reachability: live-untested vs dead)
What order surfaces failures fastest?	`test_order` (greedy new-coverage-first; the prefix is a minimal cover)
Which tests are coverage-identical?	`redundant_tests` (review aid — parametrized tests share profiles legitimately; never auto-delete)
What's the always-on core?	`find_core`
Which tests do something nothing else does?	`find_outlier_tests`
Which files change often AND carry many behaviours?	`runtime_risk` (churn × behavioural centrality)
What gained/lost test exposure between two snapshots?	`coverage_drift`
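
The "greedy new-coverage-first" idea behind `test_order` can be sketched in a few lines. This is an illustration under assumptions, not the library's code: the input shape (test name mapped to the set of functions it executed) and the name `greedy_test_order` are hypothetical. Note that a greedy pass yields a covering prefix, which is usually small but is not guaranteed to be the smallest possible cover.

```python
def greedy_test_order(coverage: dict[str, set[str]]) -> tuple[list[str], int]:
    """Order tests by how much not-yet-covered code each one adds.

    Returns (ordered test names, length of the prefix that already covers
    every executed function).
    """
    remaining = dict(coverage)
    covered: set[str] = set()
    order: list[str] = []
    prefix_len = 0
    while remaining:
        # Pick the test adding the most new functions; iterating names in
        # sorted order makes ties resolve deterministically.
        best = max(sorted(remaining), key=lambda t: len(remaining[t] - covered))
        gain = len(remaining[best] - covered)
        order.append(best)
        covered |= remaining.pop(best)
        if gain > 0:
            prefix_len = len(order)
    return order, prefix_len


coverage = {
    "test_a": {"f", "g"},
    "test_b": {"g"},
    "test_c": {"h"},
    "test_d": {"f", "g", "h"},
}
order, prefix_len = greedy_test_order(coverage)
```

In this toy matrix `test_d` alone covers everything, so it comes first and the covering prefix has length 1; the remaining tests add no new coverage and follow in name order.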

Dogfood example (this repo, research/14): 2,349 tests turn out to exercise 27 independent behaviours; 64 tests cover every executed function; the one untested-dead function that find_gaps reports is exactly the one find_stale flags statically; and find_coupling, run blind, located a real config↔envelope side channel.
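
The distinction `find_gaps` draws between live-untested and dead code can be illustrated by fusing two inputs: static reachability from liveness roots, and the set of functions that any test executed. The sketch below is a simplified stand-in, not stitchgraph's algorithm; the function name, the label strings, and the input shapes are all hypothetical.

```python
def classify_functions(
    functions: set[str],
    roots: set[str],
    calls: dict[str, set[str]],
    executed: set[str],
) -> dict[str, str]:
    """Label each function 'tested', 'live-untested', or 'dead-candidate'."""
    # Static reachability from the liveness roots (entry points, exports, ...).
    reachable = set(roots)
    stack = list(roots)
    while stack:
        node = stack.pop()
        for callee in calls.get(node, ()):
            if callee not in reachable:
                reachable.add(callee)
                stack.append(callee)

    labels = {}
    for fn in sorted(functions):
        if fn in executed:
            labels[fn] = "tested"          # runtime evidence wins
        elif fn in reachable:
            labels[fn] = "live-untested"   # reachable, but no test ran it
        else:
            labels[fn] = "dead-candidate"  # advisory only: verify before acting
    return labels


labels = classify_functions(
    functions={"main", "load", "save", "legacy"},
    roots={"main"},
    calls={"main": {"load", "save"}},
    executed={"main", "load"},
)
```

In the toy input, `save` is reachable but never executed, so it is a test gap, while `legacy` is neither reachable nor executed and is only a candidate for review.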

For LLM agents (MCP)

stitchgraph is MCP-native: every operation above is an MCP tool with the same name and the same JSON envelope. Launch the server pointed at a built index (build it first with reindex — the server refuses to answer from a missing or never-indexed database rather than confidently reporting an empty codebase):

pip install 'stitchgraph[mcp,treesitter]'
stitchgraph reindex /path/to/project --db /path/to/stitchgraph.db
stitchgraph-mcp --db /path/to/stitchgraph.db     # or env STITCHGRAPH_DB

Claude Desktop / Claude Code configuration:

{
  "mcpServers": {
    "stitchgraph": {
      "command": "stitchgraph-mcp",
      "args": ["--db", "/absolute/path/to/stitchgraph.db"]
    }
  }
}
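
The configuration above uses an absolute database path; a relative path would be resolved against whatever working directory the client launches the server from. As a small convenience sketch, the snippet below renders the same JSON with the path resolved to absolute form. The helper name `mcp_config` is hypothetical; the keys and the `stitchgraph-mcp --db` invocation are taken from the example above.

```python
import json
from pathlib import Path


def mcp_config(db_path: str) -> str:
    """Render the `mcpServers` entry shown above with an absolute --db path."""
    absolute = str(Path(db_path).resolve())
    config = {
        "mcpServers": {
            "stitchgraph": {
                "command": "stitchgraph-mcp",
                "args": ["--db", absolute],
            }
        }
    }
    return json.dumps(config, indent=2)


rendered = mcp_config("stitchgraph.db")
```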

Rules of engagement for agents

The full rule file — written to be dropped into an agent's context — is AGENTS.md. The essentials:

Query the graph before grepping. orient first on unfamiliar code; impact_of <name> before editing anything; get_callers/get_callees instead of text search; trace_path for end-to-end flows.
Respect the envelope. needs_review: true means "unreached by my analysis", not "proven dead" — verify dynamic dispatch, plugins, and framework callbacks before acting. confidence and provenance (extracted > inferred > ambiguous) tell you whether a result is a fact or a ranked guess.
Never delete on find_stale alone. It is precision-biased and advisory by design; treat results as candidates to verify.
Use scan for triage, ordered by urgency (🔴 fix now / 🟠 look closer / 🟢 cleanup); a finding capped 🟢 with needs_review rests on name-ambiguous edges and is likely a resolution artifact.
Prefer select_tests over "run everything" when a coverage artifact exists — it returns the tests that actually executed the changed symbols.
Refusals are honest: a bare-name collision, a too-broad get_matrix scope, or a missing index returns an explanation and a suggested next call, not a guess.
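
An agent can encode the rules above as a small gate that never returns "delete". The sketch below assumes the JSON envelope is a dict whose keys match the field names used in this document (`ok`, `confidence`, `provenance`, `needs_review`); that key layout, the function name `stale_action`, the returned labels, and the 0.9 threshold are all assumptions for illustration.

```python
def stale_action(envelope: dict, min_confidence: float = 0.9) -> str:
    """Suggest a next step for one find_stale-style envelope.

    Never returns 'delete': results are candidates to verify, not verdicts.
    """
    if not envelope.get("ok", False):
        return "read-refusal"        # refusals carry an explanation; act on it
    if envelope.get("needs_review", True):
        return "verify-dynamic-use"  # unreached by analysis, not proven dead
    if envelope.get("provenance") != "extracted":
        return "verify-resolution"   # inferred/ambiguous: a ranked guess
    if envelope.get("confidence", 0.0) < min_confidence:
        return "verify-resolution"
    return "candidate-for-human-review"
```

Missing keys default to the cautious branch, so a malformed or partial envelope is routed to verification rather than treated as a confident result.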

Trust model

The envelope. Every answer: ok, result, confidence (0–1), provenance (extracted = read from syntax; inferred = heuristic; ambiguous = several candidates), needs_review + human-readable reasons, and for findings an urgency. Provenance caps urgency — a heuristic link can never shout RED.
The cardinal rule. Live code is never confidently flagged dead. Dozens of per-language liveness signals (exports, framework callbacks, dunders/magic methods, FFI/linker attributes, test conventions…) root the graph; ambiguity widens edges rather than dropping them. The deliberate trade-offs are documented — decision by decision — in LIMITATIONS.md.
Read-only, local, private. No code leaves your machine; nothing executes; the only file written is the index (plus explicitly requested reports/kits).
Verified. ~2,300 tests including differential oracles (streaming index == in-memory, incremental == full reindex, GraphBLAS == pure Python), per-language completeness batteries, and ground-truthing against ~47 real projects (Linux kernel core, WordPress, Magento, NestJS…) with zero crashes. Hostile inputs degrade to a smaller index, never a wrong confident answer.
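
The rule that provenance caps urgency can be pictured as a ceiling lookup. The sketch below is a hypothetical model, not stitchgraph's data structure: the `Finding` dataclass, the lowercase urgency names, and the exact cap table are assumptions. The text above only states that a heuristic link can never be RED and that name-ambiguous findings are capped green; the rest of the mapping is filled in for illustration.

```python
from dataclasses import dataclass, field

URGENCY_ORDER = ["green", "orange", "red"]  # ascending severity

# Assumed ceilings per provenance level (illustrative, see lead-in).
PROVENANCE_CAP = {"extracted": "red", "inferred": "orange", "ambiguous": "green"}


@dataclass
class Finding:
    result: str
    confidence: float
    provenance: str
    urgency: str
    needs_review: bool = False
    reasons: list[str] = field(default_factory=list)


def cap_urgency(finding: Finding) -> Finding:
    """Lower a finding's urgency to the ceiling its provenance allows."""
    cap = PROVENANCE_CAP.get(finding.provenance, "green")
    if URGENCY_ORDER.index(finding.urgency) > URGENCY_ORDER.index(cap):
        # Record why the urgency changed so the reader can double-check it.
        finding.reasons.append(
            f"urgency capped at {cap}: provenance is {finding.provenance}"
        )
        finding.urgency = cap
    return finding


capped = cap_urgency(Finding("cycle in billing", 0.6, "inferred", "red"))
```

Unknown provenance values fall back to the lowest ceiling, which mirrors the document's bias toward under-claiming.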

Languages

Depth	Languages
Deep (stdlib `ast`; optional jedi `--precise`)	Python 3.11+
Full graph via tree-sitter (definitions, calls, imports/inheritance, tests, body matrix)	JavaScript, TypeScript/TSX, Go, Rust, C, C++, C#, Java, Ruby, PHP, Bash
Cross-language seams	Flask/FastAPI/Django/Express/Spring routes, HTML forms, JS `fetch`, events, SQL (sqlglot), SQLAlchemy/Django ORM — all converging in one graph, so `trace_path` crosses language boundaries

Per-language support matrix: docs/LANGUAGES.md.
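
For a sense of what stdlib `ast` analysis looks like, the sketch below extracts (enclosing function, called name) pairs from Python source. It is a deliberately naive illustration, not stitchgraph's extractor: the function name `extract_calls` is hypothetical, method calls are reduced to their attribute name, and calls inside a nested function are attributed to both the inner and the outer function.

```python
import ast


def extract_calls(source: str) -> list[tuple[str, str]]:
    """Return sorted, de-duplicated (function, called name) pairs."""
    edges = []
    tree = ast.parse(source)
    for func in ast.walk(tree):
        if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        for node in ast.walk(func):
            if isinstance(node, ast.Call):
                target = node.func
                if isinstance(target, ast.Name):         # plain call: f(x)
                    edges.append((func.name, target.id))
                elif isinstance(target, ast.Attribute):  # method call: obj.f(x)
                    edges.append((func.name, target.attr))
    return sorted(set(edges))


sample = '''
def load_users(db):
    rows = db.query("select * from users")
    return [normalize(r) for r in rows]

def normalize(row):
    return dict(row)
'''
edges = extract_calls(sample)
```

Resolving a bare name such as `query` to a specific definition is the hard part that this sketch skips entirely.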

Scale

reindex streams the graph to SQLite in constant memory (auto-enabled for large on-disk trees): a 4,300-file Magento module indexes in 269 MB peak instead of 3.2 GB, byte-identical output, pinned by a differential oracle. Query sweeps stream their adjacency too — a 16M-edge graph (Home Assistant scale) is queried in ~2 GB. Details: docs/V2_STREAMING_DESIGN.md.
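
The general technique of streaming rows into SQLite with bounded memory can be sketched with the standard library: consume an iterator in fixed-size batches and commit each batch, so peak memory tracks the batch size rather than the total edge count. The table name, column names, and helper names below are hypothetical and do not reflect stitchgraph's schema or its streaming design.

```python
import sqlite3
from itertools import islice
from typing import Iterable, Iterator


def stream_edges(
    conn: sqlite3.Connection,
    edges: Iterable[tuple[str, str]],
    batch_size: int = 1000,
) -> int:
    """Insert edges batch by batch; return the number of rows written."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS edge (src TEXT NOT NULL, dst TEXT NOT NULL)"
    )
    iterator = iter(edges)
    total = 0
    while True:
        batch = list(islice(iterator, batch_size))  # at most batch_size rows held
        if not batch:
            break
        with conn:  # one transaction per batch, committed on exit
            conn.executemany("INSERT INTO edge (src, dst) VALUES (?, ?)", batch)
        total += len(batch)
    return total


def fake_edges(n: int) -> Iterator[tuple[str, str]]:
    """Generate a synthetic chain of call edges lazily."""
    for i in range(n):
        yield (f"fn_{i}", f"fn_{i + 1}")


conn = sqlite3.connect(":memory:")
written = stream_edges(conn, fake_edges(2500), batch_size=1000)
row_count = conn.execute("SELECT COUNT(*) FROM edge").fetchone()[0]
```

Because the input is a generator, the full edge list never exists in memory at once; only the current batch does.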

Develop

pip install -e '.[all,dev]'
PYTHONPATH=src python -m pytest -q

CI runs the suite on Python 3.11/3.12 plus a no-extras job that guards the stdlib-only core. Design: docs/design.md · capability map: docs/OVERVIEW.md · status/roadmap: docs/STATUS.md · release history: CHANGELOG.md and docs/RELEASE_NOTES_v*.md · review process: CONTRIBUTING.md, REVIEW_HISTORY.md.

MIT licensed.

Release history

This version

3.27.1

Jul 3, 2026

Download files

Source Distribution

stitchgraph-3.27.1.tar.gz (1.0 MB)

Uploaded Jul 3, 2026 Source

Built Distribution

stitchgraph-3.27.1-py3-none-any.whl (320.4 kB)

Uploaded Jul 3, 2026 Python 3

File details

Details for the file stitchgraph-3.27.1.tar.gz.

File metadata

Download URL: stitchgraph-3.27.1.tar.gz
Upload date: Jul 3, 2026
Size: 1.0 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for stitchgraph-3.27.1.tar.gz
Algorithm	Hash digest
SHA256	`af786ccbd7745c0ee92f1ebe2da3f5e4103bd97745f11dfbea6ff0a2229cf5d7`
MD5	`b25f002365bde7ac1ce153c6ea269ac9`
BLAKE2b-256	`b997ce9e8e8cf14b6be8e960b1fbb3e4ae30c894e627884f9e346a5ecd8ad3de`

File details

Details for the file stitchgraph-3.27.1-py3-none-any.whl.

File metadata

Download URL: stitchgraph-3.27.1-py3-none-any.whl
Upload date: Jul 3, 2026
Size: 320.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for stitchgraph-3.27.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`dbadd8ac4ab43eb2bdc2ed3a663642c1eae2023931b8d5a7475b16d125e5d954`
MD5	`f0455dc314ee01ebcf38af8c77f4be0b`
BLAKE2b-256	`de6a7ed28722e36c553daa5722cf8e5d7054f6bf823b05814e0061f0880ef7b2`
