Code graph extraction for LLM-assisted debugging
ContextAI
Turn a codebase into a directed property graph so an LLM gets surgical, structured, bounded context (the broken node plus its neighbors) instead of the whole repository.
Why
LLMs fail on large codebases for three reasons:
- Too much context → noise, confusion, hallucination
- Too little context → blind spots, wrong fixes
- No structure → the model can't see how components relate
Bugs don't live in isolation — they live in the relationships between components. ContextAI makes those relationships explicit and traversable by representing every meaningful piece of code as a node and every relationship as an edge. When debugging, you feed the LLM only the broken node, its adjacent nodes, and the connecting edges.
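As a minimal sketch of the "broken node plus its neighbors" idea, the snippet below selects a one-hop neighborhood from a toy edge list. The node ids, the triple layout, and the `neighborhood` helper are all hypothetical and exist only for illustration; they are not ContextAI's API.

```python
# Hypothetical toy graph: (source id, edge type, target id) triples.
EDGES = [
    ("route:/checkout", "HANDLES", "fn:checkout"),
    ("fn:checkout", "CALLS", "fn:charge_card"),
    ("fn:checkout", "READS", "table:orders"),
    ("fn:charge_card", "CALLS_EXTERNAL", "lib:payments"),
    ("fn:send_receipt", "CALLS", "fn:render_email"),
]


def neighborhood(edges, broken_node):
    """Return the broken node, its direct neighbors, and the connecting edges."""
    # Keep only edges that start or end at the broken node.
    touching = [e for e in edges if broken_node in (e[0], e[2])]
    nodes = {broken_node}
    for src, _edge_type, dst in touching:
        nodes.update((src, dst))
    return {"nodes": sorted(nodes), "edges": touching}


context = neighborhood(EDGES, "fn:checkout")
```

Here `context` holds four nodes and three edges; unrelated code such as `fn:send_receipt` never reaches the prompt.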
What it does
source code ──▶ extraction pipeline ──▶ directed property graph ──▶ bounded LLM context
                (AST · framework ·      (nodes + edges,             (a node + its
                 patterns · runtime)     fully typed schema)         neighborhood)
ContextAI scans a Python project — .py modules and .ipynb notebooks — and emits a typed graph (graph.json) plus an interactive visualization (graph.html). Each node carries its signature, side effects, error-handling profile, and complexity; each edge carries its contract, criticality, and failure behavior.
Quickstart
Requirements: Python 3.11+
# install
pip install -e . # or: pip install -r requirements.txt
# extract a graph from any file or directory
contextai path/to/your/project/ # or: python3 run.py path/to/your/project/
# outputs (under out/, gitignored):
# out/graph.json — all nodes + edges
# out/index.html — dashboard: coverage · connectivity health · interactive graph
# out/graph.html — raw interactive graph (pyvis, self-contained)
Open out/index.html for the full picture — extraction coverage, connectivity health
(fragmentation, isolated nodes, sinks/sources), node-type distribution, and the live graph
with a code/location details panel, all in one self-contained file.
Run it against the bundled benchmark (the VS Code Flask tutorial):
python3 run.py python-sample-vscode-flask-tutorial/
Architecture
| Layer | Tools | Gets you | Status |
|---|---|---|---|
| 1. AST + types | Python ast (.py + .ipynb) | functions, classes, imports, call sites, signatures, side effects | ✅ Implemented |
| 2. Framework conventions | custom parsers | route → handler, templates, static assets (Flask) | ✅ Flask; others planned |
| 2.5. Pattern matching | regex on source | DB tables, hardcoded URLs, cache ops, template refs | 🟡 Partial |
| 3. Runtime tracing | sys.settrace + asyncio hooks | actual call chains, dynamic dispatch, async fan-out | ✅ Implemented |
| 4. LLM pass | Claude / GPT | semantic intent, implicit relationships | ⏳ Planned |
No single method captures everything: static analysis sees what code says, runtime tracing sees what it does. ContextAI merges both into one graph.
Pipeline
run.py
├─ Pass 1: walk files → emit NODES (ast_extractor, flask_convention_extractor)
└─ Pass 2: resolve IDs → emit EDGES (edge_extractor)
graph/extractors/runtime/ ← trace a running app and merge real call chains
├─ tracer.py sys.settrace + asyncio monkey-patches
├─ call_log.py captured calls → call_log.json
├─ script_runner.py drive a target script/entry point under the tracer
└─ edge_injector.py merge runtime calls into the static graph
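The two-pass shape above (nodes first, then edges resolved against known ids) can be sketched with the standard `ast` module. This is an illustration only: the `extract_nodes` and `extract_edges` helpers, the `demo` module name, and the dict layout are assumptions, and the real extractors cover far more than plain function calls.

```python
import ast

SOURCE = '''
def load(path):
    return open(path).read()

def parse(text):
    return text.split()

def main(path):
    return parse(load(path))
'''


def extract_nodes(tree, module):
    """Pass 1: one FUNCTION node per def, keyed by a module-qualified id."""
    return {
        f"{module}.{fn.name}": {"type": "FUNCTION", "name": fn.name, "line": fn.lineno}
        for fn in ast.walk(tree)
        if isinstance(fn, ast.FunctionDef)
    }


def extract_edges(tree, module, nodes):
    """Pass 2: resolve simple-name call sites against known node ids."""
    edges = set()
    for fn in ast.walk(tree):
        if not isinstance(fn, ast.FunctionDef):
            continue
        for call in ast.walk(fn):
            if isinstance(call, ast.Call) and isinstance(call.func, ast.Name):
                target = f"{module}.{call.func.id}"
                if target in nodes:  # unresolved names (e.g. builtins) are skipped
                    edges.add((f"{module}.{fn.name}", "CALLS", target))
    return sorted(edges)


tree = ast.parse(SOURCE)
nodes = extract_nodes(tree, "demo")
edges = extract_edges(tree, "demo", nodes)
```

Running edge resolution as a second pass means every call target can be checked against the complete node set, regardless of definition order in the file.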
The graph schema
schema.py is the source of truth (Pydantic v2).
Nodes carry: id, type, name, location, code, signature (typed inputs/outputs), side_effects, error_handling, and metadata (complexity, test coverage, staleness).
Edges carry: id, type, from, to, direction, contract (input/output shape), criticality, on_failure (retry / default / throw / circuit-break), and performance.
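To make the field lists concrete, here is a dependency-free sketch that mirrors them with standard-library dataclasses. It is not schema.py: the real schema is Pydantic v2, and the field types, default values, and the `from_` rename below are assumptions made for illustration.

```python
from dataclasses import asdict, dataclass, field
from typing import Any


@dataclass
class Node:
    id: str
    type: str
    name: str
    location: str
    code: str = ""
    signature: dict[str, Any] = field(default_factory=dict)   # typed inputs/outputs
    side_effects: list[str] = field(default_factory=list)
    error_handling: dict[str, Any] = field(default_factory=dict)
    metadata: dict[str, Any] = field(default_factory=dict)    # complexity, coverage, staleness


@dataclass
class Edge:
    id: str
    type: str
    from_: str  # "from" is a Python keyword, so this sketch renames the field
    to: str
    direction: str = "forward"   # assumed default
    contract: dict[str, Any] = field(default_factory=dict)
    criticality: str = "normal"  # assumed default
    on_failure: str = "throw"    # retry / default / throw / circuit-break
    performance: dict[str, Any] = field(default_factory=dict)

    def to_json(self) -> dict[str, Any]:
        """Serialize with the wire name "from" instead of "from_"."""
        data = asdict(self)
        data["from"] = data.pop("from_")
        return data


node = Node(id="fn:checkout", type="FUNCTION", name="checkout", location="shop/views.py:10")
edge = Edge(id="e1", type="CALLS", from_="fn:checkout", to="fn:charge_card", on_failure="retry")
payload = edge.to_json()
```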
Node types
- Boundary: API_ENDPOINT, MESSAGE_CONSUMER, CRON_JOB
- Logic: FUNCTION, CLASS, MIDDLEWARE, ROUTE_HANDLER
- Data: SCHEMA, MODEL, DTO, DATABASE, TABLE, COLLECTION
- Infra: MESSAGE_QUEUE, FILE_STORAGE, EXTERNAL_LIBRARY
- Scaffolding: MODULE_INIT, ENTRY_POINT, FILE, TEMPLATE, STATIC_ASSET
Edge types
| Category | Edges |
|---|---|
| Call | CALLS, CALLS_ASYNC, DELEGATES_TO |
| Dependency | IMPORTS, IMPORTS_SIDE_EFFECT, INHERITS, IMPLEMENTS, INSTANTIATES, INJECTS |
| Data | READS, WRITES, VALIDATES, TRANSFORMS, MAPS_TO, RETURNS |
| Communication | HANDLES, GUARDS, PUBLISHES_TO, SUBSCRIBES_TO, CALLS_EXTERNAL, RENDERS, SERVES_STATIC |
| App wiring | USES_APP_INSTANCE |
Analysis tools
All default to graph.json in the current directory.
python3 tools/graph_connectivity.py # health score, isolated nodes, islands
python3 tools/coverage_from_graph.py # % of source lines covered by nodes
python3 tools/graph_duplicates.py # duplicate / overlapping node ranges
python3 tools/diff_graph.py a.json b.json # diff two graphs
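For intuition about what a connectivity check reports, the sketch below finds isolated nodes and islands (weakly connected components) in a graph dict. It assumes a top-level `{"nodes": [...], "edges": [...]}` layout with `id`, `from`, and `to` keys; that layout and the `connectivity` helper are assumptions, not the behavior of tools/graph_connectivity.py.

```python
from collections import defaultdict


def connectivity(graph):
    """Report isolated nodes and islands, ignoring edge direction."""
    node_ids = [n["id"] for n in graph["nodes"]]
    adjacent = defaultdict(set)
    for e in graph["edges"]:
        adjacent[e["from"]].add(e["to"])
        adjacent[e["to"]].add(e["from"])

    # A node with no edges at all is isolated.
    isolated = [n for n in node_ids if not adjacent[n]]

    seen, islands = set(), []
    for start in node_ids:
        if start in seen:
            continue
        # Depth-first walk to collect everything reachable from `start`.
        stack, component = [start], set()
        while stack:
            cur = stack.pop()
            if cur in component:
                continue
            component.add(cur)
            stack.extend(adjacent[cur] - component)
        seen |= component
        islands.append(sorted(component))
    return {"isolated": isolated, "islands": islands}


sample = {
    "nodes": [{"id": "a"}, {"id": "b"}, {"id": "c"}, {"id": "d"}],
    "edges": [{"from": "a", "to": "b"}, {"from": "b", "to": "c"}],
}
report = connectivity(sample)
```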
Runtime tracing
python3 tools/run_with_tracing.py \
--target your_app/main.py \
--project-root your_app/ \
--build-static \
--output graph_runtime.json
Runs your app under the tracer, then merges observed call chains into the static graph (confirming static edges, filling gaps, and adding runtime-only edges).
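The core of a `sys.settrace`-based capture can be sketched in a few lines. This is a simplified illustration under stated assumptions: the `trace_calls` helper is hypothetical, it records bare function names rather than graph node ids, and it covers only synchronous calls on the current thread (no asyncio hooks).

```python
import sys
from collections import Counter


def trace_calls(func, *args, **kwargs):
    """Run func under sys.settrace and count (caller, callee) name pairs."""
    calls = Counter()

    def tracer(frame, event, arg):
        # A "call" event fires whenever a new Python frame is entered.
        if event == "call" and frame.f_back is not None:
            calls[(frame.f_back.f_code.co_name, frame.f_code.co_name)] += 1
        return None  # no per-line tracing is needed for call edges

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.settrace(previous)  # always restore whatever tracer was active
    return result, calls


def helper(x):
    return x * 2


def entry(n):
    total = 0
    for i in range(n):
        total += helper(i)
    return total


result, calls = trace_calls(entry, 3)
```

Each observed pair can then confirm an existing static edge or become a runtime-only edge, with the counter value as its call count.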
Public API
graph/api.py is the only surface the MCP server (and any other client) should import — never reach into extractors or GraphStore directly.
from graph.api import (
build_graph, load_graph, find_node, get_context, list_gaps, get_edge_path,
run_trace, merge_trace, start_trace, stop_trace,
)
Querying. get_context(store, node_id, depth=2, direction="in") returns a bounded subgraph around a node. It is incoming-biased by default: deep predecessors (who calls this — the blast radius), one shallow successor hop, and a successor pull around gap nodes. Pass direction="out" / "both" to change the bias.
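The incoming bias can be pictured as two bounded breadth-first walks with different depth limits. The sketch below is not `get_context`: the `bounded_context` helper is hypothetical, it omits the successor pull around gap nodes, and it assumes that `direction="out"` mirrors the default (deep successors, one predecessor hop).

```python
from collections import deque


def bounded_context(edges, node_id, depth=2, direction="in"):
    """Collect node ids around node_id, going deeper on the biased side."""
    preds, succs = {}, {}
    for src, dst in edges:
        succs.setdefault(src, set()).add(dst)
        preds.setdefault(dst, set()).add(src)

    def walk(adjacency, limit):
        # Breadth-first walk that stops expanding at `limit` hops.
        seen, queue = {node_id}, deque([(node_id, 0)])
        while queue:
            cur, dist = queue.popleft()
            if dist == limit:
                continue
            for nxt in adjacency.get(cur, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, dist + 1))
        return seen

    in_limit = depth if direction in ("in", "both") else 1
    out_limit = depth if direction in ("out", "both") else 1
    return walk(preds, in_limit) | walk(succs, out_limit)


# Hypothetical call chain a → b → c → d → e, plus a second caller x → c.
CALL_EDGES = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("x", "c")]
incoming = bounded_context(CALL_EDGES, "c")
outgoing = bounded_context(CALL_EDGES, "c", direction="out")
```

With the default bias, `incoming` reaches two hops up the caller chain but only one hop down, so the context stays bounded while still covering the blast radius.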
Runtime tracing has three capture modes, all converging on one merge:
| Mode | Entry point | Use it for |
|---|---|---|
| One-shot script / IDE run | run_trace(target, project_root, …) | capture + merge a single script or entry point in one call |
| Long-running session | start_trace(project_root) … stop_trace(project_root, base_graph, output) | a server or worker traced across many requests without restarting; every call in between is unioned into one capture |
| Per-request web | TracingMiddleware / AsyncTracingMiddleware (tools/trace_middleware.py) | trace one request at a time, triggered by an X-Trace: 1 header |
All three feed merge_trace(base_graph, call_log, project_root, output) — the single seam that folds a runtime call log onto a base graph. The base is a parameter: pass the static graph to merge a single action, or a prior runtime graph to accumulate a sequence of actions (call counts sum, edges union). The static graph is never mutated — every merge writes a fresh overlay.
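The merge semantics described above (call counts sum, edges union, base untouched) can be sketched as follows. The `merge_call_log` helper, the call-log record shape, and the `call_count` and `runtime_only` field names are hypothetical; the real `merge_trace` also takes `project_root` and `output` and writes the overlay to disk.

```python
import copy


def merge_call_log(base_graph, call_log):
    """Fold runtime calls onto a copy of base_graph and return the overlay."""
    merged = copy.deepcopy(base_graph)  # the base graph is never mutated
    by_pair = {(e["from"], e["to"]): e for e in merged["edges"]}
    for call in call_log:
        key = (call["from"], call["to"])
        edge = by_pair.get(key)
        if edge is None:
            # Seen at runtime but missing from the base: add a runtime-only edge.
            edge = {"from": key[0], "to": key[1], "type": "CALLS", "runtime_only": True}
            merged["edges"].append(edge)
            by_pair[key] = edge
        edge["call_count"] = edge.get("call_count", 0) + call["count"]
    return merged


base = {"nodes": [], "edges": [{"from": "a", "to": "b", "type": "CALLS"}]}
log = [{"from": "a", "to": "b", "count": 2}, {"from": "b", "to": "c", "count": 1}]

first = merge_call_log(base, log)    # static graph + one action
second = merge_call_log(first, log)  # prior runtime graph + another action
```

Passing `first` back in as the base is what lets a sequence of actions accumulate: counts add up while the edge set stays a union.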
Testing
pip install pytest
pytest -q
The suite (153 tests) is built on an inductive strategy: every atomic extraction pattern (structural, web/API, data access, messaging, signatures, data flow, async runtime) has a minimal fixture and an exact-count assertion. The premise is that an extractor that handles every base case correctly should also handle their combinations.
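A test in this style might look like the sketch below: one atomic pattern, one minimal fixture, one exact-count assertion. The `inherits_edges` function is a hypothetical stand-in for the real extractor, written with the standard `ast` module so the example runs on its own.

```python
import ast

# Minimal fixture: exactly one inheritance relationship.
FIXTURE = "class Base:\n    pass\n\nclass Child(Base):\n    pass\n"


def inherits_edges(source):
    """Stand-in extractor: one INHERITS edge per simple-name base class."""
    edges = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            for base in node.bases:
                if isinstance(base, ast.Name):
                    edges.append((node.name, "INHERITS", base.id))
    return edges


def test_single_inheritance_emits_exactly_one_edge():
    edges = inherits_edges(FIXTURE)
    assert len(edges) == 1  # exact count, not "at least one"
    assert edges[0] == ("Child", "INHERITS", "Base")
```

The exact-count assertion catches both missing edges and spurious duplicates, which an "at least one" check would let through.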
tests/
test_static_induction.py structural · web · data · messaging · signatures · data flow
test_phase2_edges.py call resolution, dynamic dispatch, super(), properties
test_phase3_runtime.py tracer capture + edge-injector merge + end-to-end
test_phase3_http.py HTTP / async routing patterns
test_runtime_api.py public API: direction-aware context, merge/run/session tracing
test_notebook_extractor.py .ipynb flattening + node/edge extraction across cells
test_dashboard.py metrics builder + self-contained dashboard generation
fixtures/ minimal atomic patterns per test
Project structure
schema.py NodeSchema + EdgeSchema (Pydantic, source of truth)
run.py entry point: extract → store → visualize (contextai CLI)
graph/
api.py public API consumed by the MCP server
extractors/
ast_extractor.py Python AST → nodes (functions, classes, schemas, …)
edge_extractor.py all edge types
notebook_extractor.py .ipynb → flatten code cells → reuse AST pipeline
flask_convention_extractor.py templates + static assets
runtime/ sys.settrace tracer + call log + script runner + edge injector
store/graph_store.py NetworkX graph + JSON persistence + direction-aware neighbor traversal
visualizer/visualizer.py pyvis HTML output (out/graph.html)
tools/ connectivity, coverage, duplicates, diff, tracing
benchmarks/flask-tutorial/ hand-authored ground-truth graph (diff target)
dashboard/ self-contained dashboard → out/index.html
metrics.py reuses the coverage + connectivity tools → one payload
dashboard.py / template.html embed graph + metrics into a single HTML file
out/ generated artifacts (gitignored)
docs/ design + planning notes
tests/ inductive test suite + fixtures
Roadmap
- Static extraction (AST) — nodes, edges, signatures, side effects
- Flask framework conventions (routes, templates, static assets)
- Runtime tracing (sync + async call chains)
- Inductive test suite (153 tests)
- Jupyter .ipynb notebook extraction: code cells → AST pipeline, with cell-aware locations
- MCP server: expose the graph to LLM clients as a tool (docs/MCP_SERVER_PLAN.md)
- LLM integration: neighborhood-context retrieval for debugging
- More frameworks (Django, FastAPI), git/version metadata, derived impact edges (AFFECTS, DEPENDS_ON, TRIGGERS)
- Multi-language extraction (JS/TS → UI_COMPONENT)
See docs/ for design and planning notes.
Status
Alpha. The static extractor and runtime tracer work and are covered by tests. The LLM/MCP consumption layer — the part that turns the graph into better debugging answers — is in active development.
License
No license has been chosen yet. Until one is added, all rights are reserved by the author.