Call-graph-aware code context retrieval for AI coding agents (MCP server + CLI)

promptify-cmax

Call-graph-aware code context retrieval for AI coding agents.

When you ask Claude Code or Cursor to fix a specific function, the agent typically falls back to grep and pulls every file that mentions the symbol — specs, plans, ADRs, unrelated definitions that happen to share a name. The model pays attention tax on all of it.

promptify-cmax returns the files most likely to need editing, ranked by distance in the call graph rather than surface name match. It exposes both an MCP server (drop-in for Claude Code, Cursor, Continue) and a CLI.

When to use this (vs. semantic retrieval)

The MCP-server space already has good tools for discovery — open-ended "how does auth work in this codebase?" questions. Tools like zilliztech/claude-context use dense embeddings to find code that's semantically close to your query. That's the right call when you don't yet know the names of the symbols you're looking for.

promptify-cmax solves the complementary problem: editing tasks where you already know the symbol. "Fix threshold_for_complexity" or "why does run_query break when I change Index.upsert?" are tasks with a specific entry-point identifier, and the right files to read are the ones structurally connected to it (callers, callees, and their transitive neighbors). Embeddings can't see structural reachability; they rank by similarity between the query and the code text, which lets unrelated namesakes contaminate results. We run a breadth-first search (BFS) over the call graph with fully-qualified-name (FQN) resolution, so two helper() functions in different files are different graph nodes.

The two approaches are orthogonal and can run side-by-side as separate MCP tools. A capable agent will pick the right one for the task.

If your task looks like…	Use
"How does X work?" / unfamiliar codebase exploration	semantic retrieval (e.g. `claude-context`)
"Fix `func_name`" / "why does Y change when I edit Z?" / known-symbol editing	promptify-cmax
Pattern-matching across the codebase ("find all calls to deprecated API")	`ast-grep` MCP

Why call-graph and not embeddings, for editing tasks

On SWE-bench-Verified Python bug-fix tasks at a 30 000-token budget, structural retrieval surfaces the file the agent needs to edit 24.6 pp more often than substring grep (41.0 % vs 16.4 %). The result is robust across three pre-registered spikes (v0.4 / v5 / v6) at n=250, n=250, and n=127. The v6 verdict is a clean PASS on a 127-instance sample fully disjoint from prior measurement runs:

Budget	grep finds patch	structural finds patch	Δ
5 000 tokens	2.8 %	16.6 %	+13.8 pp
30 000 tokens	16.4 %	41.0 %	+24.6 pp
100 000 tokens	39.1 %	58.6 %	+19.5 pp
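
The Δ column is an absolute difference in percentage points, which is not the same quantity as a relative lift. A minimal sketch that recomputes both from the rates in the table; the helper names `delta_pp` and `relative_lift` are illustrative and not part of the package:

```python
# Rates copied from the table above: (grep %, structural %) per token budget.
rates = {
    5_000: (2.8, 16.6),
    30_000: (16.4, 41.0),
    100_000: (39.1, 58.6),
}


def delta_pp(grep_pct: float, structural_pct: float) -> float:
    """Absolute difference in percentage points (the table's delta column)."""
    return round(structural_pct - grep_pct, 1)


def relative_lift(grep_pct: float, structural_pct: float) -> float:
    """Ratio of the two rates; a different quantity from the pp delta."""
    return round(structural_pct / grep_pct, 2)


for budget, (grep_pct, structural_pct) in rates.items():
    print(budget, delta_pp(grep_pct, structural_pct), relative_lift(grep_pct, structural_pct))
```

The pp delta peaks at the 30 000-token budget, while the relative lift is largest at the smallest budget, where grep rarely finds the file at all.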

Statistics: paired Wilcoxon p = 1.2 × 10⁻⁵, BCa-99 lower bound +10.9 pp, McNemar p = 1.9 × 10⁻⁵, JZS Bayes factor ≈ 4 800, multiverse 5/5 budgets directional. Cross-spike effect-size: +0.170 → +0.213 → +0.246 (consistent across three independent samples).

Audit trail: the public claim above is lifted verbatim from SPIKE-PCM-BENCH-FULLDISJOINT-V6 VERDICT.md §"Construct ceiling". The full v0.4 → v5 → v6 spike chain — including a PARITY verdict (paired-median degenerate on binary outcomes) and a FLAGGED-PASS verdict (overlap > 30 % auto-downgrade) — is preserved in the research-spikes dossier. The discipline (ADR-0025) gates every public-surface number on a closed-go spike's verdict.

What the bench measures: did the agent's structural retrieval surface the file the gold patch actually edits, anywhere in its ranked list, within a 30 000-token budget? It does NOT measure end-to-end editing success (whether the agent ultimately produces a correct fix); SWE-bench's evaluation harness is out of scope. The v3-era "49× lower token cost" framing was empirically falsified at n=109 and is retired.
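
One way to read the metric described above is as a per-instance boolean: walk the ranked list, charge each file's token cost against the budget, and report whether a gold-patch file was reached before the budget ran out. The sketch below is an illustration under assumptions, not the bench harness: the function name, the `token_cost` callback, and the rule that an overflowing file ends the scan are all hypothetical.

```python
from typing import Callable, Iterable, Set


def hit_within_budget(
    ranked_files: Iterable[str],
    gold_files: Set[str],
    token_cost: Callable[[str], int],
    budget: int = 30_000,
) -> bool:
    """Return True if a gold-patch file is reached while consuming files in
    rank order without exceeding the token budget.

    Assumption: a file that would overflow the budget ends the scan. A harness
    could instead skip it and keep going; the README does not say which.
    """
    spent = 0
    for path in ranked_files:
        cost = token_cost(path)
        if spent + cost > budget:
            break
        spent += cost
        if path in gold_files:
            return True
    return False


# Hypothetical token costs for three ranked files.
costs = {"a.py": 10_000, "b.py": 15_000, "c.py": 10_000}
ranked = ["a.py", "b.py", "c.py"]

print(hit_within_budget(ranked, {"b.py"}, costs.__getitem__))           # fits: 25 000 tokens spent
print(hit_within_budget(ranked, {"c.py"}, costs.__getitem__))           # c.py would need 35 000
print(hit_within_budget(ranked, {"c.py"}, costs.__getitem__, 100_000))  # larger budget reaches it
```

Averaging this boolean over all instances gives a rate comparable in kind to the percentages in the table, which is why the same retrieval method scores higher as the budget grows.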

The structural argument independent of the number: a senior engineer fixing a bug doesn't grep for the function name across the repo and read every match. They ask "what calls this, and what does this call?" That's a graph traversal, not a similarity ranking.

Status

v0.3 (general availability) — Python and TypeScript indexing, FQN-aware call resolution, MCP server, ~33 tests. Wedge claim audited via the v0.4 → v5 → v6 spike chain (see "Why call-graph and not embeddings, for editing tasks" above). License: Apache-2.0. Go / Rust / Java / C# planned for Pro tier.

Install

pip install promptify-cmax

Then index your project and wire it into Claude Code / Cursor / Continue. QUICKSTART.md has a five-minute walkthrough with copy-pasteable MCP config snippets and troubleshooting.

What it exposes

CLI:

promptify-cmax index --project-root <dir> — build / incrementally update the structural index (one-time per repo, then automatic-on-change)
promptify-cmax query --project-root <dir> "<task>" — return ranked files for a task description
promptify-cmax serve --project-root <dir> — run as an MCP server over stdio
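
If you want to drive the CLI from a script, the three subcommands above can be wrapped with `subprocess`. This is a sketch that uses only the subcommands and the `--project-root` flag listed above; the helper names are hypothetical, and because this README does not specify the output format of `query`, stdout is returned unparsed.

```python
import subprocess
from typing import List, Optional


def cmax_argv(subcommand: str, project_root: str, task: Optional[str] = None) -> List[str]:
    """Build the argv for one of the three subcommands listed above."""
    if subcommand not in {"index", "query", "serve"}:
        raise ValueError(f"unknown subcommand: {subcommand}")
    argv = ["promptify-cmax", subcommand, "--project-root", project_root]
    if subcommand == "query":
        if not task:
            raise ValueError("query needs a task description")
        argv.append(task)
    return argv


def run_query(project_root: str, task: str) -> str:
    """Run `query` and return raw stdout (requires the package to be installed)."""
    result = subprocess.run(
        cmax_argv("query", project_root, task),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


print(cmax_argv("query", "repo", "Fix `helper`"))
```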

MCP tools (when run as serve):

structural_context(task, top_k=5) — rank files by call-graph distance from the task's identifiers
reindex() — rebuild after large code changes
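
Several MCP clients register stdio servers through a JSON object keyed by `mcpServers`. The sketch below generates such an entry for the `serve` command. Treat it as an assumption: the file location and exact schema differ per client, the server label `promptify-cmax` is an arbitrary choice, and QUICKSTART.md remains the authoritative source for config snippets.

```python
import json


def mcp_server_entry(project_root: str) -> dict:
    """Build a stdio server entry in the common `mcpServers` layout.

    Assumption: the client launches `promptify-cmax serve` as a subprocess
    and talks to it over stdio, as described above.
    """
    return {
        "mcpServers": {
            "promptify-cmax": {
                "command": "promptify-cmax",
                "args": ["serve", "--project-root", project_root],
            }
        }
    }


print(json.dumps(mcp_server_entry("/path/to/repo"), indent=2))
```

Once the client has started the server, the agent can call `structural_context` with a task string and, optionally, `top_k`.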

How it works

Index (one-time per repo, then incremental on file change): tree-sitter walks every Python and TypeScript source file, extracts function definitions, intra-function call sites, and module-level imports; persists everything to a single SQLite file at .promptify/code-index.db.
Resolve (query time): given a natural-language task, extract candidate identifiers (backtick / CamelCase / snake_case / dotted paths) and intersect with the symbols actually in the index.
BFS (query time): walk the call graph two hops in both directions; resolve each call edge to a specific (file, function) tuple via the caller's import bindings and same-file scope, so two functions named helper in different files never collapse into one node.
Rank: group reached nodes by file, sort by (distance ASC, affected-function-count DESC), return the top-k.
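
The Resolve, BFS, and Rank steps above can be sketched in a few dozen lines. This is an illustration over a tiny in-memory graph, not the package's implementation. Several details are assumptions: the regex, matching a dotted identifier by its last segment, treating call edges as undirected during the walk, using a file's minimum node distance, and breaking ties by path.

```python
import re
from collections import defaultdict, deque

# Either a backticked span, or a bare (possibly dotted) identifier.
IDENT = re.compile(r"`([^`]+)`|\b([A-Za-z_]\w*(?:\.[A-Za-z_]\w*)*)\b")


def candidate_identifiers(task):
    """Backticked spans, plus bare words that look like code
    (contain '_' or '.', or have a lowercase-to-uppercase transition)."""
    found = set()
    for ticked, bare in IDENT.findall(task):
        if ticked:
            found.add(ticked)
        elif "_" in bare or "." in bare or re.search(r"[a-z][A-Z]", bare):
            found.add(bare)
    return found


def seed_nodes(identifiers, symbols):
    """Intersect candidates with indexed symbols.
    Assumption: a dotted path matches on its last segment."""
    seeds = set()
    for ident in identifiers:
        seeds.update(symbols.get(ident.rsplit(".", 1)[-1], []))
    return seeds


def bfs_two_way(seeds, edges, max_hops=2):
    """Walk caller->callee edges in both directions; return {node: distance}."""
    neighbors = defaultdict(set)
    for caller, callee in edges:
        neighbors[caller].add(callee)  # forward: what does this call?
        neighbors[callee].add(caller)  # backward: what calls this?
    dist = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand past the hop limit
        for nxt in neighbors[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def rank_files(dist, top_k=5):
    """Group nodes by file; sort by (distance ASC, reached-function count DESC)."""
    best, count = {}, defaultdict(int)
    for (file, _func), d in dist.items():
        best[file] = min(d, best.get(file, d))
        count[file] += 1
    return sorted(best, key=lambda f: (best[f], -count[f], f))[:top_k]


# Hypothetical index: symbol name -> (file, function) nodes, plus call edges.
symbols = {
    "run_query": [("db/query.py", "run_query")],
    "upsert": [("db/index.py", "upsert")],
    "helper": [("db/util.py", "helper"), ("web/util.py", "helper")],
}
edges = {
    (("web/api.py", "handle"), ("db/query.py", "run_query")),
    (("db/query.py", "run_query"), ("db/index.py", "upsert")),
    (("db/index.py", "upsert"), ("db/util.py", "helper")),
    (("web/view.py", "render"), ("web/util.py", "helper")),
}

task = "why does run_query break when I change `Index.upsert`?"
idents = candidate_identifiers(task)
ranked = rank_files(bfs_two_way(seed_nodes(idents, symbols), edges))
print(ranked)
```

In this toy graph `web/util.py` never appears in the result even though it defines a function named `helper`, because no call edge connects it to the seed symbols.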

The discipline that makes this useful: fully-qualified-name resolution, not bare-name matching. A naive call graph treats every def main(): ... in the repo as the same node — typically 100+ collisions in any non-trivial Python project. We resolve through imports, so cross-file false positives don't enter the BFS frontier.
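
To make the difference from bare-name matching concrete, here is a sketch of import-based resolution for Python sources. It swaps in the standard-library `ast` module where the package uses tree-sitter, handles only absolute `from x import y` bindings and direct calls by name, and maps modules to files with a hypothetical `module_to_file` helper. The real resolver is presumably more thorough.

```python
import ast


def module_to_file(module):
    """Hypothetical mapping from a dotted module to a path: 'app.util' -> 'app/util.py'."""
    return module.replace(".", "/") + ".py"


def resolve_calls(file, source):
    """Return call edges as ((file, caller), (file, callee)) tuples."""
    tree = ast.parse(source)
    local_defs = {n.name for n in tree.body if isinstance(n, ast.FunctionDef)}

    # Name bound in this file -> (defining file, original function name).
    imported = {}
    for n in tree.body:
        if isinstance(n, ast.ImportFrom) and n.module and n.level == 0:
            for alias in n.names:
                imported[alias.asname or alias.name] = (module_to_file(n.module), alias.name)

    edges = set()
    for fn in tree.body:
        if not isinstance(fn, ast.FunctionDef):
            continue
        for node in ast.walk(fn):
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                name = node.func.id
                if name in local_defs:
                    target = (file, name)    # same-file definition (simplification)
                elif name in imported:
                    target = imported[name]  # follow the import binding
                else:
                    continue                 # unresolved: drop the edge, do not guess
                edges.add(((file, fn.name), target))
    return edges


# Two files that both define main() and both call something named helper().
sources = {
    "app/a.py": "from app.util import helper\n\ndef main():\n    helper()\n",
    "app/b.py": "def helper():\n    pass\n\ndef main():\n    helper()\n",
}

edges = set()
for file, src in sources.items():
    edges |= resolve_calls(file, src)

for edge in sorted(edges):
    print(edge)
```

The two `main` functions and the two `helper` targets stay distinct nodes, so a BFS that starts at `app/util.py`'s `helper` reaches `app/a.py` but not `app/b.py`.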

Roadmap

Shipped:

Python + TypeScript indexing (v0.1)
FQN-aware call resolution
MCP server, CLI

Planned:

Go, Rust, Java, C# (Pro)
Hosted multi-repo index (Pro)
PR-bot / CI integration (Team)
VSCode + JetBrains extensions

Pro / Team

This package is the open-source core. Promptify is building a hosted layer for teams (multi-repo indexing that survives laptop churn, additional language support, token-savings analytics, editor extensions, SSO/SAML, CI integration). Pricing and signup haven't shipped yet — watch the repo or open an issue if you'd like a heads-up when the hosted tier launches.

Contributing

See CONTRIBUTING.md. Issues and PRs welcome.

License

Apache-2.0.


This version

0.3.0

May 9, 2026


Source Distribution

promptify_cmax-0.3.0.tar.gz (30.2 kB view details)

Uploaded May 9, 2026 Source

Built Distribution


promptify_cmax-0.3.0-py3-none-any.whl (24.0 kB view details)

Uploaded May 9, 2026 Python 3

File details

Details for the file promptify_cmax-0.3.0.tar.gz.

File metadata

Download URL: promptify_cmax-0.3.0.tar.gz
Upload date: May 9, 2026
Size: 30.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for promptify_cmax-0.3.0.tar.gz
Algorithm	Hash digest
SHA256	`8083a30e4435b57153003bc4b284320f2904c4532270d4f0f48cf13667c6ab6f`
MD5	`3e436f15149c656f3b2dc07c88483db4`
BLAKE2b-256	`27bbb720302580f307918d80df8037ffa0a655da299edeaa1063032a9e575101`


File details

Details for the file promptify_cmax-0.3.0-py3-none-any.whl.

File metadata

Download URL: promptify_cmax-0.3.0-py3-none-any.whl
Upload date: May 9, 2026
Size: 24.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for promptify_cmax-0.3.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`b3f300df44e567f54b10ca835d3fb8cbe0d1639f6a4a8369a29ff520e730da99`
MD5	`5550d4cd52a22ee9471cd2ceb0c7bc0b`
BLAKE2b-256	`3b6a42a10038ed917b44a2071b10abdacaca783c77128cb9bfdf5360aa8e0cec`

