Apex-relevance subgraph retrieval for AI agents. Feed your LLM the peak of your knowledge graph, sized to a token budget.

Maintainers

alfonsomayoral

Project description

Graphex

Knowledge graphs grow large. When an agent needs context about one corner of a codebase, dumping the whole graph into the prompt wastes tokens and money — and buries the relevant nodes in noise. Graphex scores every node against your query and returns the most relevant, connected subgraph that fits within a token budget, ready to paste into a prompt or serve over MCP.

graphex index .                            # build a graph from your code (no LLM)
graphex "how does auth work" --budget 4000 # retrieve the apex subgraph
graphex serve                              # expose it to agents over MCP

Graphex reads the graphs produced by graphify and uses the rich signals graphify emits — edge weights, confidence, hyperedges, communities, and god nodes — that simpler tools throw away.

Install

uv tool install apexgraph            # or: pipx install apexgraph
# optional extras:
uv tool install "apexgraph[local]"   # offline semantic recall (model2vec)
uv tool install "apexgraph[ts]"      # better TypeScript indexing (tree-sitter)
uv tool install "apexgraph[dense]"   # cloud embeddings (OpenAI / Voyage AI)

The PyPI distribution is apexgraph; the command and import name are graphex. Requires Python 3.12+.

How it works

A five-stage pipeline, each stage a single-responsibility module:

  load ─▶ score ─▶ select ─▶ inject ─▶ render
   │        │         │         │         │
 multi-   BM25 →    cost-aware  source-   markdown /
 format   PPR +     MMR under   code      json / yaml
 loader   prior     budget      bodies
                        ▲
   index ───────────────┘   build a graph straight from code (no graphify)

Relevance is one principled number, not a hand-tuned mix. BM25 finds the nodes the query is literally about; those seed a Personalized PageRank walk that spreads relevance across the weighted graph (edge weight × confidence, plus hyperedge cliques); a light importance/god-node prior nudges genuinely central entities up. The query-independent half — global PageRank, the BM25 inverted index — is precomputed once and cached, invalidated by content hash, so a query is just a lookup plus one walk.
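
The seeded walk described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Graphex's implementation: the function name `personalized_pagerank`, the edge tuple layout, the damping factor of 0.85, and the fixed iteration count are all hypothetical, and hyperedge cliques and the importance prior are left out.

```python
from collections import defaultdict


def personalized_pagerank(edges, seeds, damping=0.85, iters=50):
    """Power-iteration Personalized PageRank on an undirected weighted graph.

    edges: iterable of (u, v, weight, confidence) tuples (assumed layout)
    seeds: dict of node -> non-negative seed score, e.g. a BM25 score
    """
    # Effective edge strength is weight x confidence, as in the text.
    adj = defaultdict(dict)
    for u, v, weight, confidence in edges:
        strength = weight * confidence
        adj[u][v] = adj[u].get(v, 0.0) + strength
        adj[v][u] = adj[v].get(u, 0.0) + strength

    nodes = set(adj) | set(seeds)
    total = sum(seeds.values())
    if total <= 0:
        raise ValueError("at least one seed needs a positive score")

    # Teleport distribution: seed scores normalized to sum to 1.
    teleport = {n: seeds.get(n, 0.0) / total for n in nodes}
    rank = dict(teleport)

    for _ in range(iters):
        nxt = {n: (1.0 - damping) * teleport[n] for n in nodes}
        for u in nodes:
            out_edges = adj.get(u, {})
            out = sum(out_edges.values())
            if out == 0.0:
                # Dangling node: hand its mass back to the seeds.
                for n in nodes:
                    nxt[n] += damping * rank[u] * teleport[n]
                continue
            for v, strength in out_edges.items():
                nxt[v] += damping * rank[u] * strength / out
        rank = nxt
    return rank


# Hypothetical four-node chain; only "login" matched the query lexically.
edges = [
    ("login", "session", 1.0, 0.9),
    ("session", "db", 1.0, 0.5),
    ("db", "logger", 1.0, 1.0),
]
rank = personalized_pagerank(edges, seeds={"login": 2.0})
```

The scores form a probability distribution over nodes, and a node far from the seed such as `logger` receives less mass than the nodes close to it.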

Selection is a budgeted knapsack, solved as one. Picking the highest-value set of nodes under a token ceiling is the 0/1 knapsack problem. Graphex selects by marginal value per token and shapes the result with two terms — an MMR penalty so it doesn't say the same thing twice, and a connectivity bonus so the result is a coherent connected subgraph, not a bag of redundant islands. An exact DP-knapsack mode is available for benchmarking the value ceiling.
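
One way to read "marginal value per token" is a greedy loop that re-scores the remaining candidates after every pick. The sketch below assumes that reading; the function name `select_subgraph`, the two weights, and the input shapes are hypothetical, and the exact DP-knapsack mode is not shown.

```python
def select_subgraph(nodes, neighbors, similarity, budget,
                    mmr_weight=0.5, link_bonus=0.2):
    """Greedy budgeted selection by marginal value per token.

    nodes:      dict of id -> (relevance, token_cost)
    neighbors:  dict of id -> set of adjacent ids
    similarity: callable (a, b) -> float in [0, 1]
    """
    chosen, used = [], 0
    remaining = set(nodes)
    while remaining:
        best, best_ratio = None, 0.0
        for n in sorted(remaining):  # sorted for deterministic tie-breaking
            relevance, cost = nodes[n]
            if cost <= 0 or used + cost > budget:
                continue  # would overflow the budget
            # MMR penalty: similarity to the closest already-chosen node.
            redundancy = max((similarity(n, c) for c in chosen), default=0.0)
            # Connectivity bonus: touches something already chosen.
            linked = any(c in neighbors.get(n, ()) for c in chosen)
            value = relevance - mmr_weight * redundancy
            if linked:
                value += link_bonus
            ratio = value / cost
            if ratio > best_ratio:
                best, best_ratio = n, ratio
        if best is None:
            break  # nothing left that fits and still adds value
        chosen.append(best)
        used += nodes[best][1]
        remaining.remove(best)
    return chosen, used


# Hypothetical candidates: a near-duplicate and an oversized node.
nodes = {
    "login": (1.0, 100),
    "login_copy": (0.9, 100),
    "session": (0.6, 100),
    "huge": (2.0, 1000),
}
neighbors = {"login": {"session"}, "session": {"login"}}


def similarity(a, b):
    return 1.0 if {a, b} == {"login", "login_copy"} else 0.0


chosen, used = select_subgraph(nodes, neighbors, similarity, budget=200)
```

In this example `session` is picked ahead of the higher-scoring `login_copy`, because the duplicate is penalized and the neighbor earns the connectivity bonus, while `huge` never fits the budget.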

Token accounting is honest. A node's cost is the size of its final rendered form, including any injected source code — so tokens_used never lies and the output never overflows the budget you asked for.
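
The accounting rule can be illustrated as follows: cost is measured on the rendered text, so injecting source code raises a node's cost before selection rather than after. The helper names and the whitespace tokenizer are stand-ins chosen for this sketch; a real tokenizer would give different counts.

```python
def render_node(name, summary, source=None):
    """Render a node the way it would appear in the final output."""
    lines = [f"### {name}", summary]
    if source is not None:
        lines += ["", source]  # injected function body, if requested
    return "\n".join(lines)


def count_tokens(text):
    # Stand-in tokenizer (assumption): one token per whitespace-separated word.
    return len(text.split())


def node_cost(name, summary, source=None):
    # Cost is taken from the rendered form, not from the raw summary.
    return count_tokens(render_node(name, summary, source))


bare = node_cost("login", "Validates credentials.")
with_code = node_cost(
    "login",
    "Validates credentials.",
    "def login(user): return check(user)",
)
```

Because the selector would see `with_code` rather than `bare` as the node's cost, the reported total and the rendered output cannot drift apart.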

Semantic recall, optionally offline. By default retrieval is lexical (BM25, with stemming). Add --backend local for offline embeddings (model2vec, no API key, no network) so a query finds what it's about even with no shared tokens — "authorization gate" surfaces the auth code. The lexical and semantic rankings are fused with Reciprocal Rank Fusion. Cloud embeddings (openai, voyage) are also available behind the [dense] extra.
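
Reciprocal Rank Fusion combines rankings using only positions, so BM25 scores and embedding similarities never need to share a scale. A minimal sketch follows; the function name is hypothetical, and `k=60` is the constant commonly used in the literature, not necessarily the value Graphex uses.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists: each list adds 1 / (k + rank) to a node's score."""
    scores = {}
    for ranking in rankings:
        for rank, node in enumerate(ranking, start=1):
            scores[node] = scores.get(node, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by name for determinism.
    return sorted(scores, key=lambda n: (-scores[n], n))


# Hypothetical rankings for one query.
lexical = ["login", "session", "logger"]
semantic = ["auth_gate", "login", "session"]
fused = reciprocal_rank_fusion([lexical, semantic])
```

Nodes that appear in both lists rise to the top, while `auth_gate`, found only by the semantic ranking, still makes it into the fused list.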

Usage

# Index a project into a graphify-compatible graph.json (Python / TS / Go)
graphex index ./src -o graph.json
graphex index ./src --incremental          # re-index only changed files

# Query (any unrecognised first arg routes here)
graphex "session token validation" -b 2000
graphex "authorization gate" --backend local # offline semantic recall (no shared tokens needed)
graphex "auth flow" --explain               # per-node BM25 / PPR / prior breakdown
graphex "auth flow" --inject-code           # include real function bodies, still in budget
graphex "auth flow" --connected             # stitch toward a connected subgraph (best-effort)
graphex "auth flow" --viz                   # interactive force-directed HTML

# Inspect (node ids come from your indexed graph; these match examples/)
graphex stats -g examples/sample_graph.json
graphex explain auth_service_login -g examples/sample_graph.json
graphex path auth_service auth_service_login -g examples/sample_graph.json

# Export a context block to paste into a system prompt / CLAUDE.md
graphex export "auth flow" -f claudemd -o CONTEXT.md

# Measure quality honestly (recall@budget, not just tokens saved)
graphex benchmark -q "auth flow" -q "db pooling" -b 1000 -b 4000

# Compare two graph versions and see the change impact
graphex diff old.json new.json --budget 2000

See examples/ for a full walkthrough on a sample project.

MCP server

Graphex speaks the Model Context Protocol over stdio (stdlib only, no SDK):

graphex serve --graph graph.json

It exposes four tools: graphex_query, graphex_explain, graphex_path, graphex_stats. Register it with Claude Code:

claude mcp add graphex -- graphex serve --graph /abs/path/to/graph.json
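
For clients other than Claude Code, a tool invocation is a JSON-RPC 2.0 `tools/call` request written to the server's stdin as a single line. The sketch below only builds that message; it is not a complete client (MCP also requires an initialization handshake first), and the argument names `query` and `budget` are assumptions, since the tool schemas are not documented here.

```python
import json


def tools_call_request(request_id, tool, arguments):
    """Build one newline-free JSON-RPC 2.0 message for an MCP tool call."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })


# Argument names are hypothetical; check the schema the server advertises.
line = tools_call_request(
    1, "graphex_query", {"query": "auth flow", "budget": 2000}
)
```

A client would write `line` followed by a newline to the `graphex serve` process and read the response from its stdout.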

Honest benchmarking

"Tokens saved" is a vanity metric — a tool that returns nothing saves 100%. Graphex reports recall@budget alongside it: how much of the relevant set the budgeted subgraph actually captures. High savings with low recall means under-retrieval, and the benchmark makes that trade-off visible.
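
The trade-off can be made concrete with a few lines of Python. The function names and the sample node ids are hypothetical; the metrics themselves are the standard set-based definitions of recall and precision.

```python
def recall_and_precision(returned, relevant):
    """Set-based recall and precision for one budgeted result."""
    returned, relevant = set(returned), set(relevant)
    hits = len(returned & relevant)
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(returned) if returned else 0.0
    return recall, precision


def tokens_saved(tokens_used, full_graph_tokens):
    """Fraction of the full graph's tokens that were not sent."""
    return 1.0 - tokens_used / full_graph_tokens


relevant = {"login", "session", "token"}

# A tool that returns nothing: maximal savings, zero recall.
empty_saved = tokens_saved(0, 50_000)
empty_recall, empty_precision = recall_and_precision([], relevant)

# A budgeted result with two of the three relevant nodes.
recall, precision = recall_and_precision(
    ["login", "session", "logger"], relevant
)
```

The empty result saves every token and captures nothing, which is why savings need to be read next to recall.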

A reproducible head-to-head against slurp lives in bench/. The honest takeaway: slurp posts higher raw recall by padding the budget with low-relevance nodes (≈8% precision, so most of what it returns is off-topic), while Graphex is 4–10× more precise under budget. Graphex's local backend also recovers relevant nodes on semantic queries where lexical retrieval (slurp's TF-IDF and Graphex's own BM25) scores zero.

Development

uv sync
uv run pytest          # test suite
uv run ruff check .    # lint
uv run black .         # format

License

MIT © Alfonso Mayoral

Release history

0.3.0 (Jun 17, 2026)
0.2.0 (Jun 16, 2026), this version
0.1.0 (Jun 16, 2026)

Download files

Source Distribution

apexgraph-0.2.0.tar.gz (278.2 kB)

Uploaded Jun 16, 2026 Source

Built Distribution

apexgraph-0.2.0-py3-none-any.whl (86.5 kB)

Uploaded Jun 16, 2026 Python 3

File details

Details for the file apexgraph-0.2.0.tar.gz.

File metadata

Download URL: apexgraph-0.2.0.tar.gz
Upload date: Jun 16, 2026
Size: 278.2 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for apexgraph-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`1a937406a8c213869fcc8f8c1db61de48134eefd7d7c2941cc9d0c42c2b6509c`
MD5	`18d7d8138c0cd5f99696e200048267ab`
BLAKE2b-256	`594459f0fb7189b8a6f04b926926a13ccd4cee8227351f31cfdcc8bd1d2abc2a`

Provenance

The following attestation bundles were made for apexgraph-0.2.0.tar.gz:

Publisher: publish.yml on alfonsomayoral/graphex

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: apexgraph-0.2.0.tar.gz
- Subject digest: 1a937406a8c213869fcc8f8c1db61de48134eefd7d7c2941cc9d0c42c2b6509c
- Sigstore transparency entry: 1837599556
- Sigstore integration time: Jun 16, 2026
Source repository:
- Permalink: alfonsomayoral/graphex@d530e8c46868adfae81bcc45c2735bfadd44c38d
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/alfonsomayoral
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@d530e8c46868adfae81bcc45c2735bfadd44c38d
- Trigger Event: release

File details

Details for the file apexgraph-0.2.0-py3-none-any.whl.

File metadata

Download URL: apexgraph-0.2.0-py3-none-any.whl
Upload date: Jun 16, 2026
Size: 86.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for apexgraph-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`c64a13ff879bd8ba37c5c52e7e156216577ecdb3d5c3c101661d3c66b82353f1`
MD5	`41c4bc15f23438d6812532de9d9f4f3a`
BLAKE2b-256	`4dcc03f51e0e34b5edc6e903b104bf5cea03db048aad65406aff4a9973e54a35`

Provenance

The following attestation bundles were made for apexgraph-0.2.0-py3-none-any.whl:

Publisher: publish.yml on alfonsomayoral/graphex

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: apexgraph-0.2.0-py3-none-any.whl
- Subject digest: c64a13ff879bd8ba37c5c52e7e156216577ecdb3d5c3c101661d3c66b82353f1
- Sigstore transparency entry: 1837599656
- Sigstore integration time: Jun 16, 2026
Source repository:
- Permalink: alfonsomayoral/graphex@d530e8c46868adfae81bcc45c2735bfadd44c38d
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/alfonsomayoral
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@d530e8c46868adfae81bcc45c2735bfadd44c38d
- Trigger Event: release
