Skip to main content

Deterministic codebase context retrieval for LLMs

Project description

llm-diet

PyPI version Python 3.11+ License: MIT Downloads

Give Claude the right context upfront. Fewer turns, faster answers, lower cost.

Deterministic context retrieval for AI coding tools. Parses your repo into a call graph, scores every function against your query, and injects the top matches — before Claude starts reasoning. No embeddings, no vector DB, no LLM calls in the retrieval path.

The Problem

Every Claude Code session starts blind. Claude explores your entire codebase before answering — reading files, listing directories, running commands. That exploration costs tokens and time.

Without llm-diet:
  Your prompt → Claude explores codebase → finds relevant code → answers
  Cost: $0.19 for a simple bug fix session

With llm-diet:
  Your prompt + injected context → Claude answers faster
  Cost: $0.035 for the same depth of answer

Real test on a 1,240-node project: 5x cheaper per session.

Benchmark

This repo (185 nodes, 46k tokens)

Query Baseline With llm-diet Reduction Time
fix authentication bug 46,661 434 99.1% 187ms
add a new API endpoint 46,661 106 99.8% 203ms
debug memory leak 46,661 428 99.1% 172ms
add logging to the pipeline 46,661 58 99.9% 141ms

46,661 → 275 tokens average. 176ms overhead.

FastAPI repo (946k tokens — repo never seen before)

Query Baseline With llm-diet Reduction Time
fix authentication bug 946,210 87 >99.9% 359ms
add a new API endpoint 946,210 120 >99.9% 359ms
how does the database connection work 946,210 130 >99.9% 391ms
debug memory leak 946,210 436 >99.9% 1062ms
add input validation 946,210 244 >99.9% 375ms
explain the caching logic 946,210 114 >99.9% 313ms
fix error handling 946,210 221 >99.9% 406ms
add logging to the pipeline 946,210 136 >99.9% 328ms

946,210 → 186 tokens injected. 5x cheaper sessions in practice.

How It Works

repo files (.py, .js, .ts, .jsx, .tsx)
   ↓
AST parser  (no LLM — pure tree-sitter)
   ↓
call graph  (.cecl/graph.json)
   ↓
query  →  keyword expansion  →  BFS traversal  →  top 5 functions
   ↓
injected into Claude before reasoning starts

Same query + same graph = same result. Deterministic by design.

MCP Shadow Server (new in 0.1.7)

By default, Claude Code explores your codebase after receiving injected context — reading files directly even when we've already told it what's relevant.

The shadow server fixes this at the transport layer. When context-engine install detects a built graph, it registers a local MCP server in .mcp.json. Claude Code routes all read_file calls through this server, which returns compressed call-graph versions instead of raw files.

Real numbers on a 40-node project (coupon-hunter-poc):

File Original Compressed Reduction
playwright_amazon.py 6,590 chars 872 chars 86%
orchestrator.py 10,492 chars 2,169 chars 79%
connectors/playwright_amazon.py 3,067 chars 631 chars 79%

Overall: 32,856 → 10,044 chars across all indexed files (69% reduction, ~5,700 tokens saved per full codebase read)

Files not in the graph pass through unchanged. Binary files are skipped. Large unindexed files (>50k chars) are truncated to 200 lines with a note to run context-engine index.

diet-run (new in 0.1.8)

diet-run is a CLI wrapper that launches Claude Code in fully enforced Low Bandwidth Mode:

diet-run                    # run in current directory
diet-run /path/to/project   # run in specific directory

What it does:

  • Sets LLM_DIET_STRICT=1 — unindexed files return an error instead of raw content
  • Passes --mcp-config .mcp.json — shadow server is the only file reader
  • Passes --disallowed-tools Read — Claude's built-in Read tool is blocked
  • Requires .cecl/graph.json and .mcp.json to exist before launching

Run context-engine install first to set up the shadow server, then use diet-run instead of claude to open sessions.

Quick Start

pip install llm-diet
context-engine install    # indexes repo + configures your AI tool
# open Claude Code and start coding

Commands

Command Description
context-engine install Index repo and configure Claude Code / Cursor / Windsurf
context-engine index . (Re)build the call graph
context-engine query "fix auth bug" See what would be injected for a query
context-engine apply "add endpoint" Plan → diff → validate → patch (needs ANTHROPIC_API_KEY)
context-engine watch . Auto-reindex on file save

Platform Support

Platform Integration Token reduction
Claude Code UserPromptSubmit hook — dynamic injection on every prompt Full (186 tokens avg)
Cursor Static rules file written to .cursor/rules/ Guides AI; no dynamic injection
Windsurf Static rules file written to .windsurf/rules/ Guides AI; no dynamic injection

Full token reduction verified on Claude Code. Cursor/Windsurf dynamic injection on the roadmap.

Why Not RAG?

llm-diet Embeddings / RAG code-review-graph
Retrieval method AST + call graph Vector similarity AST + SQLite
LLM calls to retrieve 0 1+ 0
Deterministic Yes No Yes
Setup pip install + index Model + DB infra pip install + build
Languages Python, JS, TS, JSX, TSX Any 23 languages
Autonomous apply Yes No No
Works offline Yes No Yes

We do less. What we do, we do surgically.

Contributing

Good first issues:

  • Dynamic injection for Cursor/Windsurf — extend beyond Claude Code's UserPromptSubmit
  • More language parsers — add Go, Rust, Java following the FileParseResult interface in parser.py
  • Better keyword expansion — improve domain-specific term mapping in retrieval.py

Open an issue or send a PR.

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

llm_diet-0.1.8.tar.gz (69.3 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

llm_diet-0.1.8-py3-none-any.whl (76.0 kB view details)

Uploaded Python 3

File details

Details for the file llm_diet-0.1.8.tar.gz.

File metadata

  • Download URL: llm_diet-0.1.8.tar.gz
  • Upload date:
  • Size: 69.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.7

File hashes

Hashes for llm_diet-0.1.8.tar.gz
Algorithm Hash digest
SHA256 4c74cdbc75e001cb14e3ebbc477a458bbfa7496dda3c034498b59047e058ec38
MD5 c6887bc6701c2f296018b8f5c771c7e6
BLAKE2b-256 49ca4b790acbc55b23a3bebd332a936c9e914683b2813746ec6f321f6a71048e

See more details on using hashes here.

File details

Details for the file llm_diet-0.1.8-py3-none-any.whl.

File metadata

  • Download URL: llm_diet-0.1.8-py3-none-any.whl
  • Upload date:
  • Size: 76.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.7

File hashes

Hashes for llm_diet-0.1.8-py3-none-any.whl
Algorithm Hash digest
SHA256 e8822c56e9b43d8b6b7f9e94fe7adc12bf87403addaf57607843939b200b20a7
MD5 14cab11aa69b165efc2ce0c43c3a1e7b
BLAKE2b-256 fa43c759fa55b8d853c29f0166de3045b274b9f49e9ee1860449e56bd6463a89

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page