Repository-level code context engine — find the right code, fast.
kinetic-context
kinetic-context is a self-contained, installable code context engine. Point it at any repository and it builds a multi-layer index (AST chunks + embeddings + a code knowledge graph + BM25) that you can query with natural language or identifiers. It returns code blocks with line ranges — not just file paths — ready to paste into an LLM prompt.
It is designed to be the context layer for coding agents. It ships with a Rich CLI, an MCP server (so Claude Code, Cursor, Continue, and Zed can mount it natively), a TCP JSON mode for any other agent, and a Python library.
Highlights
- Multi-language: Python, JavaScript, TypeScript, Go, Rust, Java — via tree-sitter.
- Structure-aware chunking (cAST): never splits a function mid-body. 35% overlap. Hierarchical summaries at file and repository level.
- Hybrid retrieval: dense (Mistral Codestral Embed, 1536-dim) + BM25 (code-aware tokenizer that splits camelCase, snake_case, and dotted identifiers) + a Code Knowledge Graph with 8 relationship types.
- Reciprocal Rank Fusion with brute-force-optimized weights across 5 channels (dense, BM25, graph, PRF, patch-reverse-engineering).
- Zerank-2 reranker with instruction-following via XML tags, tuned per query intent.
- Novel signals invented for this engine:
  - Cross-resolution resonance — L2+L3+L4 consensus boost
  - Code DNA fingerprinting — structural similarity (param count, complexity, call count)
  - Semantic bridges — virtual graph edges between similar functions
  - File cohort memory — files that co-occur in correct answers get boosted across queries
  - Score distribution shape analysis — adaptive cutoff detects bimodal / uniform / power-law score distributions
  - Score gap amplification — sigmoid sharpening of close scores
  - Post-rerank filename tie-breaker — the fix for "correct directory, wrong file" at scale
  - Adversarial anti-centroid — penalty for chunks near the worst BM25 hits
  - Query-to-patch reverse engineering — generate a hypothetical fix, embed it as a 5th RRF channel
- Incremental indexing with SHA-256 Merkle root hashing. Only changed files are re-embedded. Subsequent index runs are fast.
- Per-repo isolated storage under ~/.kinetic/<slug>_<hash>/. Same-name folders in different parents do not collide. A manifest.json records the source path, root hash, file count, and last-indexed timestamp so kinetic status can answer "reindex needed?" in <50ms without spinning up the engine.
- Hookable everywhere: MCP server (stdio JSON-RPC), TCP JSON, Python library, Rich CLI.
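To make the "code-aware tokenizer" in the hybrid retrieval bullet concrete, here is a minimal sketch of how identifiers might be split for BM25. The function name `code_tokens` and both regular expressions are illustrative assumptions, not the engine's actual tokenizer; the sketch keeps the whole identifier as a token and adds its dotted, snake_case, and camelCase parts.

```python
import re

# Hypothetical helper for illustration; not the engine's real tokenizer.
# Matches identifiers, optionally chained with dots (e.g. request.getUserName).
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(?:\.[A-Za-z_][A-Za-z0-9_]*)*")
# Zero-width camelCase boundaries: "getUser" -> get|User, "HTTPServer" -> HTTP|Server.
_CAMEL_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])")


def code_tokens(text: str) -> list[str]:
    """Return lowercase tokens: each identifier plus its sub-word pieces."""
    tokens: list[str] = []
    for match in _IDENTIFIER.finditer(text):
        whole = match.group(0)
        pieces: list[str] = []
        for dotted_part in whole.split("."):
            for snake_part in dotted_part.split("_"):
                # Insert spaces at camelCase boundaries, then split on them.
                pieces.extend(_CAMEL_BOUNDARY.sub(" ", snake_part).lower().split())
        tokens.append(whole.lower())
        # Only add pieces when splitting actually produced something new.
        if pieces != [whole.lower()]:
            tokens.extend(pieces)
    return tokens


print(code_tokens("request.getUserName"))
print(code_tokens("add_url_rule"))
```

Keeping the unsplit identifier alongside its parts lets an exact lookup such as add_url_rule still match as one term, while a natural-language query containing "url" or "rule" can hit the same chunk.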
Install
pip install kinetic-context
Requires Python 3.10+. All heavy dependencies (tree-sitter, numpy, networkx, rich) are bundled — no Qdrant/Pinecone/Milvus/Postgres to install.
Set your API keys (a Mistral key for embeddings + a Zerank key for reranking):
export MISTRAL_API_KEY=...
export ZEROENTROPY_API_KEY=...
You can also override the embedder / reranker URLs and model IDs in KCEConfig if you want to point at a different provider.
Quick start
# Index a repo (first run; takes a few minutes for a 1k-file repo)
kinetic index ./my-repo
# Subsequent runs only re-embed files whose SHA-256 hash changed
kinetic index ./my-repo
# Search — returns code blocks with line ranges, not just file paths
kinetic query "authentication middleware" --code
# JSON output (for agent hooks / piping)
kinetic query "how does routing work" --json | jq '.results[0]'
# Check if the index is up to date
kinetic status ./my-repo
# List all indexed repos
kinetic list
Query output example
────────────────────────────────────────────────────────────────────────
kinetic-context • semantic_question • zerank • 68ms
────────────────────────────────────────────────────────────────────────
Query: how does routing work in flask
────────────────────────────────────────────────────────────────────────
Found 8 code blocks across 8 files
#1 src/flask/sansio/app.py L240-262 (23 lines) • add_url_rule (function) • python
def add_url_rule(self, rule, endpoint=None, view_func=None, ...)
240 │ def add_url_rule(
241 │ self,
242 │ rule: str,
243 │ endpoint: str | None = None,
...
Hooking into coding agents
MCP server (Claude Code, Cursor, Continue, Zed, anything MCP-aware)
Start the MCP server in the background:
kinetic mcp
Or add it to your agent's MCP config:
{
"mcpServers": {
"kinetic": {
"command": "kinetic",
"args": ["mcp"]
}
}
}
Four tools are exposed:
| Tool | Description |
|---|---|
| kinetic_index | Index (or incrementally update) a repo |
| kinetic_query | Search the indexed codebase, returns code blocks with line ranges |
| kinetic_status | Check whether the index is up to date |
| kinetic_list_indexes | List all indexed repos |
TCP JSON server (any agent)
kinetic serve --port 7878
Send a single-line JSON request, get a single-line JSON response:
echo '{"query": "session cookie signing", "top": 5}' | nc localhost 7878
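The same request can be sent from Python with the standard library. This is a sketch under two assumptions that the text above does not confirm: that the server replies as soon as it receives a newline-terminated request (rather than waiting for the client to close the connection), and that the reply is a single JSON object. The function name `kinetic_tcp_query` is hypothetical; the host, port, and request fields are taken from the commands above.

```python
import json
import socket


def kinetic_tcp_query(query: str, top: int = 5, host: str = "localhost",
                      port: int = 7878, timeout: float = 10.0) -> dict:
    """Send one single-line JSON request and parse the single-line JSON reply."""
    request = json.dumps({"query": query, "top": top}) + "\n"
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request.encode("utf-8"))
        # Read exactly one line back; the protocol is one line in, one line out.
        with sock.makefile("r", encoding="utf-8") as reader:
            line = reader.readline()
    if not line:
        raise ConnectionError("server closed the connection without replying")
    return json.loads(line)


if __name__ == "__main__":
    # Requires `kinetic serve --port 7878` to be running.
    print(kinetic_tcp_query("session cookie signing", top=5))
```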
Python library
from kce.engine import KCEEngine
from kce.config import KCEConfig
from kce.store.registry import Registry

# Point the config at this repo's index directory and make sure it exists.
cfg = KCEConfig()
cfg.index_dir = Registry().resolve("/path/to/repo")
cfg.ensure_dirs()

# Build (or incrementally update) the index, then run a query.
engine = KCEEngine(cfg)
engine.index("/path/to/repo")
result = engine.query("how does authentication work")

# Print the location and source of the top 10 chunks.
for cid in result.final_chunk_ids[:10]:
    chunk = engine.chunk_index[cid]
    print(f"{chunk.rel_path}:{chunk.start_line}-{chunk.end_line} {chunk.name}")
    print(chunk.content)
Storage layout
Every indexed repo gets its own directory under ~/.kinetic/:
~/.kinetic/
  flask_f9cbd6f6/                <- slug = name + short hash of abspath
    manifest.json                <- source path, root hash, file count, last indexed
    chunks.jsonl                 <- one CodeChunk per line
    embeddings.npy               <- (N, 1536) float32 matrix
    embeddings_ids.json          <- row order
    ckg.graphml                  <- Code Knowledge Graph
    incremental_state.json       <- per-file SHA-256 hashes
    change_log.jsonl             <- append-only change history
    embed_cache/embeddings.db    <- Mistral API cache (SQLite)
    zerank_cache/rerank.db       <- Zerank API cache (SQLite)
    file_summaries.json
    repo_summary.txt
  django_ea20f8f1/
    ...
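Because the embeddings are a plain numpy matrix plus a row-order file, they can be inspected without the engine. The sketch below loads both files, L2-normalizes the rows, and ranks them by cosine similarity against a query vector. It assumes embeddings_ids.json is a flat JSON list of chunk IDs in row order; the function names are hypothetical, and the query vector would have to come from the same embedding model as the index.

```python
import json
from pathlib import Path

import numpy as np


def l2_normalize(matrix: np.ndarray) -> np.ndarray:
    """Scale each row to unit length so a dot product equals cosine similarity."""
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    return matrix / np.clip(norms, 1e-12, None)


def load_embeddings(index_dir):
    """Load the (N, dim) matrix and its row IDs (assumed: a JSON list)."""
    index_dir = Path(index_dir)
    matrix = np.load(index_dir / "embeddings.npy")
    row_ids = json.loads((index_dir / "embeddings_ids.json").read_text())
    return l2_normalize(matrix), row_ids


def top_k_cosine(normalized: np.ndarray, row_ids, query_vec: np.ndarray, k: int = 10):
    """Return the k (chunk_id, cosine) pairs most similar to query_vec."""
    query = query_vec / max(float(np.linalg.norm(query_vec)), 1e-12)
    scores = normalized @ query
    order = np.argsort(-scores)[:k]
    return [(row_ids[i], float(scores[i])) for i in order]
```

Brute-force matrix multiplication like this stays cheap at the index sizes reported here (a few thousand chunks), which fits the claim that no external vector database is needed.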
Why per-repo isolation?
- Two repos with the same folder name (e.g. ~/work/api and ~/side/api) get different slugs because the slug includes a short hash of the absolute path. No collisions.
- Caches live next to the index, so deleting a repo's index (kinetic forget) also frees its cache.
- The manifest.json records a Merkle root hash over all file content hashes. Recomputing this on startup is <50ms even for a 1k-file repo, so kinetic status is instant.
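The collision argument can be illustrated with a short sketch of slug generation. The exact hash function and truncation the engine uses are not stated here; this example assumes SHA-256 of the absolute path truncated to 8 hex characters (matching the length seen in flask_f9cbd6f6), and `repo_slug` is a hypothetical name.

```python
import hashlib
from pathlib import Path


def repo_slug(repo_path: str) -> str:
    """Build '<folder name>_<short hash of absolute path>' (assumed scheme)."""
    abs_path = Path(repo_path).expanduser().resolve()
    # Hash the full absolute path so same-named folders in different parents differ.
    short_hash = hashlib.sha256(str(abs_path).encode("utf-8")).hexdigest()[:8]
    return f"{abs_path.name}_{short_hash}"


print(repo_slug("/tmp/work/api"))
print(repo_slug("/tmp/side/api"))
```

Both slugs start with api_ but end in different hashes, so the two indexes land in separate directories.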
Efficient incremental updates
- On kinetic index <repo>, we first compute the current Merkle root from disk.
- We compare to the stored root in manifest.json. If they match, the index is up to date — we're done in <100ms.
- If they differ, we walk the per-file SHA-256 hashes in incremental_state.json and identify exactly which files were added, modified, or removed.
- Only those files are re-parsed, re-summarized, and re-embedded. Existing chunks for unchanged files are reused.
- The Mistral embed cache (SQLite, keyed by SHA-256 of the embed text) means even a chunk whose text didn't change but whose chunk_id was regenerated will not trigger an API call.
For a 1k-file repo where 5 files changed, a re-index takes seconds, not minutes.
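The root comparison and per-file diff described above can be sketched as follows. How the engine actually combines file hashes into a root is not specified, so this uses one common construction as an assumption: a binary Merkle tree over leaves sorted by path. All function names are hypothetical.

```python
import hashlib


def file_hash(content: bytes) -> str:
    """SHA-256 of a file's content, as a hex string."""
    return hashlib.sha256(content).hexdigest()


def merkle_root(file_hashes: dict[str, str]) -> str:
    """Fold {path: content hash} into one root hash (assumed binary-tree scheme)."""
    # Sorting by path makes the root independent of directory walk order.
    level = [hashlib.sha256(f"{path}\0{digest}".encode("utf-8")).digest()
             for path, digest in sorted(file_hashes.items())]
    if not level:
        return hashlib.sha256(b"").hexdigest()
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0].hex()


def diff_hashes(old: dict[str, str], new: dict[str, str]):
    """Return (added, modified, removed) paths between two hash snapshots."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    modified = sorted(p for p in new.keys() & old.keys() if new[p] != old[p])
    return added, modified, removed


old = {"app.py": file_hash(b"v1"), "cli.py": file_hash(b"v1")}
new = {"app.py": file_hash(b"v2"), "cli.py": file_hash(b"v1"), "views.py": file_hash(b"v1")}
if merkle_root(old) != merkle_root(new):
    print(diff_hashes(old, new))
```

Only when the roots differ does the per-file walk run, which is what keeps the up-to-date check cheap.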
Architecture
The pipeline has four layers:
- Ingestion — tree-sitter parses each source file into an AST. The cAST chunker walks the AST and produces chunks that respect syntactic boundaries (a function is never split mid-body). Each chunk is enriched with its signature, docstring, decorators, and scope. A Code Knowledge Graph (NetworkX) is built with 8 relationship types: CALLS, INHERITS, IMPLEMENTS, IMPORTS, CONTAINS, USES_TYPE, OVERRIDES, DEPENDS_ON.
- Storage — chunks go to JSONL, embeddings go to a single numpy matrix on disk (+ L2-normalized in memory for fast cosine), the graph goes to GraphML. Each repo gets its own directory under ~/.kinetic/. SHA-256 Merkle root hashing drives incremental updates.
- Retrieval & Ranking — the query coordinator picks one of 5 query types (identifier lookup, semantic question, code completion, bug diagnosis, architecture query). For each type, it applies intent-aware boosts (source vs test vs config files), runs multi-query BM25 + dense + graph retrieval, fuses the results with weighted Reciprocal Rank Fusion, applies novel signals (resonance, DNA, semantic bridges, cohort memory), then reranks the top candidates with Zerank-2.
- Output — the final ranked chunks are returned as code blocks with line ranges, signature, and docstring. They can be rendered by the Rich CLI, serialized to JSON, or shipped over MCP / TCP to any coding agent.
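The fusion step in the retrieval layer can be sketched with the textbook weighted Reciprocal Rank Fusion formula, where each channel contributes weight / (k + rank) to a chunk's score. The smoothing constant k = 60 is the conventional default and an assumption here, as is the function name; the weights in the usage example are the documented defaults for the dense, BM25, and graph channels only.

```python
from collections import defaultdict


def weighted_rrf(rankings: dict[str, list[str]], weights: dict[str, float],
                 k: int = 60) -> list[tuple[str, float]]:
    """Fuse per-channel rankings: score(id) = sum over channels of w / (k + rank)."""
    scores: dict[str, float] = defaultdict(float)
    for channel, ranked_ids in rankings.items():
        weight = weights.get(channel, 0.0)
        for rank, chunk_id in enumerate(ranked_ids, start=1):
            scores[chunk_id] += weight / (k + rank)
    # Highest fused score first; ties broken by chunk ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


rankings = {
    "dense": ["a", "b", "c"],
    "bm25": ["c", "a"],
    "graph": ["b"],
}
weights = {"dense": 0.75, "bm25": 0.05, "graph": 0.03}
for chunk_id, score in weighted_rrf(rankings, weights):
    print(chunk_id, round(score, 5))
```

Because RRF works on ranks rather than raw scores, channels with incomparable score scales (cosine similarity, BM25, graph distance) can be combined without normalization.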
Benchmarks
We benchmark on two real-world repos with hand-curated query sets. We report Context F1@10, Recall@10, Precision@10, and end-to-end latency. We deliberately do not compare to other context engines — those numbers are easy to get wrong and we'd rather show our own results honestly than risk an unfair comparison.
[Charts: aggregate metrics; per-query F1 (sorted, so the worst queries are at the top); query outcome distribution (perfect / partial / failed); latency distribution; quality vs repository scale]
Numbers
| Corpus | Files | Chunks | Graph (nodes/edges) | Queries | F1@10 | Recall@10 | Precision@10 | Avg latency |
|---|---|---|---|---|---|---|---|---|
| Flask | 83 | 1,382 | 1,594 / 7,906 | 30 | 0.659 | 0.833 | 0.599 | 68 ms |
| Django (core) | 308 | 7,207 | 6,574 / 46,408 | 30 | 0.647 | 0.867 | 0.559 | 433 ms |
The Django benchmark uses the core Django packages (django/db/, django/http/, django/urls/, django/template/, django/forms/, django/core/, django/contrib/auth/, django/contrib/sessions/) — the parts of Django that real coding agents actually search. The full Django repo includes 3,000+ files of migrations, tests, and docs that bloat the index without improving retrieval quality on real-world queries.
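For readers reproducing the table, this is a minimal sketch of how Precision@10, Recall@10, and F1@10 are conventionally computed per query and then averaged. It is not the project's benchmark script: the function names are hypothetical, and whether matching is done at file or chunk level is not stated. The reported F1 values are lower than the harmonic mean of the reported precision and recall (for Flask, about 0.697 versus 0.659), which is consistent with F1 being averaged per query as done below.

```python
def precision_recall_f1_at_k(retrieved: list[str], relevant: set[str], k: int = 10):
    """Score one query: retrieved is ranked best-first, relevant is the gold set."""
    top = retrieved[:k]
    hits = len(set(top) & set(relevant))
    precision = hits / len(top) if top else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def macro_average(per_query: list[tuple[float, float, float]]):
    """Average (precision, recall, f1) across queries, each query weighted equally."""
    n = len(per_query)
    return tuple(sum(metrics[i] for metrics in per_query) / n for i in range(3))


results = [
    precision_recall_f1_at_k(["a.py", "b.py", "c.py", "d.py"], {"a.py", "c.py"}),
    precision_recall_f1_at_k(["x.py", "y.py"], {"z.py"}),
]
print(macro_average(results))
```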
Run the benchmarks yourself:
git clone https://github.com/pallets/flask /tmp/flask
git clone https://github.com/django/django /tmp/django
kinetic index /tmp/flask
kinetic index /tmp/django
python scripts/run_bench.py flask
python scripts/run_bench.py django
python scripts/gen_charts.py
Configuration
All knobs are in KCEConfig. The defaults are tuned for the Mistral Codestral Embed + Zerank-2 combination. You can override:
| Setting | Default | What it controls |
|---|---|---|
| mistral_embed_model | codestral-embed | Embedding model |
| zerank_model | zerank-2 | Reranker model |
| chunk_max_tokens | 512 | Max chunk size |
| chunk_overlap_pct | 0.35 | Chunk overlap |
| bm25_k1, bm25_b | 1.5, 0.75 | BM25 params |
| rrf_dense, rrf_bm25, rrf_graph | 0.75, 0.05, 0.03 | RRF channel weights |
| retrieval_top_k | 50 | Candidates per channel |
| rerank_top_n | 15 | Candidates sent to reranker |
| final_top_n | 10 | Final results returned |
| filename_keyword_boost | 2.0 | Post-rerank filename tie-breaker |
| tier_c_penalty | 0.4 | Penalty for abstract base classes |
| context_budget_tokens | 4096 | Context assembly budget |
API keys are read from MISTRAL_API_KEY and ZEROENTROPY_API_KEY env vars.
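To show what the bm25_k1 and bm25_b knobs do, here is a small sketch of standard Okapi BM25 scoring over pre-tokenized documents. It is not the engine's implementation: the function name is hypothetical, and the IDF uses the non-negative log(1 + ...) variant, which is an assumption. k1 controls how quickly repeated terms stop adding score, and b controls how strongly long chunks are penalized (b = 0 disables length normalization).

```python
import math
from collections import Counter


def bm25_scores(query_terms: list[str], docs: list[list[str]],
                k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score each tokenized document against the query with Okapi BM25."""
    if not docs:
        return []
    n_docs = len(docs)
    avg_len = (sum(len(doc) for doc in docs) / n_docs) or 1.0
    # Number of documents containing each term at least once.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        term_freq = Counter(doc)
        length_norm = 1 - b + b * len(doc) / avg_len
        score = 0.0
        for term in query_terms:
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
        scores.append(score)
    return scores


docs = [
    ["add", "url", "rule"],
    ["url", "for"],
    ["render", "template", "string", "helper"],
]
print(bm25_scores(["url", "rule"], docs))
```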
Why not just use ripgrep / the IDE's built-in search?
Lexical search finds the keyword. It doesn't find the concept. Asking "how does routing work in flask" with grep gives you every line containing "route" — most of which are irrelevant. kinetic-context returns the 8 functions that actually implement routing, with their full bodies and signatures, in 68ms.
Why not just stuff the whole repo into the LLM context window?
A 1k-file repo is ~500k tokens just for the source. That's expensive, slow, and the LLM's attention degrades badly past ~100k tokens. kinetic-context returns the 10 chunks that matter, fits in any context window, and costs 100x less to query.
Why a Code Knowledge Graph?
Because "what calls what" matters. When you ask "where is request used?", the graph says "the request proxy is defined in flask/globals.py, used in 47 places, and its setter is in flask/app.py". Pure lexical search sees the word request 500 times. The graph sees the relationship.
License
MIT. See LICENSE.
Contributing
Issues and PRs welcome at github.com/zai-org/kinetic-context.