kinetic-context

Repository-level code context engine — find the right code, fast.

kinetic-context is a self-contained, installable code context engine. Point it at any repository and it builds a multi-layer index (AST chunks + embeddings + a code knowledge graph + BM25) that you can query with natural language or identifiers. It returns code blocks with line ranges — not just file paths — ready to paste into an LLM prompt.

It is designed to be the context layer for coding agents. It ships with a Rich CLI, an MCP server (so Claude Code, Cursor, Continue, and Zed can mount it natively), a TCP JSON mode for any other agent, and a Python library.

Highlights

Multi-language: Python, JavaScript, TypeScript, Go, Rust, Java — via tree-sitter.
Structure-aware chunking (cAST): never splits a function mid-body, uses 35% chunk overlap, and builds hierarchical summaries at file and repository level.
Hybrid retrieval: dense (Mistral Codestral Embed, 1536-dim) + BM25 (code-aware tokenizer that splits camelCase, snake_case, and dotted identifiers) + a Code Knowledge Graph with 8 relationship types.
Reciprocal Rank Fusion with brute-force-optimized weights across 5 channels (dense, BM25, graph, pseudo-relevance feedback (PRF), patch reverse engineering).
Zerank-2 reranker with instruction-following via XML tags, tuned per query intent.
Novel signals invented for this engine:
- Cross-resolution resonance — L2+L3+L4 consensus boost
- Code DNA fingerprinting — structural similarity (param count, complexity, call count)
- Semantic bridges — virtual graph edges between similar functions
- File cohort memory — files that co-occur in correct answers get boosted across queries
- Score distribution shape analysis — adaptive cutoff detects bimodal / uniform / power-law score distributions
- Score gap amplification — sigmoid sharpening of close scores
- Post-rerank filename tie-breaker — the fix for "correct directory, wrong file" at scale
- Adversarial anti-centroid — penalty for chunks near the worst BM25 hits
- Query-to-patch reverse engineering — generate a hypothetical fix, embed it as a 5th RRF channel
Incremental indexing with SHA-256 Merkle root hashing. Only changed files are re-embedded. Subsequent index runs are fast.
Per-repo isolated storage under ~/.kinetic/<slug>_<hash>/. Same-name folders in different parents do not collide. A manifest.json records the source path, root hash, file count, and last-indexed timestamp so kinetic status can answer "reindex needed?" in <50ms without spinning up the engine.
Hookable everywhere: MCP server (stdio JSON-RPC), TCP JSON, Python library, Rich CLI.
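
To make the "code-aware tokenizer" idea concrete, here is a minimal sketch of splitting camelCase, snake_case, and dotted identifiers before BM25 indexing. The function name `code_tokens` and the regular expression are illustrative assumptions, not the engine's actual tokenizer.

```python
import re

# Hypothetical splitter, not kinetic-context's real tokenizer.
# Alternatives, in order: an acronym followed by a capitalized word
# (HTTP|Server), a normal word (get, User), a trailing acronym, digits.
_WORD = re.compile(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|\d+")


def code_tokens(text: str) -> list[str]:
    """Split source text into lowercase sub-word tokens."""
    tokens: list[str] = []
    # Dots, underscores, and punctuation all act as separators.
    for part in re.split(r"[^A-Za-z0-9]+", text):
        if part:
            tokens.extend(m.group(0).lower() for m in _WORD.finditer(part))
    return tokens


print(code_tokens("getUserById"))             # camelCase
print(code_tokens("flask.app.add_url_rule"))  # dotted + snake_case
print(code_tokens("HTTPServer"))              # acronym boundary
```

Splitting this way lets a query such as "url rule" match `add_url_rule` lexically, which a whitespace tokenizer would miss.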

Install

pip install kinetic-context

Requires Python 3.10+. All heavy dependencies (tree-sitter, numpy, networkx, rich) are bundled — no Qdrant/Pinecone/Milvus/Postgres to install.

Set your API keys (a Mistral key for embeddings and a ZeroEntropy key for the Zerank reranker):

export MISTRAL_API_KEY=...
export ZEROENTROPY_API_KEY=...

You can also override the embedder / reranker URLs and model IDs in KCEConfig if you want to point at a different provider.

Quick start

# Index a repo (first run; takes a few minutes for a 1k-file repo)
kinetic index ./my-repo

# Subsequent runs only re-embed files whose SHA-256 hash changed
kinetic index ./my-repo

# Search — returns code blocks with line ranges, not just file paths
kinetic query "authentication middleware" --code

# JSON output (for agent hooks / piping)
kinetic query "how does routing work" --json | jq '.results[0]'

# Check if the index is up to date
kinetic status ./my-repo

# List all indexed repos
kinetic list

Query output example

────────────────────────────────────────────────────────────────────────
 kinetic-context • semantic_question • zerank • 68ms
────────────────────────────────────────────────────────────────────────
 Query: how does routing work in flask
────────────────────────────────────────────────────────────────────────
Found 8 code blocks across 8 files

#1  src/flask/sansio/app.py  L240-262 (23 lines)  • add_url_rule (function)  • python
    def add_url_rule(self, rule, endpoint=None, view_func=None, ...)
  240 │ def add_url_rule(
  241 │     self,
  242 │     rule: str,
  243 │     endpoint: str | None = None,
  ...

Hooking into coding agents

MCP server (Claude Code, Cursor, Continue, Zed, anything MCP-aware)

Start the MCP server manually (it speaks JSON-RPC over stdio):

kinetic mcp

Or add it to your agent's MCP config:

{
  "mcpServers": {
    "kinetic": {
      "command": "kinetic",
      "args": ["mcp"]
    }
  }
}

Four tools are exposed:

Tool	Description
`kinetic_index`	Index (or incrementally update) a repo
`kinetic_query`	Search the indexed codebase, returns code blocks with line ranges
`kinetic_status`	Check whether the index is up to date
`kinetic_list_indexes`	List all indexed repos

TCP JSON server (any agent)

kinetic serve --port 7878

Send a single-line JSON request, get a single-line JSON response:

echo '{"query": "session cookie signing", "top": 5}' | nc localhost 7878
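
The same request can be sent from Python with only the standard library. This is a sketch under two assumptions: the server replies with one newline-terminated JSON document (or closes the connection after replying), and the helper name `kinetic_tcp_query` is made up for illustration. The shape of the response is not documented here, so the function just returns the parsed JSON.

```python
import json
import socket


def kinetic_tcp_query(query: str, top: int = 5, host: str = "localhost",
                      port: int = 7878, timeout: float = 10.0) -> dict:
    """Send one single-line JSON request and parse the single-line reply."""
    payload = json.dumps({"query": query, "top": top}) + "\n"
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload.encode("utf-8"))
        # Assumption: the reply ends with a newline or with the server
        # closing the connection, whichever comes first.
        buf = bytearray()
        while not buf.endswith(b"\n"):
            chunk = sock.recv(4096)
            if not chunk:
                break
            buf.extend(chunk)
    return json.loads(buf.decode("utf-8"))
```

With `kinetic serve --port 7878` running, `kinetic_tcp_query("session cookie signing")` mirrors the `nc` one-liner above.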

Python library

from kce.engine import KCEEngine
from kce.config import KCEConfig
from kce.store.registry import Registry

repo = "/path/to/repo"

# Resolve this repo's index directory and make sure it exists
cfg = KCEConfig()
cfg.index_dir = Registry().resolve(repo)
cfg.ensure_dirs()

# Build the index (later runs only re-embed changed files)
engine = KCEEngine(cfg)
engine.index(repo)

# Query, then print the top 10 chunks with their line ranges
result = engine.query("how does authentication work")
for cid in result.final_chunk_ids[:10]:
    chunk = engine.chunk_index[cid]
    print(f"{chunk.rel_path}:{chunk.start_line}-{chunk.end_line}  {chunk.name}")
    print(chunk.content)

Storage layout

Every indexed repo gets its own directory under ~/.kinetic/:

~/.kinetic/
  flask_f9cbd6f6/                       <- slug = name + short hash of abspath
    manifest.json                       <- source path, root hash, file count, last indexed
    chunks.jsonl                        <- one CodeChunk per line
    embeddings.npy                      <- (N, 1536) float32 matrix
    embeddings_ids.json                 <- row order
    ckg.graphml                         <- Code Knowledge Graph
    incremental_state.json              <- per-file SHA-256 hashes
    change_log.jsonl                    <- append-only change history
    embed_cache/embeddings.db           <- Mistral API cache (SQLite)
    zerank_cache/rerank.db              <- Zerank API cache (SQLite)
    file_summaries.json
    repo_summary.txt
  django_ea20f8f1/
    ...

Why per-repo isolation?

Two repos with the same folder name (e.g. ~/work/api and ~/side/api) get different slugs because the slug includes a short hash of the absolute path. No collisions.
Caches live next to the index, so deleting a repo's index (kinetic forget) also frees its cache.
The manifest.json records a Merkle root hash over all file content hashes. Recomputing this on startup is <50ms even for a 1k-file repo, so kinetic status is instant.
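
A slug of the form "folder name + short hash of the absolute path" could be derived roughly as follows. This is only a sketch: the helper name `repo_slug`, the choice of SHA-256, and the 8-character truncation are assumptions inferred from the example directory names above, not the package's documented scheme.

```python
import hashlib
import re
from pathlib import Path


def repo_slug(repo_path: str, hash_len: int = 8) -> str:
    """Return '<folder name>_<short hash of absolute path>'."""
    abspath = str(Path(repo_path).expanduser().resolve())
    # Assumption: SHA-256 truncated to 8 hex chars; the real hash
    # function and length may differ.
    digest = hashlib.sha256(abspath.encode("utf-8")).hexdigest()[:hash_len]
    # Keep the folder name filesystem-friendly.
    name = re.sub(r"[^A-Za-z0-9_-]+", "_", Path(abspath).name) or "repo"
    return f"{name}_{digest}"


# Same folder name, different parents: the slugs differ.
print(repo_slug("~/work/api"))
print(repo_slug("~/side/api"))
```

Hashing the absolute path rather than the folder name is what keeps the two `api` directories apart.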

Efficient incremental updates

On kinetic index <repo>, we first compute the current Merkle root from disk.
We compare to the stored root in manifest.json. If they match, the index is up to date — we're done in <100ms.
If they differ, we walk the per-file SHA-256 hashes in incremental_state.json and identify exactly which files were added, modified, or removed.
Only those files are re-parsed, re-summarized, and re-embedded. Existing chunks for unchanged files are reused.
The Mistral embed cache (SQLite, keyed by SHA-256 of the embed text) means even a chunk whose text didn't change but whose chunk_id was regenerated will not trigger an API call.

For a 1k-file repo where 5 files changed, a re-index takes seconds, not minutes.
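
The steps above can be sketched with the standard library. The pairwise tree construction, the leaf format, and the function names are illustrative assumptions; the README only states that a SHA-256 Merkle root is computed over per-file content hashes.

```python
import hashlib


def file_hash(content: bytes) -> str:
    """SHA-256 of a file's bytes, as stored per file."""
    return hashlib.sha256(content).hexdigest()


def merkle_root(file_hashes: dict[str, str]) -> str:
    """Fold sorted (path, hash) leaves pairwise into one root hash."""
    level = [hashlib.sha256(f"{path}:{digest}".encode("utf-8")).digest()
             for path, digest in sorted(file_hashes.items())]
    if not level:
        return hashlib.sha256(b"").hexdigest()
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0].hex()


def diff_files(old: dict[str, str], new: dict[str, str]):
    """Return (added, modified, removed) paths between two hash maps."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    modified = sorted(p for p in new.keys() & old.keys() if new[p] != old[p])
    return added, modified, removed


old = {"a.py": file_hash(b"x = 1"), "b.py": file_hash(b"y = 2")}
new = {"a.py": file_hash(b"x = 1"), "b.py": file_hash(b"y = 3"),
       "c.py": file_hash(b"z = 4")}

if merkle_root(old) != merkle_root(new):   # cheap "anything changed?" check
    print(diff_files(old, new))            # only these files need re-embedding
```

Comparing roots first means the per-file walk only happens when something actually changed.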

Architecture

The pipeline has four layers:

Ingestion — tree-sitter parses each source file into an AST. The cAST chunker walks the AST and produces chunks that respect syntactic boundaries (a function is never split mid-body). Each chunk is enriched with its signature, docstring, decorators, and scope. A Code Knowledge Graph (NetworkX) is built with 8 relationship types: CALLS, INHERITS, IMPLEMENTS, IMPORTS, CONTAINS, USES_TYPE, OVERRIDES, DEPENDS_ON.
Storage — chunks go to JSONL, embeddings go to a single NumPy matrix on disk (L2-normalized in memory so cosine similarity reduces to a dot product), and the graph goes to GraphML. Each repo gets its own directory under ~/.kinetic/. SHA-256 Merkle root hashing drives incremental updates.
Retrieval & Ranking — the query coordinator picks one of 5 query types (identifier lookup, semantic question, code completion, bug diagnosis, architecture query). For each type, it applies intent-aware boosts (source vs test vs config files), runs multi-query BM25 + dense + graph retrieval, fuses the results with weighted Reciprocal Rank Fusion, applies novel signals (resonance, DNA, semantic bridges, cohort memory), then reranks the top candidates with Zerank-2.
Output — the final ranked chunks are returned as code blocks with line ranges, signature, and docstring. They can be rendered by the Rich CLI, serialized to JSON, or shipped over MCP / TCP to any coding agent.
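
The fusion step in the Retrieval & Ranking layer can be illustrated with a generic weighted Reciprocal Rank Fusion. This is a sketch, not the engine's code: the smoothing constant `k = 60` is the value commonly used for RRF and is an assumption here, and the chunk IDs are invented. The three weights are the `rrf_dense`, `rrf_bm25`, and `rrf_graph` defaults from the Configuration section.

```python
from collections import defaultdict


def weighted_rrf(rankings: dict[str, list[str]],
                 weights: dict[str, float],
                 k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked lists: score(d) = sum_c w_c / (k + rank_c(d))."""
    scores: dict[str, float] = defaultdict(float)
    for channel, ranked_ids in rankings.items():
        weight = weights.get(channel, 0.0)
        for rank, chunk_id in enumerate(ranked_ids, start=1):
            scores[chunk_id] += weight / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


rankings = {
    "dense": ["c1", "c2", "c3"],   # made-up chunk IDs
    "bm25": ["c3", "c1"],
    "graph": ["c2"],
}
weights = {"dense": 0.75, "bm25": 0.05, "graph": 0.03}

fused = weighted_rrf(rankings, weights)
print([chunk_id for chunk_id, _ in fused])
```

Because RRF works on ranks rather than raw scores, channels with incomparable score scales (cosine similarity, BM25, graph proximity) can be combined without normalization.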

Benchmarks

We benchmark on two real-world repos with hand-curated query sets. We report Context F1@10, Recall@10, Precision@10, and end-to-end latency. We deliberately do not compare to other context engines — those numbers are easy to get wrong and we'd rather show our own results honestly than risk an unfair comparison.
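
For readers who want to reproduce metrics of this kind, here is a minimal sketch of Precision@k, Recall@k, and F1@k for one query, plus a per-query (macro) average. The unit of relevance (file or chunk) and the exact precision denominator used by the benchmark scripts are not stated here, so both are assumptions.

```python
def prf_at_k(retrieved: list[str], relevant: set[str],
             k: int = 10) -> tuple[float, float, float]:
    """Precision, recall, and F1 over the top-k retrieved items."""
    top = retrieved[:k]
    hits = len(set(top) & relevant)
    # Assumption: precision is measured over the items actually returned.
    precision = hits / len(top) if top else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def macro_average(rows: list[tuple[float, float, float]]) -> tuple[float, ...]:
    """Average each metric across queries."""
    return tuple(sum(row[i] for row in rows) / len(rows) for i in range(3))


# One query: 4 results returned, 2 of the 3 relevant items found.
print(prf_at_k(["a", "b", "c", "d"], {"a", "c", "z"}))
```

Averaging F1 per query is not the same as taking the harmonic mean of the averaged precision and recall. The F1 values in the Numbers table are lower than the harmonic mean of the reported precision and recall, which would be consistent with per-query averaging.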

Aggregate metrics

Per-query F1 (sorted, so the worst queries are at the top)

[Figure: per-query F1]

Query outcome distribution (perfect / partial / failed)

[Figure: perfect vs failed]

Latency distribution

Quality vs repository scale

[Figure: scale vs quality]

Numbers

Corpus	Files	Chunks	Graph (nodes/edges)	Queries	F1@10	Recall@10	Precision@10	Avg latency
Flask	83	1,382	1,594 / 7,906	30	0.659	0.833	0.599	68 ms
Django (core)	308	7,207	6,574 / 46,408	30	0.647	0.867	0.559	433 ms

The Django benchmark uses the core Django packages (django/db/, django/http/, django/urls/, django/template/, django/forms/, django/core/, django/contrib/auth/, django/contrib/sessions/) — the parts of Django that real coding agents actually search. The full Django repo includes 3,000+ files of migrations, tests, and docs that bloat the index without improving retrieval quality on real-world queries.

Run the benchmarks yourself:

git clone https://github.com/pallets/flask /tmp/flask
git clone https://github.com/django/django /tmp/django
kinetic index /tmp/flask
kinetic index /tmp/django
python scripts/run_bench.py flask
python scripts/run_bench.py django
python scripts/gen_charts.py

Configuration

All knobs are in KCEConfig. The defaults are tuned for the Mistral Codestral Embed + Zerank-2 combination. You can override:

Setting	Default	What it controls
`mistral_embed_model`	`codestral-embed`	Embedding model
`zerank_model`	`zerank-2`	Reranker model
`chunk_max_tokens`	512	Max chunk size
`chunk_overlap_pct`	0.35	Chunk overlap
`bm25_k1`, `bm25_b`	1.5, 0.75	BM25 params
`rrf_dense`, `rrf_bm25`, `rrf_graph`	0.75, 0.05, 0.03	RRF channel weights
`retrieval_top_k`	50	Candidates per channel
`rerank_top_n`	15	Candidates sent to reranker
`final_top_n`	10	Final results returned
`filename_keyword_boost`	2.0	Post-rerank filename tie-breaker
`tier_c_penalty`	0.4	Penalty for abstract base classes
`context_budget_tokens`	4096	Context assembly budget

API keys are read from MISTRAL_API_KEY and ZEROENTROPY_API_KEY env vars.

Why not just use ripgrep / the IDE's built-in search?

Lexical search finds the keyword. It doesn't find the concept. Asking "how does routing work in flask" with grep gives you every line containing "route", most of which are irrelevant. kinetic-context returns the 8 functions that actually implement routing, with their full bodies and signatures, in 68 ms.

Why not just stuff the whole repo into the LLM context window?

A 1k-file repo is ~500k tokens just for the source. That's expensive and slow, and the LLM's attention degrades badly past ~100k tokens. kinetic-context returns the 10 chunks that matter; the result fits in any context window and costs 100x less to query.

Why a Code Knowledge Graph?

Because "what calls what" matters. When you ask "where is request used?", the graph says "the request proxy is defined in flask/globals.py, used in 47 places, and its setter is in flask/app.py". Pure lexical search sees the word request 500 times. The graph sees the relationship.
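
A toy version of a typed-edge graph shows why relationship lookups differ from text search. The engine itself uses NetworkX; the class below uses only the standard library so that it runs anywhere, and every node name in it is invented for illustration.

```python
from collections import defaultdict


class TinyCodeGraph:
    """Directed graph with typed edges such as CALLS or IMPORTS."""

    def __init__(self) -> None:
        self._incoming: dict[str, list[tuple[str, str]]] = defaultdict(list)

    def add_edge(self, src: str, relation: str, dst: str) -> None:
        self._incoming[dst].append((relation, src))

    def incoming(self, node: str, relation: str) -> list[str]:
        """All sources that point at `node` via `relation`."""
        return sorted(src for rel, src in self._incoming.get(node, [])
                      if rel == relation)


graph = TinyCodeGraph()
# Hypothetical symbols, not taken from any real repository.
graph.add_edge("views.login", "CALLS", "auth.check_password")
graph.add_edge("api.issue_token", "CALLS", "auth.check_password")
graph.add_edge("views.login", "IMPORTS", "auth")

# "Who calls check_password?" is one lookup, independent of how often
# the string appears in comments, docs, or unrelated variables.
print(graph.incoming("auth.check_password", "CALLS"))
```

The answer comes from edges recorded at index time, so it is unaffected by how many times the identifier's text occurs elsewhere in the repo.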

License

MIT. See LICENSE.

Contributing

Issues and PRs welcome at github.com/zai-org/kinetic-context.

Release history

0.3.0 (yanked), Jun 28, 2026
0.2.2 (yanked), Jun 28, 2026
0.2.1 (yanked), Jun 28, 2026 (this version)

Download files

Source Distribution

kinetic_context-0.2.1.tar.gz (71.4 kB), uploaded Jun 28, 2026, Source

Built Distribution

kinetic_context-0.2.1-py3-none-any.whl (80.4 kB), uploaded Jun 28, 2026, Python 3

File details

Details for the file kinetic_context-0.2.1.tar.gz.

File metadata

Download URL: kinetic_context-0.2.1.tar.gz
Upload date: Jun 28, 2026
Size: 71.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.13

File hashes

Hashes for kinetic_context-0.2.1.tar.gz
Algorithm	Hash digest
SHA256	`1b4d0aa971485c37182cc85d7c267e9b4e7124cbad6fdf084fe417ad744b7b55`
MD5	`a02fa19b8d5b2afa862f44928bf4058a`
BLAKE2b-256	`43bdeb250ae20460b7ce0ce03e7219a2e06a3506e4519671fee3e95953c80056`

File details

Details for the file kinetic_context-0.2.1-py3-none-any.whl.

File metadata

Download URL: kinetic_context-0.2.1-py3-none-any.whl
Upload date: Jun 28, 2026
Size: 80.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.13

File hashes

Hashes for kinetic_context-0.2.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`5a4d8d1f4222f9939dd671e59866d8fbef28799e7d4ff96724e77e9d1931ebe3`
MD5	`ba7ee9bc2ea7f2d41d6bb6dd7a3d0fb8`
BLAKE2b-256	`e9544e696a99903948573e1f6cc7f3a4f6c11cef0b296e305d13793755fce304`

