kinetic-context

Repository-level code context engine — find the right code, fast.

kinetic-context is a self-contained, installable code context engine. Point it at any repository and it builds a multi-layer index (AST chunks + embeddings + a code knowledge graph + BM25) that you can query with natural language or identifiers. It returns code blocks with line ranges — not just file paths — ready to paste into an LLM prompt.

It is designed to be the context layer for coding agents. It ships with a Rich CLI, an MCP server (so Claude Code, Cursor, Continue, and Zed can mount it natively), a TCP JSON mode for any other agent, and a Python library.


Highlights

  • Multi-language: Python, JavaScript, TypeScript, Go, Rust, Java — via tree-sitter.
  • Structure-aware chunking (cAST): never splits a function mid-body. 35% overlap. Hierarchical summaries at file and repository level.
  • Hybrid retrieval: dense (Mistral Codestral Embed, 1536-dim) + BM25 (code-aware tokenizer that splits camelCase, snake_case, and dotted identifiers) + a Code Knowledge Graph with 8 relationship types.
  • Reciprocal Rank Fusion with brute-force-optimized weights across 5 channels (dense, BM25, graph, PRF, patch-reverse-engineering).
  • Zerank-2 reranker with instruction-following via XML tags, tuned per query intent.
  • Novel signals invented for this engine:
    • Cross-resolution resonance — L2+L3+L4 consensus boost
    • Code DNA fingerprinting — structural similarity (param count, complexity, call count)
    • Semantic bridges — virtual graph edges between similar functions
    • File cohort memory — files that co-occur in correct answers get boosted across queries
    • Score distribution shape analysis — adaptive cutoff detects bimodal / uniform / power-law score distributions
    • Score gap amplification — sigmoid sharpening of close scores
    • Post-rerank filename tie-breaker — the fix for "correct directory, wrong file" at scale
    • Adversarial anti-centroid — penalty for chunks near the worst BM25 hits
    • Query-to-patch reverse engineering — generate a hypothetical fix, embed it as a 5th RRF channel
  • Incremental indexing with SHA-256 Merkle root hashing. Only changed files are re-embedded. Subsequent index runs are fast.
  • Per-repo isolated storage under ~/.kinetic/<slug>_<hash>/. Same-name folders in different parents do not collide. A manifest.json records the source path, root hash, file count, and last-indexed timestamp so kinetic status can answer "reindex needed?" in <50ms without spinning up the engine.
  • Hookable everywhere: MCP server (stdio JSON-RPC), TCP JSON, Python library, Rich CLI.
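The structure-aware chunking highlight above can be illustrated with a small sketch. This is not the engine's chunker: it swaps tree-sitter for Python's standard-library `ast` module, handles Python source only, and ignores overlap and token limits. The `Chunk` dataclass and `chunk_python_source` function are hypothetical names chosen for this example.

```python
import ast
from dataclasses import dataclass
from typing import Optional


@dataclass
class Chunk:
    name: str
    kind: str                 # "function" or "class"
    start_line: int           # 1-based, inclusive
    end_line: int             # 1-based, inclusive
    docstring: Optional[str]
    content: str


def chunk_python_source(source: str) -> list[Chunk]:
    """Return one chunk per top-level function or class; a body is never split."""
    lines = source.splitlines()
    chunks = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # Start at the first decorator so decorators stay with their definition.
            start = min([node.lineno] + [d.lineno for d in node.decorator_list])
            end = node.end_lineno
            kind = "class" if isinstance(node, ast.ClassDef) else "function"
            chunks.append(
                Chunk(node.name, kind, start, end, ast.get_docstring(node),
                      "\n".join(lines[start - 1:end]))
            )
    return chunks
```

Because chunk boundaries come from the syntax tree rather than from a fixed window, every chunk carries an exact line range and a complete definition.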

Install

pip install kinetic-context

Requires Python 3.10+. The heavy dependencies (tree-sitter, numpy, networkx, rich) are installed automatically by pip, so there is no Qdrant/Pinecone/Milvus/Postgres to set up.

Set your API keys (a Mistral key for embeddings and a ZeroEntropy key for the Zerank reranker):

export MISTRAL_API_KEY=...
export ZEROENTROPY_API_KEY=...

You can also override the embedder / reranker URLs and model IDs in KCEConfig if you want to point at a different provider.


Quick start

# Index a repo (first run; takes a few minutes for a 1k-file repo)
kinetic index ./my-repo

# Subsequent runs only re-embed files whose SHA-256 hash changed
kinetic index ./my-repo

# Search — returns code blocks with line ranges, not just file paths
kinetic query "authentication middleware" --code

# JSON output (for agent hooks / piping)
kinetic query "how does routing work" --json | jq '.results[0]'

# Check if the index is up to date
kinetic status ./my-repo

# List all indexed repos
kinetic list

Query output example

────────────────────────────────────────────────────────────────────────
 kinetic-context • semantic_question • zerank • 68ms
────────────────────────────────────────────────────────────────────────
 Query: how does routing work in flask
────────────────────────────────────────────────────────────────────────
Found 8 code blocks across 8 files

#1  src/flask/sansio/app.py  L240-262 (23 lines)  • add_url_rule (function)  • python
    def add_url_rule(self, rule, endpoint=None, view_func=None, ...)
  240 │ def add_url_rule(
  241 │     self,
  242 │     rule: str,
  243 │     endpoint: str | None = None,
  ...

Hooking into coding agents

MCP server (Claude Code, Cursor, Continue, Zed, anything MCP-aware)

Start the MCP server manually (it speaks JSON-RPC over stdio, so MCP clients usually launch it themselves):

kinetic mcp

Or add it to your agent's MCP config:

{
  "mcpServers": {
    "kinetic": {
      "command": "kinetic",
      "args": ["mcp"]
    }
  }
}

Four tools are exposed:

| Tool | Description |
| --- | --- |
| kinetic_index | Index (or incrementally update) a repo |
| kinetic_query | Search the indexed codebase; returns code blocks with line ranges |
| kinetic_status | Check whether the index is up to date |
| kinetic_list_indexes | List all indexed repos |

TCP JSON server (any agent)

kinetic serve --port 7878

Send a single-line JSON request, get a single-line JSON response:

echo '{"query": "session cookie signing", "top": 5}' | nc localhost 7878
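The same request can be sent from Python without `nc`. This is a minimal client sketch that assumes the server reads one newline-terminated JSON object and replies with one JSON line, as described above. The `query_kinetic` helper is a hypothetical name, and the shape of the reply is not documented here, so the function returns the parsed object as-is.

```python
import json
import socket


def query_kinetic(query: str, top: int = 5, host: str = "localhost",
                  port: int = 7878, timeout: float = 10.0) -> dict:
    """Send one newline-terminated JSON request and parse the one-line reply."""
    request = json.dumps({"query": query, "top": top}) + "\n"
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request.encode("utf-8"))
        # Read until the server ends the line or closes the connection.
        with sock.makefile("r", encoding="utf-8") as reader:
            line = reader.readline()
    if not line:
        raise ConnectionError("server closed the connection without replying")
    return json.loads(line)
```

With `kinetic serve --port 7878` running, `query_kinetic("session cookie signing", top=5)` mirrors the `nc` command above.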

Python library

from kce.engine import KCEEngine
from kce.config import KCEConfig
from kce.store.registry import Registry

cfg = KCEConfig()
cfg.index_dir = Registry().resolve("/path/to/repo")
cfg.ensure_dirs()
engine = KCEEngine(cfg)
engine.index("/path/to/repo")

result = engine.query("how does authentication work")
for cid in result.final_chunk_ids[:10]:
    chunk = engine.chunk_index[cid]
    print(f"{chunk.rel_path}:{chunk.start_line}-{chunk.end_line}  {chunk.name}")
    print(chunk.content)

Storage layout

Every indexed repo gets its own directory under ~/.kinetic/:

~/.kinetic/
  flask_f9cbd6f6/                       <- slug = name + short hash of abspath
    manifest.json                       <- source path, root hash, file count, last indexed
    chunks.jsonl                        <- one CodeChunk per line
    embeddings.npy                      <- (N, 1536) float32 matrix
    embeddings_ids.json                 <- row order
    ckg.graphml                         <- Code Knowledge Graph
    incremental_state.json              <- per-file SHA-256 hashes
    change_log.jsonl                    <- append-only change history
    embed_cache/embeddings.db           <- Mistral API cache (SQLite)
    zerank_cache/rerank.db              <- Zerank API cache (SQLite)
    file_summaries.json
    repo_summary.txt
  django_ea20f8f1/
    ...

Why per-repo isolation?

  • Two repos with the same folder name (e.g. ~/work/api and ~/side/api) get different slugs because the slug includes a short hash of the absolute path. No collisions.
  • Caches live next to the index, so deleting a repo's index (kinetic forget) also frees its cache.
  • The manifest.json records a Merkle root hash over all file content hashes. Recomputing this on startup is <50ms even for a 1k-file repo, so kinetic status is instant.
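The slug scheme described above (folder name plus a short hash of the absolute path) can be sketched as follows. The hash function, the 8-character length, and the `repo_slug` name are assumptions made for illustration; the engine's actual scheme may differ.

```python
import hashlib
import re
from pathlib import Path


def repo_slug(repo_path: str, hash_len: int = 8) -> str:
    """Build '<name>_<short hash of absolute path>' (hash choice is an assumption)."""
    abs_path = Path(repo_path).expanduser().resolve()
    # Keep the folder name filesystem-friendly.
    name = re.sub(r"[^A-Za-z0-9._-]+", "-", abs_path.name) or "repo"
    digest = hashlib.sha256(str(abs_path).encode("utf-8")).hexdigest()
    return f"{name}_{digest[:hash_len]}"
```

Two folders named `api` under different parents share the `api_` prefix but get different hash suffixes, so their index directories never collide.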

Efficient incremental updates

  1. On kinetic index <repo>, we first compute the current Merkle root from disk.
  2. We compare to the stored root in manifest.json. If they match, the index is up to date — we're done in <100ms.
  3. If they differ, we walk the per-file SHA-256 hashes in incremental_state.json and identify exactly which files were added, modified, or removed.
  4. Only those files are re-parsed, re-summarized, and re-embedded. Existing chunks for unchanged files are reused.
  5. The Mistral embed cache (SQLite, keyed by SHA-256 of the embed text) means even a chunk whose text didn't change but whose chunk_id was regenerated will not trigger an API call.

For a 1k-file repo where 5 files changed, a re-index takes seconds, not minutes.
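The steps above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the engine's code: the root is a flat, one-level digest over sorted (path, hash) pairs, every file under the repo is hashed (a real indexer would skip directories such as `.git`), and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_hashes(repo: Path) -> dict[str, str]:
    """Map each file's repo-relative path to the SHA-256 of its bytes."""
    hashes = {}
    for path in sorted(repo.rglob("*")):
        if path.is_file():
            rel = path.relative_to(repo).as_posix()
            hashes[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return hashes


def merkle_root(hashes: dict[str, str]) -> str:
    """Fold the sorted (path, hash) pairs into a single root digest."""
    root = hashlib.sha256()
    for rel_path in sorted(hashes):
        root.update(f"{rel_path}\0{hashes[rel_path]}\n".encode("utf-8"))
    return root.hexdigest()


def diff_hashes(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Classify paths as added, modified, or removed between two snapshots."""
    return {
        "added": sorted(p for p in new if p not in old),
        "modified": sorted(p for p in new if p in old and new[p] != old[p]),
        "removed": sorted(p for p in old if p not in new),
    }
```

Comparing two roots is a single string comparison, so the per-file diff only has to run when the roots differ.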


Architecture

The pipeline has four layers:

  1. Ingestion — tree-sitter parses each source file into an AST. The cAST chunker walks the AST and produces chunks that respect syntactic boundaries (a function is never split mid-body). Each chunk is enriched with its signature, docstring, decorators, and scope. A Code Knowledge Graph (NetworkX) is built with 8 relationship types: CALLS, INHERITS, IMPLEMENTS, IMPORTS, CONTAINS, USES_TYPE, OVERRIDES, DEPENDS_ON.

  2. Storage — chunks go to JSONL, embeddings go to a single numpy matrix on disk (+ L2-normalized in memory for fast cosine), the graph goes to GraphML. Each repo gets its own directory under ~/.kinetic/. SHA-256 Merkle root hashing drives incremental updates.

  3. Retrieval & Ranking — the query coordinator picks one of 5 query types (identifier lookup, semantic question, code completion, bug diagnosis, architecture query). For each type, it applies intent-aware boosts (source vs test vs config files), runs multi-query BM25 + dense + graph retrieval, fuses the results with weighted Reciprocal Rank Fusion, applies novel signals (resonance, DNA, semantic bridges, cohort memory), then reranks the top candidates with Zerank-2.

  4. Output — the final ranked chunks are returned as code blocks with line ranges, signature, and docstring. They can be rendered by the Rich CLI, serialized to JSON, or shipped over MCP / TCP to any coding agent.
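The fusion step in the retrieval layer can be illustrated with a generic weighted Reciprocal Rank Fusion. This sketch uses the standard RRF form with a smoothing constant `k = 60`, which is a common default and an assumption here; the engine's exact formula and constant are not stated in this document. The `weighted_rrf` name is hypothetical.

```python
from collections import defaultdict


def weighted_rrf(rankings: dict[str, list[str]], weights: dict[str, float],
                 k: int = 60) -> list[tuple[str, float]]:
    """Fuse per-channel rankings: score(d) = sum over channels of w / (k + rank)."""
    scores: dict[str, float] = defaultdict(float)
    for channel, ranked_ids in rankings.items():
        weight = weights.get(channel, 0.0)
        for rank, chunk_id in enumerate(ranked_ids, start=1):  # ranks start at 1
            scores[chunk_id] += weight / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Example with the default channel weights from the configuration table.
fused = weighted_rrf(
    {"dense": ["a", "b", "c"], "bm25": ["c", "a"], "graph": ["b"]},
    {"dense": 0.75, "bm25": 0.05, "graph": 0.03},
)
```

Because only ranks enter the formula, channels with incomparable score scales (cosine similarity, BM25, graph distance) can be combined without normalizing their raw scores.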


Benchmarks

We benchmark on two real-world repos with hand-curated query sets. We report Context F1@10, Recall@10, Precision@10, and end-to-end latency. We deliberately do not compare to other context engines — those numbers are easy to get wrong and we'd rather show our own results honestly than risk an unfair comparison.
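For reference, the per-query metrics can be computed as in the sketch below. It assumes precision is taken over the results actually returned (at most k) and that a query with no hits scores an F1 of 0; the project's benchmark script may define these details differently. The function name is hypothetical.

```python
def precision_recall_f1_at_k(retrieved: list[str], relevant: set[str],
                             k: int = 10) -> tuple[float, float, float]:
    """Precision@k, Recall@k, and F1@k for a single query."""
    top_k = list(dict.fromkeys(retrieved))[:k]  # drop duplicates, keep rank order
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1
```

Note that averaging per-query F1 over a query set generally gives a different number than taking the F1 of the averaged precision and recall.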

Charts (images not reproduced here):

  • Aggregate metrics
  • Per-query F1 (sorted, so the worst queries are at the top)
  • Query outcome distribution (perfect / partial / failed)
  • Latency distribution
  • Quality vs repository scale

Numbers

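| Corpus | Files | Chunks | Graph (nodes / edges) | Queries | F1@10 | Recall@10 | Precision@10 | Avg latency |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Flask | 83 | 1,382 | 1,594 / 7,906 | 30 | 0.659 | 0.833 | 0.599 | 68 ms |
| Django (core) | 308 | 7,207 | 6,574 / 46,408 | 30 | 0.647 | 0.867 | 0.559 | 433 ms |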

The Django benchmark uses the core Django packages (django/db/, django/http/, django/urls/, django/template/, django/forms/, django/core/, django/contrib/auth/, django/contrib/sessions/) — the parts of Django that real coding agents actually search. The full Django repo includes 3,000+ files of migrations, tests, and docs that bloat the index without improving retrieval quality on real-world queries.

Run the benchmarks yourself:

git clone https://github.com/pallets/flask /tmp/flask
git clone https://github.com/django/django /tmp/django
kinetic index /tmp/flask
kinetic index /tmp/django
python scripts/run_bench.py flask
python scripts/run_bench.py django
python scripts/gen_charts.py

Configuration

All knobs are in KCEConfig. The defaults are tuned for the Mistral Codestral Embed + Zerank-2 combination. You can override:

| Setting | Default | What it controls |
| --- | --- | --- |
| mistral_embed_model | codestral-embed | Embedding model |
| zerank_model | zerank-2 | Reranker model |
| chunk_max_tokens | 512 | Max chunk size |
| chunk_overlap_pct | 0.35 | Chunk overlap |
| bm25_k1, bm25_b | 1.5, 0.75 | BM25 params |
| rrf_dense, rrf_bm25, rrf_graph | 0.75, 0.05, 0.03 | RRF channel weights |
| retrieval_top_k | 50 | Candidates per channel |
| rerank_top_n | 15 | Candidates sent to reranker |
| final_top_n | 10 | Final results returned |
| filename_keyword_boost | 2.0 | Post-rerank filename tie-breaker |
| tier_c_penalty | 0.4 | Penalty for abstract base classes |
| context_budget_tokens | 4096 | Context assembly budget |
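To show what bm25_k1 and bm25_b control, here is a small sketch of Okapi BM25 over code-aware tokens. It is an illustration, not the engine's implementation: the tokenizer rules, the smoothed IDF variant, and the names `code_tokens` and `bm25_scores` are all assumptions made for this example.

```python
import math
import re
from collections import Counter

# Anything that is not a letter or digit separates tokens (dots, underscores, etc.).
_SEPARATORS = re.compile(r"[^A-Za-z0-9]+")
# Inside one alphanumeric run: acronyms, Capitalized or lowercase words, digits.
_CAMEL_PARTS = re.compile(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|[0-9]+")


def code_tokens(text: str) -> list[str]:
    """Split camelCase, snake_case, and dotted identifiers into lowercase tokens."""
    tokens: list[str] = []
    for run in _SEPARATORS.split(text):
        if run:
            tokens.extend(part.lower() for part in _CAMEL_PARTS.findall(run))
    return tokens


def bm25_scores(query: list[str], docs: list[list[str]],
                k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score every tokenized document against the query tokens."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs if n_docs else 0.0
    doc_freq = Counter(term for d in docs for term in set(d))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF that stays non-negative for very common terms.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # k1 caps term-frequency saturation; b scales length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

Splitting identifiers before scoring lets a query such as `addUrlRule` match a definition named `add_url_rule`, which a whitespace tokenizer would miss.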

API keys are read from MISTRAL_API_KEY and ZEROENTROPY_API_KEY env vars.


Why not just use ripgrep / the IDE's built-in search?

Lexical search finds the keyword, not the concept. Asking "how does routing work in flask" with grep returns every line containing "route", most of which are irrelevant. kinetic-context returns the 8 code blocks that actually implement routing, with their full bodies and signatures, in 68 ms.

Why not just stuff the whole repo into the LLM context window?

A 1k-file repo is roughly 500k tokens of source alone. Sending all of it is expensive and slow, and LLM attention degrades badly past ~100k tokens. kinetic-context returns the 10 chunks that matter, which fit in any context window and use roughly 100x fewer tokens per query (a 4,096-token context budget versus ~500k).

Why a Code Knowledge Graph?

Because "what calls what" matters. When you ask "where is request used?", the graph says "the request proxy is defined in flask/globals.py, used in 47 places, and its setter is in flask/app.py". Pure lexical search sees the word request 500 times. The graph sees the relationship.
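A toy version of a relationship query makes the point concrete. The sketch below is a plain-Python stand-in for a typed-edge graph, not the engine's NetworkX-based implementation; the `CodeGraph` class and the symbol names in the usage example are hypothetical.

```python
from collections import defaultdict, deque


class CodeGraph:
    """Directed graph with typed edges, e.g. ("login_view", "CALLS", "check_password")."""

    def __init__(self) -> None:
        self._incoming = defaultdict(set)  # (target, relation) -> {sources}

    def add_edge(self, source: str, relation: str, target: str) -> None:
        self._incoming[(target, relation)].add(source)

    def sources(self, target: str, relation: str) -> list[str]:
        """Direct predecessors of `target` over one relationship type."""
        return sorted(self._incoming.get((target, relation), ()))

    def transitive_sources(self, target: str, relation: str) -> list[str]:
        """Everything that reaches `target` through chains of `relation` edges."""
        seen, queue = set(), deque([target])
        while queue:
            for src in self.sources(queue.popleft(), relation):
                if src not in seen:
                    seen.add(src)
                    queue.append(src)
        return sorted(seen)


graph = CodeGraph()
graph.add_edge("login_view", "CALLS", "check_password")
graph.add_edge("api_login", "CALLS", "check_password")
graph.add_edge("check_password", "CALLS", "hash_password")
callers = graph.transitive_sources("hash_password", "CALLS")
```

Here `callers` contains every function that can reach `hash_password`, including indirect callers that never mention it by name and that a text search would therefore miss.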


License

MIT. See LICENSE.

Contributing

Issues and PRs welcome at github.com/notlousybook/kinetic-context.
