Coordinate index layer for LLM context — Helix weighs, doesn't retrieve
Project description
Helix Context
Coordinate-index engine for LLM agents. Retrieves, weighs, and compresses your codebase into a context window — without a single LLM call on the retrieval path.
Proof (30 seconds)
WIP benchmark numbers — compressor disabled (default LLM-free config), N=15 query shapes, May 2026:
| metric | tokens | vs standard RAG (top-5 @ 1500) |
|---|---|---|
| median | 2,757 | 2.9× fewer tokens |
| best (focused query) | 1,410 | 5.7× |
| worst (broad 12-doc) | 3,755 | 2.1× |
With the optional compressor enabled (Claude Haiku splice), median improves to ~5×. In multi-turn sessions, the session delivery register elides already-seen documents — observed 37× reduction on repeated retrievals within a conversation.
Reproducer: python benchmarks/bench_rag_vs_sike_tokens.py against your own genome.
Agent contract: every /context response carries know { found, confidence } (grounded — you may answer) or miss { reason, escalate_to } (not found — don't answer from genome). Stale results downgrade to miss(reason="stale"|"cold"|"superseded") via the freshness gate.
Get started (60 seconds)
# 1. Install
pip install helix-context
python -m spacy download en_core_web_sm
# 2. Ingest your codebase
helix ingest path/to/your/project/ --recursive
# 3. Query it
helix query "how does the splice step work?"
# 4. Or start the proxy for IDE integration
helix-server # binds to 127.0.0.1:11437
For extras matrix, BGE-M3 backfill, and tray setup: docs/SETUP.md.
Agent surfaces
Three ways to drive Helix — same retrieval primitives, same JSON shapes:
| Surface | Best for | Example |
|---|---|---|
| CLI | Scripts, CI, cold-start agents | helix query "..." --json |
| MCP | Claude Code, Cursor, Claude Desktop | Add to settings.json |
| HTTP proxy | Continue IDE, OPENAI_BASE_URL redirect |
POST /context |
# CLI — no server, no daemon, subprocess-drivable
helix query "what does the splice step do?" --json
helix packet "edit the splice step" --task-type edit --json
helix gene get abc123 --json
helix neighbors "splice step" --k 10 --json
helix refresh-targets "edit the splice step" --json
helix status
helix diag corpus
Full CLI reference: docs/clients/cli.md.
MCP tool schemas: docs/api/mcp-tools.md.
Pipeline (2 minutes)
Seven stages per turn, all LLM-free except optional splice:
query
│
▼
┌──────────────┐
│ 0. Classify │ rule-based: decoder mode + assembly cap
└──────┬───────┘
▼
┌──────────────┐
│ 1. Extract │ heuristic keyword + entity extraction
└──────┬───────┘
▼
┌──────────────┐ FTS5 BM25 + BGE-M3 dense (1024-dim) + tags
│ 2. Retrieve │ + synonym expansion + co-activation + SR
│ │ ranked via RRF or additive fusion
└──────┬───────┘
▼
┌──────────────┐
│ 3. Re-rank │ CPU classifier scores (optional)
└──────┬───────┘
▼
┌──────────────┐
│ 4. Splice │ Headroom Kompress (CPU) or LLM compressor
└──────┬───────┘
▼
┌──────────────┐ token budget + legibility headers (fired tiers,
│ 5. Assemble │ confidence ◆/◇/⬦, compression ratio) +
│ + Stage 7 │ freshness gate (stale/cold/superseded → miss)
└──────┬───────┘ + session delivery (elide already-seen docs)
▼
┌──────────────┐
│ 6. Persist │ query+response → knowledge store (background)
└──────┘───────┘
▼
know { } or miss { }
- know/miss contract:
knowmeans the context is grounded, agent may answer.missmeans don't answer from genome — escalate viaescalate_totools or refetch fromrefresh_targets. - Caller model class:
/contextacceptscaller_model_class: "generic" | "small_moe" | "frontier"to select render branch (ordering, assembly cap, decoder mode). See docs/api/context-endpoint.md §7.
Configuration (17 sections in helix.toml)
| Section | Key settings |
|---|---|
[ribosome] |
enabled, backend ("none" / "litellm" / "claude" / "deberta"), query_expansion |
[hardware] |
Device auto-detection (CUDA → ROCm → MPS → CPU) |
[budget] |
expression_tokens (7k default), max_genes_per_turn, splice_aggressiveness, legibility_enabled, session_delivery_enabled |
[session] |
Synthetic session windows, default party_id |
[genome] |
path (genomes/main/genome.db), compact_interval, replicas |
[server] |
host, port, upstream |
[headroom] |
Optional Headroom proxy lifecycle |
[ingestion] |
backend ("cpu" / "ollama"), splade_enabled, entity_graph |
[context] |
Cold-tier retrieval: enabled, k, min_cosine |
[cymatics] |
Frequency-domain scoring, harmonic_links, distance_metric |
[classifier] |
Rule-based query classification thresholds |
[retrieval] |
fusion_mode ("additive" / "rrf"), SR, ray_trace_theta, seeded_edges |
[plr] |
Piecewise linear reranker model |
[know] |
Know/miss calibration: confidence_floor, margin_threshold |
[mem_sync] |
Auto-memory → helix sync: watch_dirs, interval |
[synonyms] |
Query expansion map (e.g., "cache" → ["redis", "ttl"]) |
[abstain] |
Low-confidence abstention thresholds |
Full reference: docs/config-reference.md.
Full endpoint reference
Core retrieval:
| Endpoint | Purpose |
|---|---|
POST /context |
know/miss + expressed_context (primary) |
POST /context/packet |
Agent-safe bundle: verified / stale_risk / refresh_targets |
POST /context/refresh-plan |
Refresh targets only (reread plan) |
POST /fingerprint |
Navigation-first payload (scores, no body) |
GET /context/expand |
1-hop neighborhood from a gene_id |
POST /v1/chat/completions |
OpenAI-compatible proxy |
Ingestion + maintenance:
| Endpoint | Purpose |
|---|---|
POST /ingest |
Add content to the knowledge store |
POST /consolidate |
Rewrite stale docs from source fingerprints |
POST /admin/refresh |
Force retrieval-layer refresh |
POST /admin/vacuum |
Reclaim SQLite pages |
POST /admin/swap-db |
Hot-swap the .db file without restart |
Identity + sessions:
| Endpoint | Purpose |
|---|---|
POST /sessions/register |
Register agent participant |
GET /sessions |
List registered participants |
GET /session/{id}/manifest |
Session delivery log |
POST /hitl/emit |
Record HITL pause event |
Diagnostics:
| Endpoint | Purpose |
|---|---|
GET /stats |
Corpus metrics + compression ratio |
GET /health |
Model, doc count, calibration provenance |
GET /genes/{gene_id} |
Single document detail |
GET /debug/resonance |
Tier activation profile |
GET /metrics/tokens |
Token usage counters |
Full schema: docs/api/endpoints.md.
Package structure (16 packages, post-PR #90)
| Package | Purpose |
|---|---|
adapters/ |
Cache, DAL, external retriever protocol |
backends/ |
Compressor, BGE-M3 codec, DeBERTa, NLI, SEMA, SPLADE |
cli/ |
helix CLI: query, packet, gene, neighbors, ingest, diag, config, status |
encoding/ |
Chunking, fragments, legibility headers, Headroom bridge |
identity/ |
CWoLa logger, session delivery, registry, provenance, claims |
pipeline/ |
Tier logic, stage helpers |
retrieval/ |
Expand, freshness, RRF/additive fusion, PLR, intent router, SR, seeded edges, query classifier |
scoring/ |
Cymatics, know-calibration, know-decision, ray-trace, TCM |
server/ |
FastAPI app factory + route modules (context, ingest, registry, admin) |
storage/ |
DDL, indexes, co-activation graph |
telemetry/ |
OTel metrics, histogram instrumentation |
vault/ |
Obsidian vault export (diagnostic traces) |
launcher/ |
System-tray supervisor |
mcp/ |
MCP tool surface for Claude Code / Desktop |
integrations/ |
ScoreRift bridge |
Back-compat shims: genome.py, ribosome.py, server.py, replication.py, hgt.py re-export from new locations. Lexicon: docs/ROSETTA.md.
IDE + MCP integration
MCP setup (Claude Code / Cursor / Claude Desktop)
{
"mcpServers": {
"helix-context": {
"command": "python",
"args": ["-m", "helix_context.mcp_server"],
"cwd": "/absolute/path/to/your/project",
"env": { "HELIX_MCP_URL": "http://127.0.0.1:11437" }
}
}
}
Continue IDE
models:
- name: Helix (Local)
provider: openai
model: gemma3:e4b
apiBase: http://127.0.0.1:11437/v1
apiKey: EMPTY
roles: [chat]
defaultCompletionOptions:
contextLength: 128000
maxTokens: 4096
OpenAI-compatible proxy (zero code changes)
OPENAI_BASE_URL=http://localhost:11437/v1 your-app
Knowledge store management
[genome]
path = "genomes/main/genome.db" # relative to helix run directory
Backup (safe while running — WAL mode):
cp genomes/main/genome.db backups/genome-$(date +%Y%m%d).db
BGE-M3 backfill (one-time, after install):
python scripts/backfill_bgem3_v2.py genomes/main/genome.db
Observability
scripts\setup-grafana-telem.ps1 # Windows
scripts/setup-grafana-telem.sh # Linux / macOS
Dashboard: http://localhost:3000/d/helix-overview. Full surface: docs/architecture/OBSERVABILITY.md.
Gotchas
- Knowledge store path is
genomes/main/genome.db(not project root). Delete to start fresh. - BGE-M3 backfill is one-time post-install —
embedding_dense_v2 IS NULLuntil you runscripts/backfill_bgem3_v2.py. Low retrieval rate without it. - Fusion mode defaults to
"additive"(back-compat). Flip to"rrf"in[retrieval]after runningscripts/calibrate_thresholds.py. - Session delivery (
session_delivery_enabled = true) tracks delivered docs per session, elides repeats. ~40% token savings on multi-turn. Passignore_delivered: truein/contextbody for benchmarks. - know/miss contract requires the agent prompt fragment to be honored — without it, frontier models confabulate. Import
helix_context.agent_prompt.full_fragment(). - Naming lexicon: biology terms (gene, genome, ribosome) have canonical software equivalents (document, knowledge store, compressor). Both work in code; new code uses software terms. See docs/ROSETTA.md.
Testing
python -m pytest tests/ -m "not live" -v # ~1950 tests, no external services
Documentation
Acknowledgments
Built on: spaCy NER · Howard 2005 TCM · Stachenfeld 2017 SR · SQLite FTS5 BM25 · BGE-M3 · Kompress · Headroom
License
Apache-2.0. See NOTICE for third-party attributions.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file helix_context-0.6.2.tar.gz.
File metadata
- Download URL: helix_context-0.6.2.tar.gz
- Upload date:
- Size: 3.3 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
560e230a6e795afa0f6ef54e8a76c875f29657ec73997874c1ee3a09ed32bc9a
|
|
| MD5 |
7ba5be8f98b225b48441b62e9ec9d651
|
|
| BLAKE2b-256 |
fbf08fbe8f240333aca6591cf3b071ea568681a18c08e924828080d55dc3c04a
|
File details
Details for the file helix_context-0.6.2-py3-none-any.whl.
File metadata
- Download URL: helix_context-0.6.2-py3-none-any.whl
- Upload date:
- Size: 595.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
dd73ddf3b8e7aafa3039308d974c6065a72d0ae44828ebf1af771504ed5aae41
|
|
| MD5 |
bc0aafa17e2470375bf7e7f072826b5a
|
|
| BLAKE2b-256 |
5907d73f2038e82f3e61a3663a68a359648c253c52b3fa658a25381d6bc9426c
|