Coordinate index layer for LLM context — Helix weighs, doesn't retrieve

Helix Context

License: Apache 2.0 · PyPI version · Python 3.11+ · Tests: 1950+ · LLM-free pipeline · Paper: Agentome

Coordinate-index engine for LLM agents. Retrieves, weighs, and compresses your codebase into a context window — without a single LLM call on the retrieval path.


Proof (30 seconds)

WIP benchmark numbers — compressor disabled (default LLM-free config), N=15 query shapes, May 2026:

Metric                  Tokens   vs standard RAG (top-5 @ 1500)
median                  2,757    2.9× fewer tokens
best (focused query)    1,410    5.7× fewer tokens
worst (broad 12-doc)    3,755    2.1× fewer tokens
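
The table reports Helix token counts and reduction ratios but not the baseline itself. As a rough sanity check, multiplying each row back out suggests the standard-RAG baseline was close to 8,000 tokens per query. This is back-of-envelope arithmetic on the rounded figures above, not output from the benchmark script, and the names below are illustrative.

```python
# Back-compute the implied standard-RAG baseline from the benchmark table.
# (tokens, reduction ratio) pairs are copied from the table; ratios are rounded,
# so the three implied baselines will not agree exactly.
rows = {
    "median": (2757, 2.9),
    "best (focused query)": (1410, 5.7),
    "worst (broad 12-doc)": (3755, 2.1),
}


def implied_baseline(helix_tokens: int, ratio: float) -> float:
    """Baseline tokens = Helix tokens multiplied by the reported reduction ratio."""
    return helix_tokens * ratio


baselines = {name: implied_baseline(t, r) for name, (t, r) in rows.items()}

for name, value in baselines.items():
    print(f"{name:<22} ~{value:,.0f} implied baseline tokens")
```

All three rows land within a few percent of each other, which is what you would expect if every query shape was compared against the same fixed-size RAG window.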

With the optional compressor enabled (Claude Haiku splice), median improves to ~5×. In multi-turn sessions, the session delivery register elides already-seen documents — observed 37× reduction on repeated retrievals within a conversation.
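
The session-delivery idea can be sketched as a per-session set of document IDs that have already been sent. The class below is a hypothetical illustration, not Helix's implementation: the class and method names are invented, and the choice that ignore_delivered neither elides nor records is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class SessionDeliveryRegister:
    """Hypothetical per-session record of document IDs already delivered."""

    delivered: dict[str, set[str]] = field(default_factory=dict)

    def filter_new(
        self,
        session_id: str,
        doc_ids: list[str],
        ignore_delivered: bool = False,
    ) -> list[str]:
        """Return the docs to send this turn, preserving ranking order."""
        seen = self.delivered.setdefault(session_id, set())
        if ignore_delivered:
            # Benchmark mode (assumed behavior): send everything, record nothing.
            return list(doc_ids)
        fresh = [d for d in doc_ids if d not in seen]
        seen.update(fresh)
        return fresh


register = SessionDeliveryRegister()
first = register.filter_new("s1", ["a", "b", "c"])
second = register.filter_new("s1", ["b", "c", "d"])
print(first, second)
```

On the second turn only the unseen document is sent, which is where the savings on repeated retrievals come from.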

Reproducer: python benchmarks/bench_rag_vs_sike_tokens.py against your own genome.

Agent contract: every /context response carries either know { found, confidence } or miss { reason, escalate_to }. know means the context is grounded and the agent may answer. miss means nothing usable was found and the agent must not answer from the genome. The freshness gate downgrades stale results to a miss with reason "stale", "cold", or "superseded".
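
A minimal sketch of how an agent might branch on this contract. It assumes the response JSON exposes know or miss as a top-level key and that escalate_to is a list; neither detail is confirmed here, so check docs/api/context-endpoint.md for the real shape. The function name and the returned action dictionary are illustrative.

```python
from typing import Any


def decide(response: dict[str, Any]) -> dict[str, Any]:
    """Map a /context response to an agent action under the know/miss contract."""
    if "know" in response:
        know = response["know"]
        return {"action": "answer", "confidence": know.get("confidence")}
    if "miss" in response:
        miss = response["miss"]
        return {
            "action": "escalate",
            "reason": miss.get("reason"),
            "escalate_to": miss.get("escalate_to", []),
        }
    # Neither verdict present: refuse to answer rather than guess.
    return {"action": "escalate", "reason": "malformed", "escalate_to": []}


print(decide({"miss": {"reason": "stale", "escalate_to": ["reread_file"]}}))
```

Treating an unrecognized response as a miss keeps the failure mode conservative: the agent escalates instead of answering without grounding.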

Get started (60 seconds)

# 1. Install
pip install helix-context
python -m spacy download en_core_web_sm

# 2. Ingest your codebase
helix ingest path/to/your/project/ --recursive

# 3. Query it
helix query "how does the splice step work?"

# 4. Or start the proxy for IDE integration
helix-server   # binds to 127.0.0.1:11437

For extras matrix, BGE-M3 backfill, and tray setup: docs/SETUP.md.

Agent surfaces

Three ways to drive Helix — same retrieval primitives, same JSON shapes:

Surface       Best for                                  Example
CLI           Scripts, CI, cold-start agents            helix query "..." --json
MCP           Claude Code, Cursor, Claude Desktop       Add to settings.json
HTTP proxy    Continue IDE, OPENAI_BASE_URL redirect    POST /context

# CLI — no server, no daemon, subprocess-drivable
helix query    "what does the splice step do?" --json
helix packet   "edit the splice step" --task-type edit --json
helix gene get abc123 --json
helix neighbors "splice step" --k 10 --json
helix refresh-targets "edit the splice step" --json
helix status
helix diag corpus

Full CLI reference: docs/clients/cli.md. MCP tool schemas: docs/api/mcp-tools.md.
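
Because the CLI is subprocess-drivable, a script or agent can call it and parse the --json output. The sketch below uses only the helix query "..." --json form shown above; it assumes stdout contains a single JSON document, and the function names are illustrative.

```python
import json
import subprocess
from typing import Any


def build_query_argv(question: str) -> list[str]:
    """argv for `helix query "<question>" --json`, as shown in the CLI examples."""
    return ["helix", "query", question, "--json"]


def run_query(question: str, timeout: float = 60.0) -> Any:
    """Run the CLI and parse stdout as JSON (assumes one JSON document on stdout)."""
    proc = subprocess.run(
        build_query_argv(question),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
        timeout=timeout,
    )
    return json.loads(proc.stdout)


print(build_query_argv("what does the splice step do?"))
```

Passing the question as its own argv element avoids shell quoting problems. Calling run_query requires helix to be installed and on PATH.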

Pipeline (2 minutes)

Seven stages per turn, all LLM-free except optional splice:

  query
    │
    ▼
┌──────────────┐
│ 0. Classify  │  rule-based: decoder mode + assembly cap
└──────┬───────┘
       ▼
┌──────────────┐
│ 1. Extract   │  heuristic keyword + entity extraction
└──────┬───────┘
       ▼
┌──────────────┐  FTS5 BM25 + BGE-M3 dense (1024-dim) + tags
│ 2. Retrieve  │  + synonym expansion + co-activation + SR
│              │  ranked via RRF or additive fusion
└──────┬───────┘
       ▼
┌──────────────┐
│ 3. Re-rank   │  CPU classifier scores (optional)
└──────┬───────┘
       ▼
┌──────────────┐
│ 4. Splice    │  Headroom Kompress (CPU) or LLM compressor
└──────┬───────┘
       ▼
┌──────────────┐  token budget + legibility headers (fired tiers,
│ 5. Assemble  │  confidence ◆/◇/⬦, compression ratio) +
│   + Stage 7  │  freshness gate (stale/cold/superseded → miss)
└──────┬───────┘  + session delivery (elide already-seen docs)
       ▼
┌──────────────┐
│ 6. Persist   │  query+response → knowledge store (background)
└──────┬───────┘
       ▼
   know { } or miss { }

  • know/miss contract: know means the context is grounded and the agent may answer. miss means the agent must not answer from the genome; it should escalate via the escalate_to tools or refetch from refresh_targets.
  • Caller model class: /context accepts caller_model_class: "generic" | "small_moe" | "frontier" to select the render branch (ordering, assembly cap, decoder mode). See docs/api/context-endpoint.md §7.
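
As a sketch of the caller_model_class parameter, the snippet below builds (but does not send) a POST /context request with the standard library. The "query" body key is an assumption made for illustration; the host and port are the helix-server default quoted earlier.

```python
import json
from urllib.request import Request

CALLER_MODEL_CLASSES = ("generic", "small_moe", "frontier")


def build_context_request(
    query: str,
    caller_model_class: str = "generic",
    base_url: str = "http://127.0.0.1:11437",
) -> Request:
    """Build a POST /context request; rejects unknown caller model classes."""
    if caller_model_class not in CALLER_MODEL_CLASSES:
        raise ValueError(f"unknown caller_model_class: {caller_model_class!r}")
    # "query" is a hypothetical field name; see docs/api/context-endpoint.md.
    body = {"query": query, "caller_model_class": caller_model_class}
    return Request(
        f"{base_url}/context",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_context_request("how does the splice step work?", "frontier")
print(req.get_method(), req.full_url)
```

To send it, pass the request to urllib.request.urlopen while helix-server is running.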

Configuration (17 sections in helix.toml)

Section       Key settings
[ribosome]    enabled, backend ("none" / "litellm" / "claude" / "deberta"), query_expansion
[hardware]    Device auto-detection (CUDA → ROCm → MPS → CPU)
[budget]      expression_tokens (7k default), max_genes_per_turn, splice_aggressiveness, legibility_enabled, session_delivery_enabled
[session]     Synthetic session windows, default party_id
[genome]      path (genomes/main/genome.db), compact_interval, replicas
[server]      host, port, upstream
[headroom]    Optional Headroom proxy lifecycle
[ingestion]   backend ("cpu" / "ollama"), splade_enabled, entity_graph
[context]     Cold-tier retrieval: enabled, k, min_cosine
[cymatics]    Frequency-domain scoring, harmonic_links, distance_metric
[classifier]  Rule-based query classification thresholds
[retrieval]   fusion_mode ("additive" / "rrf"), SR, ray_trace_theta, seeded_edges
[plr]         Piecewise linear reranker model
[know]        Know/miss calibration: confidence_floor, margin_threshold
[mem_sync]    Auto-memory → helix sync: watch_dirs, interval
[synonyms]    Query expansion map (e.g., "cache" → ["redis", "ttl"])
[abstain]     Low-confidence abstention thresholds

Full reference: docs/config-reference.md.

Full endpoint reference

Core retrieval:

Endpoint                      Purpose
POST /context                 know/miss + expressed_context (primary)
POST /context/packet          Agent-safe bundle: verified / stale_risk / refresh_targets
POST /context/refresh-plan    Refresh targets only (reread plan)
POST /fingerprint             Navigation-first payload (scores, no body)
GET /context/expand           1-hop neighborhood from a gene_id
POST /v1/chat/completions     OpenAI-compatible proxy

Ingestion + maintenance:

Endpoint                      Purpose
POST /ingest                  Add content to the knowledge store
POST /consolidate             Rewrite stale docs from source fingerprints
POST /admin/refresh           Force retrieval-layer refresh
POST /admin/vacuum            Reclaim SQLite pages
POST /admin/swap-db           Hot-swap the .db file without restart

Identity + sessions:

Endpoint                      Purpose
POST /sessions/register       Register agent participant
GET /sessions                 List registered participants
GET /session/{id}/manifest    Session delivery log
POST /hitl/emit               Record HITL pause event

Diagnostics:

Endpoint                      Purpose
GET /stats                    Corpus metrics + compression ratio
GET /health                   Model, doc count, calibration provenance
GET /genes/{gene_id}          Single document detail
GET /debug/resonance          Tier activation profile
GET /metrics/tokens           Token usage counters

Full schema: docs/api/endpoints.md.

Package structure (16 packages, post-PR #90)

Package          Purpose
adapters/        Cache, DAL, external retriever protocol
backends/        Compressor, BGE-M3 codec, DeBERTa, NLI, SEMA, SPLADE
cli/             helix CLI: query, packet, gene, neighbors, ingest, diag, config, status
encoding/        Chunking, fragments, legibility headers, Headroom bridge
identity/        CWoLa logger, session delivery, registry, provenance, claims
pipeline/        Tier logic, stage helpers
retrieval/       Expand, freshness, RRF/additive fusion, PLR, intent router, SR, seeded edges, query classifier
scoring/         Cymatics, know-calibration, know-decision, ray-trace, TCM
server/          FastAPI app factory + route modules (context, ingest, registry, admin)
storage/         DDL, indexes, co-activation graph
telemetry/       OTel metrics, histogram instrumentation
vault/           Obsidian vault export (diagnostic traces)
launcher/        System-tray supervisor
mcp/             MCP tool surface for Claude Code / Desktop
integrations/    ScoreRift bridge

Back-compat shims: genome.py, ribosome.py, server.py, replication.py, hgt.py re-export from new locations. Lexicon: docs/ROSETTA.md.

IDE + MCP integration

MCP setup (Claude Code / Cursor / Claude Desktop)

{
  "mcpServers": {
    "helix-context": {
      "command": "python",
      "args": ["-m", "helix_context.mcp_server"],
      "cwd": "/absolute/path/to/your/project",
      "env": { "HELIX_MCP_URL": "http://127.0.0.1:11437" }
    }
  }
}

Continue IDE

models:
  - name: Helix (Local)
    provider: openai
    model: gemma3:e4b
    apiBase: http://127.0.0.1:11437/v1
    apiKey: EMPTY
    roles: [chat]
    defaultCompletionOptions:
      contextLength: 128000
      maxTokens: 4096

OpenAI-compatible proxy (zero code changes)

OPENAI_BASE_URL=http://localhost:11437/v1 your-app

Knowledge store management

[genome]
path = "genomes/main/genome.db"   # relative to helix run directory

Backup (safe while running — WAL mode):

cp genomes/main/genome.db backups/genome-$(date +%Y%m%d).db

BGE-M3 backfill (one-time, after install):

python scripts/backfill_bgem3_v2.py genomes/main/genome.db

Observability

scripts\setup-grafana-telem.ps1     # Windows
scripts/setup-grafana-telem.sh      # Linux / macOS

Dashboard: http://localhost:3000/d/helix-overview. Full surface: docs/architecture/OBSERVABILITY.md.

Gotchas

  • Knowledge store lives at genomes/main/genome.db, not in the project root. Delete the file to start fresh.
  • BGE-M3 backfill is a one-time post-install step: embedding_dense_v2 stays NULL until you run scripts/backfill_bgem3_v2.py, and the retrieval rate is low without it.
  • Fusion mode defaults to "additive" (back-compat). Switch to "rrf" in [retrieval] after running scripts/calibrate_thresholds.py.
  • Session delivery (session_delivery_enabled = true) tracks delivered docs per session and elides repeats, saving ~40% of tokens on multi-turn sessions. Pass ignore_delivered: true in the /context body for benchmarks.
  • The know/miss contract only works if the agent honors the prompt fragment; without it, frontier models confabulate. Import helix_context.agent_prompt.full_fragment().
  • Naming lexicon: biology terms (gene, genome, ribosome) have canonical software equivalents (document, knowledge store, compressor). Both work in code; new code uses software terms. See docs/ROSETTA.md.
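
For readers unfamiliar with the "rrf" fusion mode: RRF conventionally stands for reciprocal rank fusion, which merges several ranked lists by summing 1 / (k + rank) for each document. The sketch below is the textbook form with the commonly used k = 60. Helix's actual constant, tie-breaking, and tier weighting are not documented here, so treat every name and default as an assumption.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked ID lists: score(d) = sum over lists of 1 / (k + rank), rank from 1."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by document ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


bm25 = ["g1", "g2", "g3"]   # hypothetical lexical ranking
dense = ["g2", "g4", "g1"]  # hypothetical dense-embedding ranking
fused = rrf_fuse([bm25, dense])
print([doc for doc, _ in fused])
```

RRF uses only ranks, not raw scores, so it needs no per-retriever score normalization. Additive fusion, by contrast, sums the scores themselves, which is presumably why switching modes is paired with a threshold calibration step.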

Testing

python -m pytest tests/ -m "not live" -v   # ~1950 tests, no external services

Documentation

Start here            Go deeper
Setup guide           Pipeline lanes
Troubleshooting       Retrieval dimensions
/context API          Knowledge graph
Config reference      Session registry
Agent SDK fragment    Observability
Operator runbooks     Launcher architecture

Acknowledgments

Built on: spaCy NER · Howard 2005 TCM · Stachenfeld 2017 SR · SQLite FTS5 BM25 · BGE-M3 · Kompress · Headroom

License

Apache-2.0. See NOTICE for third-party attributions.
