Skip to main content

Genome-based context compression for local LLMs

Project description

Helix Context

Genome-based context compression for local LLMs.

Makes 9k tokens of context window feel like 600k by treating context like a genome instead of a flat text buffer.

  Client (Continue, Cursor, any OpenAI client)
         |
         v
  +--------------------------+
  |  Helix Proxy (FastAPI)   |  Port 11437
  |  /v1/chat/completions    |  OpenAI-compatible
  |                          |
  |  1. Extract query        |
  |  2. Express pipeline     |  <-- Genome (SQLite)
  |  3. Inject context       |  <-- Ribosome (CPU model)
  |  4. Forward to Ollama    |  --> localhost:11434
  |  5. Stream tee response  |
  |  6. Background replicate |
  +--------------------------+

Instead of stuffing your entire codebase into the prompt, Helix compresses it into a persistent SQLite genome and expresses only the relevant genes per turn. The model sees compressed context, not raw text. Conversations replicate back into the genome automatically, building institutional memory over time.

Quick Start

# Install from PyPI (beta)
pip install helix-context --pre

# Pull a small model for the ribosome (context codec)
ollama pull gemma4:e2b

# Start the proxy
helix
# or: python -m uvicorn helix_context.server:app --host 127.0.0.1 --port 11437

# Seed the genome with your own project files
python examples/seed_genome.py path/to/your/project/

# Check genome health
curl http://127.0.0.1:11437/stats

Point any OpenAI-compatible client at http://127.0.0.1:11437/v1 and start chatting. Context compression happens transparently.

How It Works

6-step expression pipeline per turn:

Step What Cost Blocking?
1. Extract Heuristic keyword extraction from query 0 tokens No
2. Express SQLite promoter lookup + synonym expansion + co-activation 0 tokens No
3. Re-rank Small CPU model scores candidates by relevance ~300 tokens Yes
4. Splice Small CPU model trims introns, keeps exons (batched) ~600 tokens Yes
5. Assemble Join spliced parts, enforce token budget, wrap in tags 0 tokens No
6. Replicate Pack query+response exchange back into genome ~300 tokens No (background)

Token budget:

  • 3k tokens: ribosome decoder prompt (fixed, tells the big model how to read codons)
  • 6k tokens: expressed context (compressed, spliced)
  • 600k+: genome cold storage (SQLite, never fully loaded)

Key Features

Context Health Monitor (Delta-Epsilon)

Every query computes a health signal measuring how well the genome served it:

{
  "context_health": {
    "ellipticity": 0.82,
    "coverage": 0.75,
    "density": 0.68,
    "freshness": 1.0,
    "genes_expressed": 3,
    "genes_available": 42,
    "status": "aligned"
  }
}
Status Ellipticity Meaning
aligned >= 0.7 Genome is well-grounded, model is informed
sparse >= 0.3 Gaps exist, model may guess on some topics
stale any Expressed genes are outdated (low freshness)
denatured < 0.3 Context is unreliable, high hallucination risk

Horizontal Gene Transfer (HGT)

Export a genome and import it into another Helix instance:

# Export
python examples/hgt_transfer.py export -d "Project knowledge snapshot"

# Preview what an import would change
python examples/hgt_transfer.py diff genome_export.helix

# Import into another instance
python examples/hgt_transfer.py import genome_export.helix

Three merge strategies: skip_existing (safe default), overwrite, newest. Content-addressed gene IDs ensure deduplication across instances.

Associative Memory

Genes that are frequently expressed together build co-activation links. When you query for topic A, the genome also pulls in topic B if they've been co-expressed before. This creates an organic associative memory that grows smarter over time.

Synonym Expansion

Configure lightweight query expansion in helix.toml:

[synonyms]
cache = ["redis", "ttl", "invalidation", "cdn"]
auth = ["jwt", "login", "security", "token"]

When a user asks about "cache", the genome also searches for "redis", "ttl", etc.

HTTP Endpoints

Endpoint Method Description
/v1/chat/completions POST OpenAI-compatible proxy (primary integration)
/ingest POST Ingest content into genome: {content, content_type, metadata?}
/context POST Query genome for context: {query} (Continue format)
/stats GET Genome metrics, compression ratio, health
/health GET Server status, ribosome model, gene count

Continue IDE Integration

Add to ~/.continue/config.yaml:

models:
  - name: Helix (Local)
    provider: openai
    model: gemma4:e4b
    apiBase: http://127.0.0.1:11437/v1
    apiKey: EMPTY
    roles: [chat]
    defaultCompletionOptions:
      contextLength: 128000
      maxTokens: 4096

Use Chat mode (not Agent mode). Set contextLength high so Continue sends the full message; Helix handles compression downstream.

Python API

from helix_context import HelixContextManager, load_config

config = load_config()
helix = HelixContextManager(config)

# Ingest content
helix.ingest("Your document text here", content_type="text")
helix.ingest(open("src/main.py").read(), content_type="code")

# Build context for a query
window = helix.build_context("How does auth work?")
print(window.expressed_context)
print(window.context_health.status)  # "aligned" / "sparse" / "denatured"

# Learn from an exchange
helix.learn("How does auth work?", "JWT middleware validates tokens...")

# Export genome
from helix_context.hgt import export_genome
export_genome(helix.genome, "project.helix", description="Auth system knowledge")

ScoreRift Integration

Helix includes a bridge to ScoreRift for divergence-based context health monitoring:

from helix_context.integrations.scorerift import GenomeHealthProbe, cd_signal

# Probe genome health
probe = GenomeHealthProbe("http://127.0.0.1:11437")
report = probe.full_scan()

# Register as ScoreRift dimensions
from helix_context.integrations.scorerift import make_genome_dimensions
engine.register_many(make_genome_dimensions())

# Feed divergence resolutions back into the genome
from helix_context.integrations.scorerift import resolution_to_gene
resolution_to_gene("security", auto_score=0.85, manual_score=1.0,
                   resolution="False positives in auth module scanner rules")

Configuration

All config in helix.toml:

[ribosome]
model = "auto"              # auto-detect from Ollama
timeout = 10                # seconds before fallback
keep_alive = "30m"          # keep model loaded (eliminates swap latency)

[budget]
ribosome_tokens = 3000
expression_tokens = 6000
max_genes_per_turn = 8
splice_aggressiveness = 0.5  # 0=keep all, 1=ruthless trim

[genome]
path = "genome.db"
cold_start_threshold = 10   # genes needed before history stripping

[server]
port = 11437
upstream = "http://localhost:11434"

[synonyms]
cache = ["redis", "ttl", "invalidation", "cdn"]
auth = ["jwt", "login", "security", "token"]

Testing

# Mock tests only (no Ollama needed, ~8s)
pytest tests/ -m "not live"

# Live tests (requires Ollama)
pytest tests/ -m live -v -s

# Full suite
pytest tests/ -v

183 tests across 7 test files, 18 diverse fixtures (code, essays, poems, science).

Architecture

Module Role
schemas.py Gene, ContextWindow, ContextHealth, ChromatinState
codons.py CodonChunker (text/code splitting) + CodonEncoder (serialization)
genome.py SQLite genome with promoter-tag retrieval + co-activation
ribosome.py Small-model codec: pack, re_rank, splice, replicate
context_manager.py 6-step pipeline orchestrator + pending replication buffer
server.py FastAPI proxy + standalone endpoints
config.py TOML config loader with synonym map
hgt.py Genome export/import (Horizontal Gene Transfer)
integrations/scorerift.py CD spectroscope bridge to ScoreRift

Origin

Built as a standalone package extracted from BigEd CC. Implements the "Ribosome Hypothesis" for local LLM context management.

License

Apache 2.0

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

helix_context-0.1.0b2.tar.gz (26.6 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

helix_context-0.1.0b2-py3-none-any.whl (98.4 kB view details)

Uploaded Python 3

File details

Details for the file helix_context-0.1.0b2.tar.gz.

File metadata

  • Download URL: helix_context-0.1.0b2.tar.gz
  • Upload date:
  • Size: 26.6 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for helix_context-0.1.0b2.tar.gz
Algorithm Hash digest
SHA256 de9f3ec3522d1b4906b1c9f33e1ac92b94dc8d0d00a7198298912fc4fd6c9661
MD5 9e40eb10fb620d57480a9e0294042ed7
BLAKE2b-256 c111fa48600302f9e3391a521c967e90eb1b4783c0ca819a9bfe3a9c88636af7

See more details on using hashes here.

Provenance

The following attestation bundles were made for helix_context-0.1.0b2.tar.gz:

Publisher: publish.yml on SwiftWing21/helix-context

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file helix_context-0.1.0b2-py3-none-any.whl.

File metadata

  • Download URL: helix_context-0.1.0b2-py3-none-any.whl
  • Upload date:
  • Size: 98.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for helix_context-0.1.0b2-py3-none-any.whl
Algorithm Hash digest
SHA256 7d554aa92ed700a5f66f784c97eebf6936a9940d8fa59b58221512a3c5b9fd1e
MD5 ccc49b69eedbe43cc1cd7b0c6651f66b
BLAKE2b-256 291bb586d3cf28287a2b5ef02f243789322ecf34644498722e723bd2162872c8

See more details on using hashes here.

Provenance

The following attestation bundles were made for helix_context-0.1.0b2-py3-none-any.whl:

Publisher: publish.yml on SwiftWing21/helix-context

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page