Skip to main content

The Context Optimization Layer for LLM Applications - Cut costs by 50-90%

Project description

  ██╗  ██╗███████╗ █████╗ ██████╗ ██████╗  ██████╗  ██████╗ ███╗   ███╗
  ██║  ██║██╔════╝██╔══██╗██╔══██╗██╔══██╗██╔═══██╗██╔═══██╗████╗ ████║
  ███████║█████╗  ███████║██║  ██║██████╔╝██║   ██║██║   ██║██╔████╔██║
  ██╔══██║██╔══╝  ██╔══██║██║  ██║██╔══██╗██║   ██║██║   ██║██║╚██╔╝██║
  ██║  ██║███████╗██║  ██║██████╔╝██║  ██║╚██████╔╝╚██████╔╝██║ ╚═╝ ██║
  ╚═╝  ╚═╝╚══════╝╚═╝  ╚═╝╚═════╝ ╚═╝  ╚═╝ ╚═════╝  ╚═════╝ ╚═╝     ╚═╝
                  The context compression layer for AI agents

60–95% fewer tokens · library · proxy · MCP · 6 algorithms · local-first · reversible

CI codecov PyPI npm Model: Kompress-base Tokens saved: 60B+ License: Apache 2.0 Docs

Docs · Install · Proof · Agents · Discord · llms.txt

AI agents / LLMs: read /llms.txt here, or fetch the live index / full docs blob.


Headroom compresses everything your AI agent reads — tool outputs, logs, RAG chunks, files, and conversation history — before it reaches the LLM. Same answers, fraction of the tokens.

Headroom in action
Live: 10,144 → 1,260 tokens — same FATAL found.

What it does

  • Librarycompress(messages) in Python or TypeScript, inline in any app
  • Proxyheadroom proxy --port 8787, zero code changes, any language
  • Agent wrapheadroom wrap claude|codex|cursor|aider|copilot in one command
  • MCP serverheadroom_compress, headroom_retrieve, headroom_stats for any MCP client
  • Cross-agent memory — shared store across Claude, Codex, Gemini, auto-dedup
  • headroom learn — mines failed sessions, writes corrections to CLAUDE.md / AGENTS.md
  • Reversible (CCR) — originals never deleted; LLM retrieves on demand

How it works (30 seconds)

 Your agent / app
   (Claude Code, Cursor, Codex, LangChain, Agno, Strands, your own code…)
        │   prompts · tool outputs · logs · RAG results · files
        ▼
    ┌────────────────────────────────────────────────────┐
    │  Headroom   (runs locally — your data stays here)  │
    │  ───────────────────────────────────────────────   │
    │  CacheAligner  →  ContentRouter  →  CCR             │
    │                    ├─ SmartCrusher   (JSON)         │
    │                    ├─ CodeCompressor (AST)          │
    │                    └─ Kompress-base  (text, HF)     │
    │                                                     │
    │  Cross-agent memory  ·  headroom learn  ·  MCP      │
    └────────────────────────────────────────────────────┘
        │   compressed prompt  +  retrieval tool
        ▼
 LLM provider  (Anthropic · OpenAI · Bedrock · …)
  • ContentRouter — detects content type, selects the right compressor
  • SmartCrusher / CodeCompressor / Kompress-base — compress JSON, AST, or prose
  • CacheAligner — stabilizes prefixes so provider KV caches actually hit
  • CCR — stores originals locally; LLM calls headroom_retrieve if it needs them

Architecture · CCR reversible compression · Kompress-base model card

Get started (60 seconds)

# 1 — Install
pip install "headroom-ai[all]"          # Python
npm install headroom-ai                 # Node / TypeScript

# 2 — Pick your mode
headroom wrap claude                    # wrap a coding agent
headroom proxy --port 8787              # drop-in proxy, zero code changes
# or: from headroom import compress      # inline library

# 3 — See the savings
headroom stats

Granular extras: [proxy], [mcp], [ml], [agno], [langchain], [evals]. Requires Python 3.10+.

Proof

Savings on real agent workloads:

Workload Before After Savings
Code search (100 results) 17,765 1,408 92%
SRE incident debugging 65,694 5,118 92%
GitHub issue triage 54,174 14,761 73%
Codebase exploration 78,502 41,254 47%

Accuracy preserved on standard benchmarks:

Benchmark Category N Baseline Headroom Delta
GSM8K Math 100 0.870 0.870 ±0.000
TruthfulQA Factual 100 0.530 0.560 +0.030
SQuAD v2 QA 100 97% 19% compression
BFCL Tools 100 97% 32% compression

Reproduce: python -m headroom.evals suite --tier 1 · Full benchmarks & methodology

60B+ tokens saved — community leaderboard
60B+ tokens saved by the community — live leaderboard →

Agent compatibility matrix

Agent headroom wrap Notes
Claude Code --memory · --code-graph
Codex shares memory with Claude
Cursor prints config — paste once
Aider starts proxy + launches
Copilot CLI starts proxy + launches
OpenClaw installs as ContextEngine plugin

Any OpenAI-compatible client works via headroom proxy. MCP-native: headroom mcp install.

When to use · When to skip

Great fit if you…

  • run AI coding agents daily and want savings without changing your code
  • work across multiple agents and want shared memory
  • need reversible compression — originals always retrievable via CCR

Skip it if you…

  • only use a single provider's native compaction and don't need cross-agent memory
  • work in a sandboxed environment where local processes can't run
Integrations — drop Headroom into any stack
Your setup Hook in with
Any Python app compress(messages, model=…)
Any TypeScript app await compress(messages, { model })
Anthropic / OpenAI SDK withHeadroom(new Anthropic()) · withHeadroom(new OpenAI())
Vercel AI SDK wrapLanguageModel({ model, middleware: headroomMiddleware() })
LiteLLM litellm.callbacks = [HeadroomCallback()]
LangChain HeadroomChatModel(your_llm)
Agno HeadroomAgnoModel(your_model)
Strands Strands guide
ASGI apps app.add_middleware(CompressionMiddleware)
Multi-agent SharedContext().put / .get
MCP clients headroom mcp install
What's inside
  • SmartCrusher — universal JSON: arrays of dicts, nested objects, mixed types.
  • CodeCompressor — AST-aware for Python, JS, Go, Rust, Java, C++.
  • Kompress-base — our HuggingFace model, trained on agentic traces.
  • Image compression — 40–90% reduction via trained ML router.
  • CacheAligner — stabilizes prefixes so Anthropic/OpenAI KV caches actually hit.
  • IntelligentContext — score-based context fitting with learned importance.
  • CCR — reversible compression; LLM retrieves originals on demand.
  • Cross-agent memory — shared store, agent provenance, auto-dedup.
  • SharedContext — compressed context passing across multi-agent workflows.
  • headroom learn — plugin-based failure mining for Claude, Codex, Gemini.
Pipeline internals

Headroom exposes one stable request lifecycle across compress(), the SDK, and the proxy:

SetupPre-StartPost-StartInput ReceivedInput CachedInput RoutedInput CompressedInput RememberedPre-SendPost-SendResponse Received

  • Transforms do the work: CacheAligner, ContentRouter, SmartCrusher, CodeCompressor, Kompress-base, IntelligentContext / RollingWindow.
  • Pipeline extensions observe or customize lifecycle stages via on_pipeline_event(...).
  • Compression hooks sit alongside the canonical lifecycle as an additional extension seam.
  • Proxy extensions remain the server/app integration seam for ASGI middleware, routes, and startup policy.

Provider and tool-specific behavior lives under headroom/providers/ so core orchestration stays focused on lifecycle, sequencing, and policy.

  • CLI/tool slices: headroom/providers/claude, copilot, codex, openclaw
  • Provider runtime slices: headroom/providers/claude, gemini, plus shared backend/runtime dispatch in headroom/providers/registry.py
  • Core files stay orchestration-first: wrap.py, client.py, cli/proxy.py, and proxy/server.py delegate provider-specific env shaping, API target normalization, backend selection, and transport dispatch.

Install

pip install "headroom-ai[all]"          # Python, everything
npm install headroom-ai                 # TypeScript / Node
docker pull ghcr.io/chopratejas/headroom:latest

Granular extras: [proxy], [mcp], [ml] (Kompress-base), [agno], [langchain], [evals]. Requires Python 3.10+.

Using pipx? Choose a supported interpreter explicitly:

pipx install --python python3.13 "headroom-ai[all]"

Installation guide — Docker tags, persistent service, PowerShell, devcontainers.

headroom learn

headroom learn in action

headroom learn — mines failed sessions, writes corrections to CLAUDE.md / AGENTS.md / GEMINI.md.

Documentation

Start here Go deeper
Quickstart Architecture
Proxy How compression works
MCP tools CCR — reversible compression
Memory Cache optimization
Failure learning Benchmarks
Configuration Limitations

Compared to

Headroom runs locally, covers every content type, works with every major framework, and is reversible.

Scope Deploy Local Reversible
Headroom All context — tools, RAG, logs, files, history Proxy · library · middleware · MCP Yes Yes
RTK CLI command outputs CLI wrapper Yes No
lean-ctx CLI commands, MCP tools, editor rules CLI wrapper · MCP Yes No
Compresr, Token Co. Text sent to their API Hosted API call No No
OpenAI Compaction Conversation history Provider-native No No

Attribution. Headroom ships with the excellent RTK binary for shell-output rewriting — git show --short, scoped ls, summarized installers. Huge thanks to the RTK team; their tool is a first-class part of our stack, and Headroom compresses everything downstream of it. Headroom can also use lean-ctx as the selected CLI context tool; set HEADROOM_CONTEXT_TOOL=lean-ctx before running headroom wrap ....

Contributing

git clone https://github.com/chopratejas/headroom.git && cd headroom
pip install -e ".[dev]" && pytest

Devcontainers in .devcontainer/ (default + memory-stack with Qdrant & Neo4j). See CONTRIBUTING.md.

Community

License

Apache 2.0 — see LICENSE.

Project details


Release history Release notifications | RSS feed

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

headroom_ai-0.21.39-cp312-cp312-manylinux_2_28_aarch64.whl (18.1 MB view details)

Uploaded CPython 3.12manylinux: glibc 2.28+ ARM64

headroom_ai-0.21.39-cp312-cp312-macosx_11_0_arm64.whl (16.0 MB view details)

Uploaded CPython 3.12macOS 11.0+ ARM64

headroom_ai-0.21.39-cp311-cp311-manylinux_2_28_x86_64.whl (17.2 MB view details)

Uploaded CPython 3.11manylinux: glibc 2.28+ x86-64

headroom_ai-0.21.39-cp311-cp311-manylinux_2_28_aarch64.whl (18.1 MB view details)

Uploaded CPython 3.11manylinux: glibc 2.28+ ARM64

headroom_ai-0.21.39-cp311-cp311-macosx_11_0_arm64.whl (16.0 MB view details)

Uploaded CPython 3.11macOS 11.0+ ARM64

headroom_ai-0.21.39-cp310-cp310-manylinux_2_28_x86_64.whl (17.2 MB view details)

Uploaded CPython 3.10manylinux: glibc 2.28+ x86-64

headroom_ai-0.21.39-cp310-cp310-manylinux_2_28_aarch64.whl (18.1 MB view details)

Uploaded CPython 3.10manylinux: glibc 2.28+ ARM64

headroom_ai-0.21.39-cp310-cp310-macosx_11_0_arm64.whl (16.0 MB view details)

Uploaded CPython 3.10macOS 11.0+ ARM64

File details

Details for the file headroom_ai-0.21.39-cp312-cp312-manylinux_2_28_aarch64.whl.

File metadata

File hashes

Hashes for headroom_ai-0.21.39-cp312-cp312-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 cdf861bce669cb41996694069d21f4a42bb2c41c26c170535d4fa07d9de561d2
MD5 de92a5e0388b6688d9a512e581a9a5c5
BLAKE2b-256 2458c0986f4dd6c9eafcad290caa4fe7dc6372ba5dc3ecdfe4e7fa9c94fb38ca

See more details on using hashes here.

Provenance

The following attestation bundles were made for headroom_ai-0.21.39-cp312-cp312-manylinux_2_28_aarch64.whl:

Publisher: release.yml on chopratejas/headroom

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file headroom_ai-0.21.39-cp312-cp312-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for headroom_ai-0.21.39-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 1b55875f63e27f059f65365a927d388ff6f56dd32f471d9f1b328011098f2aea
MD5 5548032de48530fb7ba9b75c16b0068b
BLAKE2b-256 cdfe9874c65b5f66f1dbc8cdf5ca0977e3f25e6e3ccbee091bd0849dad6bd041

See more details on using hashes here.

Provenance

The following attestation bundles were made for headroom_ai-0.21.39-cp312-cp312-macosx_11_0_arm64.whl:

Publisher: release.yml on chopratejas/headroom

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file headroom_ai-0.21.39-cp311-cp311-manylinux_2_28_x86_64.whl.

File metadata

File hashes

Hashes for headroom_ai-0.21.39-cp311-cp311-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 11e94b799e65ec3b636afefb3e869a656c472f6cd4460b769b3fc0be7beeec5e
MD5 614f271bc19b2461535c37078d905882
BLAKE2b-256 3802255fa8ad617ac04ee4598f2d65232b87a8c44b53188d389e4dfdc829e9a7

See more details on using hashes here.

Provenance

The following attestation bundles were made for headroom_ai-0.21.39-cp311-cp311-manylinux_2_28_x86_64.whl:

Publisher: release.yml on chopratejas/headroom

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file headroom_ai-0.21.39-cp311-cp311-manylinux_2_28_aarch64.whl.

File metadata

File hashes

Hashes for headroom_ai-0.21.39-cp311-cp311-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 6aa5b110c614efa9f44e74d7378577981c9b1e45c53713d8fd70b96872e90cd5
MD5 9d6894b81ccaf23555b46e216bf6341c
BLAKE2b-256 f97ad50761da342dc17c791ab769135d0c58e48c529f16f69c353663f98ca8a3

See more details on using hashes here.

Provenance

The following attestation bundles were made for headroom_ai-0.21.39-cp311-cp311-manylinux_2_28_aarch64.whl:

Publisher: release.yml on chopratejas/headroom

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file headroom_ai-0.21.39-cp311-cp311-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for headroom_ai-0.21.39-cp311-cp311-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 1771da2ce80ae012770e6fce397ae88eceb5b118fea3a5b7783411bb6c8798c6
MD5 08627abd98e6788916d51ce9b0421c13
BLAKE2b-256 b8188a2dc832ea9fcb9682bf4c2725c5b1d33630abbf516dbbd96cd8a1e2b8cc

See more details on using hashes here.

Provenance

The following attestation bundles were made for headroom_ai-0.21.39-cp311-cp311-macosx_11_0_arm64.whl:

Publisher: release.yml on chopratejas/headroom

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file headroom_ai-0.21.39-cp310-cp310-manylinux_2_28_x86_64.whl.

File metadata

File hashes

Hashes for headroom_ai-0.21.39-cp310-cp310-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 3df47906b52aea455c0ccda49d65eaec5128fc5d45231286952091d7e3ebd173
MD5 00c9b54dc84efea869c7ccd07a2dbbf2
BLAKE2b-256 5c9e269bd09d547b7c2b98c065c5ec6a8e262966f1bb72f7334873dc77bc0075

See more details on using hashes here.

Provenance

The following attestation bundles were made for headroom_ai-0.21.39-cp310-cp310-manylinux_2_28_x86_64.whl:

Publisher: release.yml on chopratejas/headroom

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file headroom_ai-0.21.39-cp310-cp310-manylinux_2_28_aarch64.whl.

File metadata

File hashes

Hashes for headroom_ai-0.21.39-cp310-cp310-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 a67e534c981c7508dddf9cd2665ccfcb54753cd652a57529074a9353857dedd3
MD5 aca6d9c7f8930a50d3354f9123af38e9
BLAKE2b-256 81acc7e2392b0bc398581562ec7bb295fb085ad94a5bbd68073fb9024dbea6e0

See more details on using hashes here.

Provenance

The following attestation bundles were made for headroom_ai-0.21.39-cp310-cp310-manylinux_2_28_aarch64.whl:

Publisher: release.yml on chopratejas/headroom

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file headroom_ai-0.21.39-cp310-cp310-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for headroom_ai-0.21.39-cp310-cp310-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 eccbe7a11c6034d25aef64e4459ca72ce7113ec1f51d57d355f9e048c9150790
MD5 177ec5b11ae627c5f1499107b0cba6ae
BLAKE2b-256 7ca2e432a702aba1fde48a487dc7cd373eb735d8b428fe6a34e05469e7190103

See more details on using hashes here.

Provenance

The following attestation bundles were made for headroom_ai-0.21.39-cp310-cp310-macosx_11_0_arm64.whl:

Publisher: release.yml on chopratejas/headroom

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page