Skip to main content

The Context Optimization Layer for LLM Applications - Cut costs by 50-90%

Project description

Headroom

Compress everything your AI agent reads. Same answers, fraction of the tokens.

CI codecov PyPI npm Model: Kompress-base Tokens saved: 60B+ License: Apache 2.0 Docs

Headroom in action

Every tool call, log line, DB read, RAG chunk, and file your agent injects into a prompt is mostly boilerplate. Headroom strips the noise and keeps the signal — losslessly, locally, and without touching accuracy.

100 logs. One FATAL error buried at position 67. Both runs found it. Baseline 10,144 tokens → Headroom 1,260 tokens87% fewer, identical answer. python examples/needle_in_haystack_test.py


Quick start

Works with Anthropic, OpenAI, Google, Bedrock, Vertex, Azure, OpenRouter, and 100+ models via LiteLLM.

Wrap your coding agent — one command:

pip install "headroom-ai[all]"

headroom wrap claude      # Claude Code
headroom wrap codex       # Codex
headroom wrap cursor      # Cursor
headroom wrap aider       # Aider
headroom wrap copilot     # GitHub Copilot CLI

Drop it into your own code — Python or TypeScript:

from headroom import compress

result = compress(messages, model="claude-sonnet-4-5")
response = client.messages.create(model="claude-sonnet-4-5", messages=result.messages)
print(f"Saved {result.tokens_saved} tokens ({result.compression_ratio:.0%})")
import { compress } from 'headroom-ai';
const result = await compress(messages, { model: 'gpt-4o' });

Or run it as a proxy — zero code changes, any language:

headroom proxy --port 8787
ANTHROPIC_BASE_URL=http://localhost:8787 your-app
OPENAI_BASE_URL=http://localhost:8787/v1 your-app

Why Headroom

  • Accuracy-preserving. GSM8K 0.870 → 0.870 (±0.000). TruthfulQA +0.030. SQuAD v2 and BFCL both 97% accuracy after compression. Validated on public OSS benchmarks you can rerun yourself.
  • Runs on your machine. No cloud API, no data egress. Compression latency is milliseconds — faster end-to-end for Sonnet / Opus / GPT-4 class models than a hosted service round-trip.
  • Kompress-base on HuggingFace. Our open-source text compressor, fine-tuned on real agentic traces — tool outputs, logs, RAG chunks, code. Install with pip install "headroom-ai[ml]".
  • Cross-agent memory and learning. Claude Code saves a fact, Codex reads it back. headroom learn mines failed sessions and writes corrections straight to CLAUDE.md / AGENTS.md / GEMINI.md — reliability compounds over time.
  • Reversible (CCR). Compression is not deletion. The model can always call headroom_retrieve to pull the original bytes. Nothing is thrown away.

Bundles the RTK binary for shell-output rewriting — full attribution below.


How it fits

 Your agent / app
   (Claude Code, Cursor, Codex, LangChain, Agno, Strands, your own code…)
        │   prompts · tool outputs · logs · RAG results · files
        ▼
    ┌────────────────────────────────────────────────────┐
    │  Headroom   (runs locally — your data stays here)  │
    │  ───────────────────────────────────────────────   │
    │  CacheAligner  →  ContentRouter  →  CCR             │
    │                    ├─ SmartCrusher   (JSON)         │
    │                    ├─ CodeCompressor (AST)          │
    │                    └─ Kompress-base  (text, HF)     │
    │                                                     │
    │  Cross-agent memory  ·  headroom learn  ·  MCP      │
    └────────────────────────────────────────────────────┘
        │   compressed prompt  +  retrieval tool
        ▼
 LLM provider  (Anthropic · OpenAI · Bedrock · …)

Architecture · CCR reversible compression · Kompress-base model card

Canonical pipeline lifecycle

Headroom now exposes one stable request lifecycle across compress(), the SDK, and the proxy:

SetupPre-StartPost-StartInput ReceivedInput CachedInput RoutedInput CompressedInput RememberedPre-SendPost-SendResponse Received

  • Transforms still do the work: CacheAligner, ContentRouter, SmartCrusher, CodeCompressor, Kompress-base, IntelligentContext / RollingWindow.
  • Pipeline extensions observe or customize those lifecycle stages via on_pipeline_event(...).
  • Compression hooks still work and now sit alongside the canonical lifecycle instead of being the only extension seam.
  • Proxy extensions remain the server/app integration seam for ASGI middleware, routes, and startup policy.

Provider slices

Provider and tool-specific behavior is being moved behind dedicated modules under headroom/providers/ so core orchestration stays focused on lifecycle, sequencing, and policy.

  • CLI/tool slices: headroom/providers/claude, copilot, codex, openclaw
  • Provider runtime slices: headroom/providers/claude, gemini, plus shared backend/runtime dispatch in headroom/providers/registry.py
  • Core files stay orchestration-first: wrap.py, client.py, cli/proxy.py, and proxy/server.py now delegate provider-specific env shaping, API target normalization, backend selection, and transport dispatch instead of inlining those rules.

Proof

Savings on real agent workloads:

Workload Before After Savings
Code search (100 results) 17,765 1,408 92%
SRE incident debugging 65,694 5,118 92%
GitHub issue triage 54,174 14,761 73%
Codebase exploration 78,502 41,254 47%

Accuracy preserved on standard benchmarks:

Benchmark Category N Baseline Headroom Delta
GSM8K Math 100 0.870 0.870 ±0.000
TruthfulQA Factual 100 0.530 0.560 +0.030
SQuAD v2 QA 100 97% 19% compression
BFCL Tools 100 97% 32% compression

Reproduce:

python -m headroom.evals suite --tier 1

Community, live:

Full benchmarks & methodology


Built for coding agents

Agent One-command wrap Notes
Claude Code headroom wrap claude --memory for cross-agent memory, --code-graph for codebase intel
Codex headroom wrap codex --memory Shares the same memory store as Claude
Cursor headroom wrap cursor Prints Cursor config — paste once, done
Aider headroom wrap aider Starts proxy, launches Aider
Copilot CLI headroom wrap copilot Starts proxy, launches Copilot
OpenClaw headroom wrap openclaw Installs Headroom as ContextEngine plugin

MCP-native too — headroom mcp install exposes headroom_compress, headroom_retrieve, and headroom_stats to any MCP client.

headroom learn in action

Integrations

Drop Headroom into any stack
Your setup Hook in with
Any Python app compress(messages, model=…)
Any TypeScript app await compress(messages, { model })
Anthropic / OpenAI SDK withHeadroom(new Anthropic()) · withHeadroom(new OpenAI())
Vercel AI SDK wrapLanguageModel({ model, middleware: headroomMiddleware() })
LiteLLM litellm.callbacks = [HeadroomCallback()]
LangChain HeadroomChatModel(your_llm)
Agno HeadroomAgnoModel(your_model)
Strands Strands guide
ASGI apps app.add_middleware(CompressionMiddleware)
Multi-agent SharedContext().put / .get
MCP clients headroom mcp install
What's inside
  • SmartCrusher — universal JSON: arrays of dicts, nested objects, mixed types.
  • CodeCompressor — AST-aware for Python, JS, Go, Rust, Java, C++.
  • Kompress-base — our HuggingFace model, trained on agentic traces.
  • Image compression — 40–90% reduction via trained ML router.
  • CacheAligner — stabilizes prefixes so Anthropic/OpenAI KV caches actually hit.
  • IntelligentContext — score-based context fitting with learned importance.
  • CCR — reversible compression; LLM retrieves originals on demand.
  • Cross-agent memory — shared store, agent provenance, auto-dedup.
  • SharedContext — compressed context passing across multi-agent workflows.
  • headroom learn — plugin-based failure mining for Claude, Codex, Gemini.

Install

pip install "headroom-ai[all]"          # Python, everything
npm  install headroom-ai                # TypeScript / Node
docker pull ghcr.io/chopratejas/headroom:latest

Granular extras: [proxy], [mcp], [ml] (Kompress-base), [agno], [langchain], [evals]. Requires Python 3.10+.

Installation guide — Docker tags, persistent service, PowerShell, devcontainers.


Documentation

Start here Go deeper
Quickstart Architecture
Proxy How compression works
MCP tools CCR — reversible compression
Memory Cache optimization
Failure learning Benchmarks
Configuration Limitations

Compared to

Headroom runs locally, covers every content type (not just CLI or text), works with every major framework, and is reversible.

Scope Deploy Local Reversible
Headroom All context — tools, RAG, logs, files, history Proxy · library · middleware · MCP Yes Yes
RTK CLI command outputs CLI wrapper Yes No
Compresr, Token Co. Text sent to their API Hosted API call No No
OpenAI Compaction Conversation history Provider-native No No

Attribution. Headroom ships with the excellent RTK binary for shell-output rewriting — git showgit show --short, noisy ls → scoped, chatty installers → summarized. Huge thanks to the RTK team; their tool is a first-class part of our stack, and Headroom compresses everything downstream of it.


Contributing

git clone https://github.com/chopratejas/headroom.git && cd headroom
pip install -e ".[dev]" && pytest

Devcontainers in .devcontainer/ (default + memory-stack with Qdrant & Neo4j). See CONTRIBUTING.md.


Community

License

Apache 2.0 — see LICENSE.

Project details


Release history Release notifications | RSS feed

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

headroom_ai-0.20.26-cp313-cp313-manylinux_2_28_x86_64.whl (15.0 MB view details)

Uploaded CPython 3.13manylinux: glibc 2.28+ x86-64

headroom_ai-0.20.26-cp313-cp313-manylinux_2_28_aarch64.whl (16.1 MB view details)

Uploaded CPython 3.13manylinux: glibc 2.28+ ARM64

headroom_ai-0.20.26-cp313-cp313-macosx_11_0_arm64.whl (13.8 MB view details)

Uploaded CPython 3.13macOS 11.0+ ARM64

headroom_ai-0.20.26-cp312-cp312-manylinux_2_28_x86_64.whl (15.0 MB view details)

Uploaded CPython 3.12manylinux: glibc 2.28+ x86-64

headroom_ai-0.20.26-cp312-cp312-manylinux_2_28_aarch64.whl (16.1 MB view details)

Uploaded CPython 3.12manylinux: glibc 2.28+ ARM64

headroom_ai-0.20.26-cp312-cp312-macosx_11_0_arm64.whl (13.8 MB view details)

Uploaded CPython 3.12macOS 11.0+ ARM64

headroom_ai-0.20.26-cp311-cp311-manylinux_2_28_x86_64.whl (15.0 MB view details)

Uploaded CPython 3.11manylinux: glibc 2.28+ x86-64

headroom_ai-0.20.26-cp311-cp311-manylinux_2_28_aarch64.whl (16.1 MB view details)

Uploaded CPython 3.11manylinux: glibc 2.28+ ARM64

headroom_ai-0.20.26-cp311-cp311-macosx_11_0_arm64.whl (13.8 MB view details)

Uploaded CPython 3.11macOS 11.0+ ARM64

headroom_ai-0.20.26-cp310-cp310-manylinux_2_28_x86_64.whl (15.0 MB view details)

Uploaded CPython 3.10manylinux: glibc 2.28+ x86-64

headroom_ai-0.20.26-cp310-cp310-manylinux_2_28_aarch64.whl (16.1 MB view details)

Uploaded CPython 3.10manylinux: glibc 2.28+ ARM64

headroom_ai-0.20.26-cp310-cp310-macosx_11_0_arm64.whl (13.8 MB view details)

Uploaded CPython 3.10macOS 11.0+ ARM64

File details

Details for the file headroom_ai-0.20.26-cp313-cp313-manylinux_2_28_x86_64.whl.

File metadata

File hashes

Hashes for headroom_ai-0.20.26-cp313-cp313-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 6b32c59536dfb5cfa8451fbdb5e0d1988cf3271748b2481b4ac0ced52bbffd14
MD5 69960de19e87c04c77d82cbf3587bdaa
BLAKE2b-256 9876191937d367d84615cfe5fe72efe02c4464670228db157a3ff57a96c0f069

See more details on using hashes here.

Provenance

The following attestation bundles were made for headroom_ai-0.20.26-cp313-cp313-manylinux_2_28_x86_64.whl:

Publisher: release.yml on chopratejas/headroom

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file headroom_ai-0.20.26-cp313-cp313-manylinux_2_28_aarch64.whl.

File metadata

File hashes

Hashes for headroom_ai-0.20.26-cp313-cp313-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 5824a67bd011320bdd6cdfb5bf80359f281d5ec452489ac789e8dad42ff2ae54
MD5 258844262416e78c0ef4faaf29656eb1
BLAKE2b-256 0b9171cbac338aa55ea42714b1ad1fa6dbdd9c2ba5836be24d48e0dcd56e96a7

See more details on using hashes here.

Provenance

The following attestation bundles were made for headroom_ai-0.20.26-cp313-cp313-manylinux_2_28_aarch64.whl:

Publisher: release.yml on chopratejas/headroom

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file headroom_ai-0.20.26-cp313-cp313-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for headroom_ai-0.20.26-cp313-cp313-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 ffaa578a2fe23a33ac2896371dbdb3f0956ba3305ab077fe4789f6d37d5578b3
MD5 31e7b43bb1fd070164ad4ad4e2b6b22b
BLAKE2b-256 a156961aaa1326a8e499b34e4a57e4114ac8acc8e4746cfd6d5b71e9ec01d0f8

See more details on using hashes here.

Provenance

The following attestation bundles were made for headroom_ai-0.20.26-cp313-cp313-macosx_11_0_arm64.whl:

Publisher: release.yml on chopratejas/headroom

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file headroom_ai-0.20.26-cp312-cp312-manylinux_2_28_x86_64.whl.

File metadata

File hashes

Hashes for headroom_ai-0.20.26-cp312-cp312-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 3bb35a48264f8b3f3b666eb37db171e8f8fe5721ba7c39eed54f0e2424e37773
MD5 3cb7b03e4ea75d04bb9168a83db1a97f
BLAKE2b-256 a22bf65b5bc3cbbc4c06ae1731471c8a1931860fd978db23128826578c35dbc2

See more details on using hashes here.

Provenance

The following attestation bundles were made for headroom_ai-0.20.26-cp312-cp312-manylinux_2_28_x86_64.whl:

Publisher: release.yml on chopratejas/headroom

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file headroom_ai-0.20.26-cp312-cp312-manylinux_2_28_aarch64.whl.

File metadata

File hashes

Hashes for headroom_ai-0.20.26-cp312-cp312-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 5065fa6fe66e3ef9dbb53d56046a63cbc5e3c01814550c2898e2767cf349de58
MD5 1ea4af6b87aee8e0d988267b2b009038
BLAKE2b-256 85e357e65c7aee02e10f2fe3d7370293bd0cfb94aafc6cfb6dc1a57028e42554

See more details on using hashes here.

Provenance

The following attestation bundles were made for headroom_ai-0.20.26-cp312-cp312-manylinux_2_28_aarch64.whl:

Publisher: release.yml on chopratejas/headroom

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file headroom_ai-0.20.26-cp312-cp312-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for headroom_ai-0.20.26-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 c662cc3bdf75510004eb5aab433e3e86a65da8210ce18e17687cea1921b1f450
MD5 ba1d847e07a771641de08145c9beebae
BLAKE2b-256 e480eefea9f13c9bf4273cc83dfde2f9528f1e2b2d3e50c5c39f1c5b5ad77648

See more details on using hashes here.

Provenance

The following attestation bundles were made for headroom_ai-0.20.26-cp312-cp312-macosx_11_0_arm64.whl:

Publisher: release.yml on chopratejas/headroom

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file headroom_ai-0.20.26-cp311-cp311-manylinux_2_28_x86_64.whl.

File metadata

File hashes

Hashes for headroom_ai-0.20.26-cp311-cp311-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 4d6e3485a802e3e8c616096b8943fc43de9ea3fee9a896d0ce3fbc74a1e55a55
MD5 9597482e47c7c54cf69902140753dda7
BLAKE2b-256 3e4099a594c05da82b2b429e47b88aeff34ca0ab2d374efaa8598072c19a39be

See more details on using hashes here.

Provenance

The following attestation bundles were made for headroom_ai-0.20.26-cp311-cp311-manylinux_2_28_x86_64.whl:

Publisher: release.yml on chopratejas/headroom

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file headroom_ai-0.20.26-cp311-cp311-manylinux_2_28_aarch64.whl.

File metadata

File hashes

Hashes for headroom_ai-0.20.26-cp311-cp311-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 8a7f492f391c64c91025e93f19240d0552b994772f8cb9838e68f23702db3fc4
MD5 d8178cd21462518b3e2e6ee68a9c8027
BLAKE2b-256 d15d189a2d8a41308f87f642db45a1a8c916799cc5bd917945c7bf04953326a9

See more details on using hashes here.

Provenance

The following attestation bundles were made for headroom_ai-0.20.26-cp311-cp311-manylinux_2_28_aarch64.whl:

Publisher: release.yml on chopratejas/headroom

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file headroom_ai-0.20.26-cp311-cp311-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for headroom_ai-0.20.26-cp311-cp311-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 ffaf12b8de245ed2426b427701c8167c891665eca6954bb92683867687a0d54c
MD5 9ba34bae466f41fbeace9b09f46ca022
BLAKE2b-256 b14ea810f5754151a91dc0ab50b70c201dbc22553905f1a2de09b31fd23a6a32

See more details on using hashes here.

Provenance

The following attestation bundles were made for headroom_ai-0.20.26-cp311-cp311-macosx_11_0_arm64.whl:

Publisher: release.yml on chopratejas/headroom

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file headroom_ai-0.20.26-cp310-cp310-manylinux_2_28_x86_64.whl.

File metadata

File hashes

Hashes for headroom_ai-0.20.26-cp310-cp310-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 250de1ca865317db4610ceaf284d7840bc54c9a88e04c031de0227eabafcafe4
MD5 d1cc22910134ceb80f14f8dc8183542c
BLAKE2b-256 794aebf4b1f0b837760430bb81aa09d6a62daa548b753f5a2678553c4a125f88

See more details on using hashes here.

Provenance

The following attestation bundles were made for headroom_ai-0.20.26-cp310-cp310-manylinux_2_28_x86_64.whl:

Publisher: release.yml on chopratejas/headroom

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file headroom_ai-0.20.26-cp310-cp310-manylinux_2_28_aarch64.whl.

File metadata

File hashes

Hashes for headroom_ai-0.20.26-cp310-cp310-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 8ae3d8d94ffea4c03fee0a6eb1e60019d12b9e04ffdbc7ddae5cfe7c2d3c3214
MD5 cbb8d38335961a4b5389fd01767582f2
BLAKE2b-256 30d56e684c9d69d79c9d0dd3ad4b1d1765d762c931d67403eeb351aa94ca779b

See more details on using hashes here.

Provenance

The following attestation bundles were made for headroom_ai-0.20.26-cp310-cp310-manylinux_2_28_aarch64.whl:

Publisher: release.yml on chopratejas/headroom

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file headroom_ai-0.20.26-cp310-cp310-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for headroom_ai-0.20.26-cp310-cp310-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 7671d869cb42ad5bd0456e2068f2fae1a991b63a887922f25e5d412255e2aabb
MD5 5eaa9d34ed3b9a5ebbc8bbb9e4b81048
BLAKE2b-256 7cda1916f1df2b99698fb7109c053721c78f385febf21d552b31bd5e4caf666a

See more details on using hashes here.

Provenance

The following attestation bundles were made for headroom_ai-0.20.26-cp310-cp310-macosx_11_0_arm64.whl:

Publisher: release.yml on chopratejas/headroom

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page