# Headroom

**The Context Optimization Layer for LLM Applications**

Cut your LLM costs by 50-90% without losing accuracy.
## What It Does
Headroom is a smart compression proxy for LLM applications:
- Compresses tool outputs — 1000 search results → 15 items (keeps errors, anomalies, relevant items)
- Enables provider caching — Stabilizes prefixes so cache hits actually happen
- Manages context windows — Prevents token limit failures without breaking tool calls
- Reversible compression — LLM can retrieve original data if needed (CCR architecture)
Zero code changes required — point your existing tools at the proxy.
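Because the proxy speaks the standard OpenAI-compatible API, "zero code changes" amounts to redirecting the base URL. A minimal stdlib sketch of what a redirected request looks like (the model name and payload here are illustrative; the port and path come from the quickstart):

```python
import json
import os
import urllib.request

# Any OpenAI-compatible client only needs its base URL pointed at the
# proxy; the request body stays in the standard chat-completions format.
base_url = os.environ.get("OPENAI_BASE_URL", "http://localhost:8787/v1")
payload = {
    "model": "gpt-4o-mini",  # illustrative model name
    "messages": [{"role": "user", "content": "hi"}],
}
req = urllib.request.Request(
    f"{base_url}/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
print(req.full_url)
```

Actually sending the request (`urllib.request.urlopen(req)`) requires the proxy to be running; nothing else in your code changes.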
## 30-Second Quickstart

```bash
# Install
pip install "headroom-ai[proxy]"

# Start proxy
headroom proxy --port 8787

# Verify
curl http://localhost:8787/health
```
Use with your tools:

```bash
# Claude Code
ANTHROPIC_BASE_URL=http://localhost:8787 claude

# Cursor / Continue / any OpenAI client
OPENAI_BASE_URL=http://localhost:8787/v1 cursor

# Python scripts
export OPENAI_BASE_URL=http://localhost:8787/v1
python your_script.py
```

That's it. You're saving tokens.
## Verify It's Working

```bash
curl http://localhost:8787/stats
```

```json
{
  "tokens": {"saved": 12500, "savings_percent": 25.0},
  "cost": {"total_savings_usd": 0.04}
}
```
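The stats payload is plain JSON, so it is easy to consume programmatically. A small sketch parsing the sample response above (field names taken from that sample):

```python
import json

# Sample /stats response from above; in practice this would come from
# an HTTP GET against the running proxy.
stats = json.loads(
    '{"tokens": {"saved": 12500, "savings_percent": 25.0},'
    ' "cost": {"total_savings_usd": 0.04}}'
)
print(f"saved {stats['tokens']['saved']} tokens "
      f"({stats['tokens']['savings_percent']}%), "
      f"${stats['cost']['total_savings_usd']} so far")
```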
## Installation

```bash
pip install "headroom-ai[proxy]"   # Proxy server (recommended)
pip install headroom-ai            # SDK only
pip install "headroom-ai[all]"     # Everything
```

Requirements: Python 3.10+
## Features
| Feature | Description | Docs |
|---|---|---|
| SmartCrusher | Compresses JSON tool outputs statistically | Transforms |
| CacheAligner | Stabilizes prefixes for provider caching | Transforms |
| RollingWindow | Manages context limits without breaking tools | Transforms |
| CCR | Reversible compression with automatic retrieval | CCR Guide |
| Text Utilities | Opt-in compression for search/logs | Text Compression |
| LLMLingua-2 | ML-based 20x compression (opt-in) | LLMLingua |
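To give a feel for the SmartCrusher idea (this is an illustrative sketch, not Headroom's actual algorithm): a large tool-output list can be reduced by keeping every error, any statistical outliers, and a small head of ordinary items as a representative sample.

```python
from statistics import mean, stdev

def crush(items: list[dict], keep_head: int = 5) -> list[dict]:
    """Illustrative reduction of a large tool-output list.

    Keeps all error items, latency outliers (> mean + 2 sigma),
    and the first few ordinary items as a sample.
    """
    errors = [it for it in items if it.get("status") == "error"]
    ok = [it for it in items if it.get("status") != "error"]
    latencies = [it["latency_ms"] for it in ok]
    cutoff = mean(latencies) + 2 * stdev(latencies)
    outliers = [it for it in ok if it["latency_ms"] > cutoff]
    head = [it for it in ok if it not in outliers][:keep_head]
    return errors + outliers + head

items = [{"id": i, "status": "ok", "latency_ms": 100} for i in range(1000)]
items[42]["status"] = "error"
items[7]["latency_ms"] = 9000  # anomaly
print(len(crush(items)))  # 7: 1 error + 1 outlier + 5 head
```

The compressed list preserves exactly the items a model is likely to need (errors, anomalies) while discarding the repetitive bulk.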
## Providers

| Provider | Token Counting | Cache Optimization |
|---|---|---|
| OpenAI | tiktoken (exact) | Automatic prefix caching |
| Anthropic | Official API | cache_control blocks |
| | Official API | Context caching |
| Cohere | Official API | - |
| Mistral | Official tokenizer | - |
## Performance
| Scenario | Before | After | Savings |
|---|---|---|---|
| Search results (1000 items) | 45,000 tokens | 4,500 tokens | 90% |
| Log analysis (500 entries) | 22,000 tokens | 3,300 tokens | 85% |
| Long conversation (50 turns) | 80,000 tokens | 32,000 tokens | 60% |
Overhead: ~1-5ms per request.
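The savings column follows directly from the token counts; a one-liner reproduces it (numbers taken from the table above):

```python
def savings_percent(before: int, after: int) -> float:
    """Percent of tokens removed, as reported in the table above."""
    return round(100 * (before - after) / before, 1)

print(savings_percent(45_000, 4_500))   # 90.0 (search results)
print(savings_percent(22_000, 3_300))   # 85.0 (log analysis)
print(savings_percent(80_000, 32_000))  # 60.0 (long conversation)
```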
## Safety
- Never removes human content — User/assistant messages are never compressed
- Never breaks tool ordering — Tool calls and responses stay paired
- Parse failures are no-ops — Malformed content passes through unchanged
- Compression is reversible — LLM can retrieve original data via CCR
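The "never breaks tool ordering" guarantee can be stated as an invariant on the message list: every tool response must answer a call from the immediately preceding assistant message. A hypothetical checker for that invariant (message shape follows the OpenAI chat format; this is not Headroom's internal code):

```python
def tool_pairs_intact(messages: list[dict]) -> bool:
    """Check that every tool response answers a call issued by the
    preceding assistant message (OpenAI-style chat messages)."""
    pending: set = set()
    for msg in messages:
        if msg["role"] == "assistant":
            pending = {c["id"] for c in msg.get("tool_calls", [])}
        elif msg["role"] == "tool":
            if msg["tool_call_id"] not in pending:
                return False  # orphaned or reordered tool response
        else:
            pending = set()
    return True

msgs = [
    {"role": "user", "content": "search docs"},
    {"role": "assistant", "tool_calls": [{"id": "call_1"}]},
    {"role": "tool", "tool_call_id": "call_1", "content": "[...]"},
]
print(tool_pairs_intact(msgs))  # True
```

A compressor that dropped the assistant message but kept the tool response would fail this check, which is the failure mode the guarantee rules out.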
## Documentation
| Guide | Description |
|---|---|
| SDK Guide | Wrap your client for fine-grained control |
| Proxy Guide | Production deployment |
| Configuration | All configuration options |
| CCR Guide | Reversible compression architecture |
| Metrics | Monitoring and observability |
| Troubleshooting | Common issues |
| Architecture | How it works internally |
## Examples

See examples/ for runnable code:

- `basic_usage.py` — Simple SDK usage
- `proxy_integration.py` — Using with different clients
- `ccr_demo.py` — CCR architecture demonstration
## Contributing

```bash
git clone https://github.com/chopratejas/headroom.git
cd headroom
pip install -e ".[dev]"
pytest
```

See CONTRIBUTING.md for details.
## License

Apache License 2.0 — see LICENSE.

Built for the AI developer community.