
Headroom

The Context Optimization Layer for LLM Applications

Cut your LLM costs by 50-90% without losing accuracy



What It Does

Headroom is a smart compression proxy for LLM applications:

  • Compresses tool outputs — 1000 search results → 15 items (keeps errors, anomalies, relevant items)
  • Enables provider caching — Stabilizes prefixes so cache hits actually happen
  • Manages context windows — Prevents token limit failures without breaking tool calls
  • Reversible compression — LLM can retrieve original data if needed (CCR architecture)

Zero code changes required — point your existing tools at the proxy.
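To make the first bullet concrete, here is a toy sketch of statistical tool-output compression: always keep errors, keep score outliers, then fill the remaining budget with a head sample. This is an illustration of the idea only, not Headroom's actual SmartCrusher algorithm, and the `status`/`score` field names are hypothetical:

```python
import statistics

def crush(items, keep=15):
    """Toy sketch: keep errors and score outliers first, then fill
    the budget with a head sample. (Not Headroom's real algorithm.)"""
    errors = [i for i in items if i.get("status") == "error"]
    scores = [i["score"] for i in items if "score" in i]
    mean, stdev = statistics.mean(scores), statistics.pstdev(scores)
    outliers = [i for i in items
                if "score" in i and abs(i["score"] - mean) > 2 * stdev]
    kept, seen = [], set()
    for i in errors + outliers + items:
        if id(i) not in seen:
            seen.add(id(i))
            kept.append(i)
        if len(kept) == keep:
            break
    return kept

items = [{"status": "ok", "score": 0.5} for _ in range(997)]
items += [{"status": "error", "score": 0.5},
          {"status": "ok", "score": 9.9},
          {"status": "ok", "score": 0.5}]
out = crush(items)
print(len(out))  # 15: 1000 results squeezed to a budget of 15
```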


30-Second Quickstart

# Install
pip install "headroom-ai[proxy]"

# Start proxy
headroom proxy --port 8787

# Verify
curl http://localhost:8787/health

Use with your tools:

# Claude Code
ANTHROPIC_BASE_URL=http://localhost:8787 claude

# Cursor / Continue / any OpenAI client
OPENAI_BASE_URL=http://localhost:8787/v1 cursor

# Python scripts
export OPENAI_BASE_URL=http://localhost:8787/v1
python your_script.py

That's it. You're saving tokens.
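Since the proxy exposes an OpenAI-compatible endpoint under `/v1`, a scripted request looks like any other chat completion call. A stdlib sketch for illustration (the model name, prompt, and API key are placeholders; the commented-out `urlopen` call needs the proxy running):

```python
import json
import urllib.request

# Point the request at the proxy instead of the provider directly.
PROXY = "http://localhost:8787/v1"

payload = {
    "model": "gpt-4o-mini",  # placeholder model name
    "messages": [{"role": "user", "content": "Hello"}],
}
req = urllib.request.Request(
    f"{PROXY}/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json",
             "Authorization": "Bearer YOUR_API_KEY"},
)
# urllib.request.urlopen(req)  # requires the proxy to be running
print(req.full_url)  # http://localhost:8787/v1/chat/completions
```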


Verify It's Working

curl http://localhost:8787/stats

# Example response:
{
  "tokens": {"saved": 12500, "savings_percent": 25.0},
  "cost": {"total_savings_usd": 0.04}
}

Installation

pip install "headroom-ai[proxy]"     # Proxy server (recommended)
pip install headroom-ai              # SDK only
pip install "headroom-ai[all]"       # Everything

Requirements: Python 3.10+


Features

| Feature        | Description                                    | Docs             |
|----------------|------------------------------------------------|------------------|
| SmartCrusher   | Compresses JSON tool outputs statistically     | Transforms       |
| CacheAligner   | Stabilizes prefixes for provider caching       | Transforms       |
| RollingWindow  | Manages context limits without breaking tools  | Transforms       |
| CCR            | Reversible compression with automatic retrieval | CCR Guide       |
| Text Utilities | Opt-in compression for search/logs             | Text Compression |
| LLMLingua-2    | ML-based 20x compression (opt-in)              | LLMLingua        |
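The CacheAligner row deserves a word: provider prompt caches only hit when the request prefix is byte-identical across calls, so the stable parts (system prompt, tool schema) must stay unchanged at the front. A toy sketch of that invariant, assuming a hypothetical `prefix_key` helper; this is not Headroom's implementation:

```python
import hashlib
import json

def prefix_key(messages):
    """Hash the stable prefix (here: system messages) that a provider's
    prompt cache would match on. Toy illustration of the idea only."""
    prefix = [m for m in messages if m["role"] == "system"]
    blob = json.dumps(prefix, sort_keys=True).encode()
    return hashlib.sha256(blob).hexdigest()

turn1 = [{"role": "system", "content": "You are helpful."},
         {"role": "user", "content": "First question"}]
turn2 = [{"role": "system", "content": "You are helpful."},
         {"role": "user", "content": "Second question"}]

# Identical stable prefix across turns -> the cache can hit.
print(prefix_key(turn1) == prefix_key(turn2))  # True
```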

Providers

| Provider  | Token Counting     | Cache Optimization       |
|-----------|--------------------|--------------------------|
| OpenAI    | tiktoken (exact)   | Automatic prefix caching |
| Anthropic | Official API       | cache_control blocks     |
| Google    | Official API       | Context caching          |
| Cohere    | Official API       | -                        |
| Mistral   | Official tokenizer | -                        |

Performance

| Scenario                     | Before        | After         | Savings |
|------------------------------|---------------|---------------|---------|
| Search results (1000 items)  | 45,000 tokens | 4,500 tokens  | 90%     |
| Log analysis (500 entries)   | 22,000 tokens | 3,300 tokens  | 85%     |
| Long conversation (50 turns) | 80,000 tokens | 32,000 tokens | 60%     |

Overhead: ~1-5ms per request.
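The savings column follows directly from the before/after token counts:

```python
def savings_percent(before, after):
    """Percentage of tokens removed relative to the original count."""
    return round(100 * (before - after) / before, 1)

print(savings_percent(45_000, 4_500))   # 90.0
print(savings_percent(22_000, 3_300))   # 85.0
print(savings_percent(80_000, 32_000))  # 60.0
```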


Safety

  • Never removes human content — User/assistant messages are never compressed
  • Never breaks tool ordering — Tool calls and responses stay paired
  • Parse failures are no-ops — Malformed content passes through unchanged
  • Compression is reversible — LLM can retrieve original data via CCR
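The "parse failures are no-ops" guarantee can be sketched as a simple guard: if the tool output is not valid JSON, return it untouched rather than risk corrupting it. A toy illustration of the rule, not Headroom's code; `shrink` stands in for any compression step:

```python
import json

def compress_or_passthrough(raw, compress):
    """If raw is not valid JSON, pass it through unchanged;
    otherwise compress the parsed structure. (Toy sketch.)"""
    try:
        data = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return raw  # malformed content passes through unchanged
    return json.dumps(compress(data))

shrink = lambda d: d[:2] if isinstance(d, list) else d
print(compress_or_passthrough("[1, 2, 3, 4]", shrink))  # [1, 2]
print(compress_or_passthrough("not json {", shrink))    # not json {
```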

Documentation

| Guide           | Description                               |
|-----------------|-------------------------------------------|
| SDK Guide       | Wrap your client for fine-grained control |
| Proxy Guide     | Production deployment                     |
| Configuration   | All configuration options                 |
| CCR Guide       | Reversible compression architecture       |
| Metrics         | Monitoring and observability              |
| Troubleshooting | Common issues                             |
| Architecture    | How it works internally                   |

Examples

See examples/ for runnable code:

  • basic_usage.py — Simple SDK usage
  • proxy_integration.py — Using with different clients
  • ccr_demo.py — CCR architecture demonstration

Contributing

git clone https://github.com/chopratejas/headroom.git
cd headroom
pip install -e ".[dev]"
pytest

See CONTRIBUTING.md for details.


License

Apache License 2.0 — see LICENSE.


Built for the AI developer community
