Kompact

Multi-layer context optimization proxy for LLM agents

Requires Python 3.10+.

Context compression proxy for LLM agents. Sits between your agent and the LLM provider, compresses context on the fly, and cuts your token bill 40-70% — with zero code changes.

Save real money

For a team running 1,000 agentic requests/day with ~10K token contexts:

| Model | Without Kompact | With Kompact | Monthly savings |
|---|---|---|---|
| Sonnet ($3/M) | $900/mo | $405/mo | $495/mo |
| Opus ($15/M) | $4,500/mo | $2,025/mo | $2,475/mo |
| GPT-4o ($2.50/M) | $750/mo | $338/mo | $412/mo |

Savings scale linearly. 10K requests/day = 10x the numbers above.
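The table's numbers follow from simple arithmetic. As a sketch (assuming a ~55% compression ratio, in line with the BFCL figure reported below, so you pay for ~45% of your input tokens):

```python
def monthly_cost(price_per_million: float, requests_per_day: int = 1_000,
                 tokens_per_request: int = 10_000, days: int = 30,
                 compression: float = 0.0) -> float:
    """Monthly input-token cost in dollars, before or after compression."""
    tokens = requests_per_day * tokens_per_request * days  # 300M tokens/month
    return tokens / 1_000_000 * price_per_million * (1 - compression)

for name, price in [("Sonnet", 3.0), ("Opus", 15.0), ("GPT-4o", 2.50)]:
    before = monthly_cost(price)
    after = monthly_cost(price, compression=0.55)
    print(f"{name}: ${before:,.0f}/mo -> ${after:,.0f}/mo (saves ${before - after:,.0f})")
```

At 10K requests/day, pass `requests_per_day=10_000` and every figure scales by 10x.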

Get started in 30 seconds

pip install kompact   # or: uv add kompact
kompact proxy --port 7878
export ANTHROPIC_BASE_URL=http://localhost:7878
# That's it. Your agent now uses fewer tokens.

No SDK changes. No prompt rewriting. Just point your base URL at the proxy.
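A minimal sketch of what "point your base URL at the proxy" means: the SDK (or any HTTP client) resolves its endpoint from `ANTHROPIC_BASE_URL`, so requests flow through localhost:7878 instead of going straight to the provider. The `/v1/messages` path is the standard Anthropic Messages API shape; the model id is just an example, and nothing is actually sent here.

```python
import json
import os
import urllib.request

# With the proxy running, this env var redirects all API traffic through it.
os.environ["ANTHROPIC_BASE_URL"] = "http://localhost:7878"

base = os.environ.get("ANTHROPIC_BASE_URL", "https://api.anthropic.com")
req = urllib.request.Request(
    f"{base}/v1/messages",
    data=json.dumps({
        "model": "claude-sonnet-4-20250514",  # example model id
        "max_tokens": 256,
        "messages": [{"role": "user", "content": "hello"}],
    }).encode(),
    headers={"content-type": "application/json"},
    method="POST",
)
print(req.full_url)  # http://localhost:7878/v1/messages
```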

Quality stays intact

Evaluated on BFCL (1,431 real API schemas) — the standard benchmark for tool-calling agents. End-to-end through Claude, scored with context-bench.

Quality impact vs no compression (closer to 0% = better):

| Model | Kompact | Headroom | LLMLingua-2 |
|---|---|---|---|
| Haiku | -2.6% | -3.0% | -23.4% |
| Sonnet | -3.9% | -3.5% | -20.6% |
| Opus | -0.5% | -0.5% | -27.3% |

Kompact and Headroom both stay within ~3% of baseline. LLMLingua-2 destroys tool schemas regardless of model (-20 to -27%).

Compression across content types

Measured offline on 12,795 examples across 3 datasets:

| Dataset | Examples | Kompact | Headroom | LLMLingua-2 |
|---|---|---|---|---|
| BFCL (tool schemas) | 1,431 | 55.3% | ~0% | 55.4% |
| Glaive (tool calling) | 3,959 | 56.6% | ~0% | ~50% |
| HotpotQA (prose QA) | 7,405 | 17.9% | ~0% | 49.9% |

Headroom's SmartCrusher doesn't compress JSON — it's designed for prose. LLMLingua-2 compresses aggressively but destroys information (see quality table above).

How it works

Kompact is a transparent HTTP proxy. It intercepts LLM API requests, compresses the context, then forwards to the provider.

        ┌──────────────────────────────────────────────┐
        │           Kompact Proxy (:7878)              │
        │                                              │
Agent ─>│  1. Schema Optimizer    (TF-IDF selection)   │─> LLM Provider
        │  2. Content Compressors (TOON, JSON, code)   │
        │  3. Extractive Compress (TF-IDF sentences)   │
        │  4. Observation Masker  (history mgmt)       │
        │  5. Cache Aligner       (prefix caching)     │
        │                                              │
        └──────────────────────────────────────────────┘

Eight transforms, each targeting a different content type. The pipeline adapts automatically: short contexts get light compression, long contexts get aggressive optimization. Sub-millisecond overhead.
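The pipeline shape can be sketched as an ordered list of transforms, each fed the previous one's output, with short contexts passed through untouched. The transform functions and the threshold below are illustrative stand-ins, not Kompact's actual internals:

```python
from typing import Callable

Transform = Callable[[str], str]

def drop_blank_lines(text: str) -> str:
    # Stand-in transform: remove empty lines.
    return "\n".join(line for line in text.splitlines() if line.strip())

def collapse_whitespace(text: str) -> str:
    # Stand-in transform: squeeze runs of whitespace to single spaces.
    return " ".join(text.split())

# Ordered pipeline: each transform sees the previous one's output.
PIPELINE: list[Transform] = [drop_blank_lines, collapse_whitespace]

def compress(context: str, min_chars: int = 200) -> str:
    # Adaptive: short contexts aren't worth touching.
    if len(context) < min_chars:
        return context
    for transform in PIPELINE:
        context = transform(context)
    return context
```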

Per-request control

Use the X-Kompact-Disable header to disable transforms for a single request without affecting other clients:

# Anthropic SDK
client.messages.create(..., extra_headers={"X-Kompact-Disable": "toon,code_compressor"})

# OpenAI SDK
client.chat.completions.create(..., extra_headers={"X-Kompact-Disable": "toon,code_compressor"})

Comma-separated transform names: toon, json_crusher, code_compressor, log_compressor, content_compressor, observation_masker, cache_aligner, schema_optimizer.

Monitoring

Kompact exports OpenTelemetry metrics (on by default, disable with --no-otel). A Prometheus + Grafana stack is included:

cd monitoring
docker compose up -d

The dashboard shows request rate, token savings, compression ratio, pipeline latency percentiles, and per-transform breakdowns.

Running benchmarks

# Offline compression (no LLM calls, measures compression + needle preservation)
uv run python benchmarks/run_dataset_eval.py --dataset bfcl

# End-to-end quality (sends through proxy chain, measures LLM answer quality)
# Requires: claude-relay running on :8084, kompact on :7878
uv run python benchmarks/run_e2e_eval.py --dataset bfcl --model haiku --workers 20

See benchmarks/README.md for full methodology.

Development

uv sync --extra dev
uv run pytest          # 48 tests
uv run ruff check src/ tests/

License

MIT
