# Kompact
Multi-layer context optimization proxy for LLM agents. Reduces token usage by 40-70% with zero information loss.
```
Agent ──> Kompact Proxy (localhost:7878) ──> LLM Provider
              │
              ├─ Layer 1: Schema Optimizer (TF-IDF tool selection)
              ├─ Layer 2: Content Compressors (TOON, JSON, code, logs)
              ├─ Layer 2b: Extractive Compressor (query-aware sentence selection)
              ├─ Layer 3: Observation Masker (history management)
              └─ Layer 4: Cache Aligner (prefix cache optimization)
```
## Quick Start

```bash
# Install
uv sync

# Start proxy
uv run kompact proxy --port 7878

# Point your agent at it
export ANTHROPIC_BASE_URL=http://localhost:7878
claude  # or any Anthropic/OpenAI-compatible agent
```
## How It Works
Kompact is a transparent HTTP proxy. No code changes needed — just change your base URL. It intercepts LLM API requests, applies a pipeline of transforms to compress the context, then forwards the optimized request to the provider.
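The pipeline idea can be sketched in a few lines. This is an illustrative sketch only, assuming nothing about Kompact's internals: `apply_pipeline` and `strip_whitespace` are hypothetical names, and each transform is modeled as a function from a request payload to a (possibly smaller) payload.

```python
# Illustrative sketch of a transform pipeline (hypothetical helpers,
# not Kompact's actual API): each transform maps a request payload
# to an equivalent, hopefully smaller, payload.
def apply_pipeline(request, transforms):
    for transform in transforms:
        request = transform(request)
    return request

# Example transform: strip trailing/leading whitespace from message contents.
def strip_whitespace(request):
    request["messages"] = [
        {**m, "content": m["content"].strip()} for m in request["messages"]
    ]
    return request

req = {"messages": [{"role": "user", "content": "hello   "}]}
out = apply_pipeline(req, [strip_whitespace])
```

The real pipeline chains the transforms listed in the table below in the same fashion, ordered so cheap lossless rewrites run before heavier scoring-based ones.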
| Transform | Target | Savings | Cost |
|---|---|---|---|
| TOON | JSON arrays of objects | 30-60% | Zero (string manipulation) |
| JSON Crusher | Structured JSON data | 40-80% | Minimal (Counter stats) |
| Code Compressor | Code in tool results | ~70% | Regex parse |
| Log Compressor | Repetitive log output | 60-90% | Regex dedup |
| Content Compressor | Long prose/text | 25-55% | TF-IDF scoring |
| Schema Optimizer | Tool definitions | 50-90% | TF-IDF cosine similarity |
| Observation Masker | Old tool outputs | ~50% | Zero (placeholder swap) |
| Cache Aligner | System prompts | Provider cache discount | Regex substitution |
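The observation masker's "placeholder swap" can be sketched as follows. This is a minimal illustration of the idea, not Kompact's implementation: `mask_old_observations` is a hypothetical helper, and the side store stands in for the compression store that keeps elided content retrievable.

```python
# Illustrative sketch of observation masking (hypothetical helper, not
# Kompact's API): tool outputs older than the last `keep_last` are swapped
# for short placeholders; the full text is kept in a side store.
def mask_old_observations(messages, keep_last=2):
    masked, store = [], {}
    tool_idx = [i for i, m in enumerate(messages) if m["role"] == "tool"]
    cutoff = set(tool_idx[:-keep_last]) if keep_last else set(tool_idx)
    for i, m in enumerate(messages):
        if i in cutoff:
            store[i] = m["content"]
            m = {**m, "content": f"[observation {i} elided, {len(store[i])} chars]"}
        masked.append(m)
    return masked, store
```

Because only a placeholder is sent, the swap itself costs nothing at request time, which is why the table lists its cost as zero.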
The pipeline adapts automatically — short contexts get light compression, long contexts get aggressive optimization.
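To make the TOON row concrete: a uniform JSON array of objects repeats every key in every element, so emitting the keys once as a header removes that redundancy losslessly. The sketch below illustrates the idea with a hypothetical `to_tabular` helper; it is not Kompact's TOON implementation and assumes values contain no commas.

```python
import json

# Illustrative sketch of TOON-style tabularization (hypothetical helper,
# not Kompact's implementation). Assumes uniform keys and comma-free values.
def to_tabular(rows):
    keys = list(rows[0])
    header = ",".join(keys)
    lines = [",".join(str(r[k]) for k in keys) for r in rows]
    return header + "\n" + "\n".join(lines)

rows = [{"id": i, "name": f"item{i}", "score": i * 10} for i in range(50)]
flat = json.dumps(rows)      # keys repeated 50 times
tab = to_tabular(rows)       # keys stated once, one row per line
```

The larger and more uniform the array, the bigger the win, which is why savings for this class of transform scale with result-set size.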
## Configuration

```bash
# Disable specific transforms
uv run kompact proxy --port 7878 --disable toon --disable log_compressor

# Verbose mode
uv run kompact proxy --port 7878 --verbose

# View live dashboard
open http://localhost:7878/dashboard
```
## Benchmarks
Tested against Headroom and LLMLingua-2 on real datasets (BFCL, HotpotQA, Glaive, LongBench) using context-bench.
Search-heavy scenario (100 JSON results, 3 needles):
| System | Compression | NIAH | Effective Ratio |
|---|---|---|---|
| Headroom | 0.0% | 100% | 0.0% |
| LLMLingua-2 | 55.4% | 0% | -44.6% |
| Truncation (50%) | 50.0% | 33% | -16.6% |
| Kompact | 47.7% | 100% | 47.7% |
Effective ratio accounts for retry cost: if compression destroys information (NIAH miss), you pay for both the failed attempt and the retry with full context. Negative = worse than no compression.
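The table's numbers are consistent with charging each NIAH miss the cost of one full-context retry. The sketch below is our reading of that accounting, not necessarily the benchmark's exact formula:

```python
# Effective ratio = token savings minus the expected retry cost.
# A NIAH miss forces a retry with the full, uncompressed context, so the
# miss rate is charged as one extra full context's worth of tokens.
# Both arguments are fractions in [0, 1].
def effective_ratio(compression, niah_pass):
    return compression - (1.0 - niah_pass)

# Checked against the table (NIAH 33% read as 1/3):
# LLMLingua-2: 0.554 - 1.0   = -0.446
# Truncation:  0.500 - 2/3  ~= -0.167
# Kompact:     0.477 - 0.0   =  0.477
```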
```bash
# Run on real datasets
uv run python benchmarks/run_dataset_eval.py --dataset bfcl -n 100

# Run synthetic scenarios
uv run python benchmarks/run_comparison.py --scenario search

# Exclude slow baselines
uv run python benchmarks/run_comparison.py --scenario search --exclude llmlingua headroom
```
See benchmarks/README.md for full methodology.
## Development

```bash
# Install with dev deps
uv sync --extra dev

# Run tests
uv run pytest

# Lint
uv run ruff check src/ tests/

# Run single transform test
uv run pytest tests/test_toon.py -v
```
## Architecture

```
src/kompact/
├── proxy/server.py           # FastAPI proxy (Anthropic + OpenAI)
├── parser/messages.py        # Provider format ↔ internal types
├── transforms/
│   ├── pipeline.py           # Orchestration + adaptive scaling
│   ├── toon.py               # JSON array → tabular (TOON format)
│   ├── json_crusher.py       # Statistical JSON compression
│   ├── code_compressor.py    # Code → skeleton extraction
│   ├── log_compressor.py     # Log deduplication
│   ├── content_compressor.py # Extractive text compression (TF-IDF)
│   ├── schema_optimizer.py   # TF-IDF tool selection
│   ├── observation_masker.py # History management
│   └── cache_aligner.py      # Prefix cache optimization
├── cache/store.py            # Compression store + artifact index
├── config.py                 # Per-transform configuration
├── types.py                  # Core data models
└── metrics/tracker.py        # Per-request metrics
```
## License
MIT