Smart context compression for LLM agents — preserve critical info, reduce tokens 40-80%
Agent Context Compressor 🗜️
Smart context compression for LLM agents. Preserves critical information (decisions, errors, preferences, code) while reducing token usage by 40-80%.
Problem
LLM context windows are expensive. 90% of agent conversations contain filler — greetings, acknowledgments, repeated tool outputs, verbose explanations. But naive truncation drops critical info like decisions, errors, and user preferences.
Solution
Context Compressor scores every message by importance, then strategically drops low-value content while guaranteeing critical information is preserved.
Before: 603 tokens, 14 messages
After: 410 tokens, 5 messages (32% compressed, 98.7% confidence)
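The 32% figure above reads as the share of tokens removed: (603 - 410) / 603 ≈ 0.32. The sketch below reproduces that arithmetic. `compression_stats` is a hypothetical helper for illustration, not part of the library, and the ratio definition is an assumption inferred from the numbers shown.

```python
def compression_stats(tokens_before: int, tokens_after: int) -> tuple[int, float]:
    """Return (tokens saved, fraction of the original tokens removed)."""
    if tokens_before <= 0:
        raise ValueError("tokens_before must be positive")
    saved = tokens_before - tokens_after
    return saved, saved / tokens_before


# Numbers taken from the Before/After example above.
saved, ratio = compression_stats(603, 410)
print(f"Saved {saved} tokens ({ratio:.0%})")  # Saved 193 tokens (32%)
```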
Features
- 🧠 Smart scoring — classifies messages as decisions, errors, code, preferences, noise
- 🔒 Critical preservation — decisions, errors, preferences NEVER dropped
- 🔄 Deduplication — removes near-duplicate messages (Jaccard similarity)
- 📝 Summarization — long messages compressed instead of dropped
- 📊 Confidence tracking — know exactly how much info is preserved
- 🛠️ Multiple interfaces — Python library, CLI, REST API
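To make the deduplication feature concrete, here is a minimal sketch of Jaccard-based near-duplicate removal. It is not the library's implementation: the word-level tokenization, the 0.7 threshold, the same-role restriction, and the names `jaccard` and `dedupe` are all assumptions chosen for illustration.

```python
import re


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity of the lowercase word sets of two strings."""
    set_a = set(re.findall(r"\w+", a.lower()))
    set_b = set(re.findall(r"\w+", b.lower()))
    if not set_a and not set_b:
        return 1.0  # two empty word sets count as identical
    return len(set_a & set_b) / len(set_a | set_b)


def dedupe(messages: list[dict], threshold: float = 0.7) -> list[dict]:
    """Keep the first message of each group of same-role near-duplicates."""
    kept: list[dict] = []
    for msg in messages:
        is_duplicate = any(
            prev["role"] == msg["role"]
            and jaccard(prev["content"], msg["content"]) >= threshold
            for prev in kept
        )
        if not is_duplicate:
            kept.append(msg)
    return kept


messages = [
    {"role": "tool", "content": "build ok: 42 tests passed in 3.1s"},
    {"role": "tool", "content": "build ok: 42 tests passed in 3.2s"},
    {"role": "user", "content": "deploy the app to production"},
]
deduped = dedupe(messages)
print(len(deduped))  # 2
```

The two tool outputs share 7 of 9 distinct words (similarity about 0.78), so the second is treated as a repeat. Keeping the first occurrence is a simplification; for repeated tool output, keeping the most recent one may be the better choice.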
Quick Start
from context_compressor import compress
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "yo"},
    {"role": "assistant", "content": "Hey! How can I help?"},
    {"role": "user", "content": "deploy the app to production"},
    {"role": "assistant", "content": "Deployed! Status: ✅ running"},
    {"role": "user", "content": "thanks"},
    {"role": "assistant", "content": "👍"},
]
result = compress(messages, target_ratio=0.3)
print(f"Saved {result.tokens_saved} tokens ({result.compression_ratio:.0%})")
print(f"Confidence: {result.confidence:.0%}")
print(result.compressed)
CLI
# Install
pip install -e .
# Compress a file piped via stdin
cat conversation.json | ctxcompress --ratio 0.3
# Compress with stats
ctxcompress --input chat.json --ratio 0.2 --format full
# Stats only
ctxcompress --input chat.json --stats-only
# JSON output
ctxcompress --input chat.json --json
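The commands above read a conversation from a JSON file. The file format is not documented here; the sketch below assumes it is a JSON array of `role`/`content` objects, matching what the library and the REST API accept. `save_conversation` is a hypothetical helper.

```python
import json
import tempfile
from pathlib import Path

messages = [
    {"role": "user", "content": "deploy the app to production"},
    {"role": "assistant", "content": "Deployed! Status: running"},
]


def save_conversation(messages: list[dict], path: Path) -> Path:
    """Write messages as a JSON array (assumed input format for ctxcompress)."""
    path.write_text(
        json.dumps(messages, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return path


# Written to the system temp directory here; any writable path works.
out = save_conversation(messages, Path(tempfile.gettempdir()) / "conversation.json")
print(out)
```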
API Server
# Install with server deps
pip install -e ".[server]"
# Start server
ctxcompress-server
# → http://localhost:8000
# Compress via API
curl -X POST http://localhost:8000/compress \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "hello"},
      {"role": "assistant", "content": "Hi!"},
      {"role": "user", "content": "deploy app"},
      {"role": "assistant", "content": "Deployed successfully"}
    ],
    "target_ratio": 0.3
  }'
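The same request can be sent from Python using only the standard library. This sketch mirrors the curl call above (same URL, header, and body). The response schema is not documented here, so the decoded JSON is returned as-is. `build_compress_request` and `compress_via_api` are hypothetical helper names.

```python
import json
import urllib.request


def build_compress_request(messages, target_ratio=0.3, base_url="http://localhost:8000"):
    """Build the POST /compress request without sending it."""
    body = json.dumps({"messages": messages, "target_ratio": target_ratio})
    return urllib.request.Request(
        f"{base_url}/compress",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def compress_via_api(messages, target_ratio=0.3, timeout=10.0):
    """Send the request and return the decoded JSON response unchanged."""
    request = build_compress_request(messages, target_ratio)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_compress_request(
    [
        {"role": "user", "content": "deploy app"},
        {"role": "assistant", "content": "Deployed successfully"},
    ]
)
print(request.get_method(), request.full_url)
```

With `ctxcompress-server` running, `compress_via_api(messages)` performs the actual call.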
How It Works
Input Messages
       │
       ▼
┌─────────────┐
│   Scorer    │  Classify: decision/error/code/preference/noise
└──────┬──────┘
       │
       ▼
┌─────────────┐
│ Deduplicate │  Remove near-duplicate messages
└──────┬──────┘
       │
       ▼
┌─────────────┐
│ Merge Tool  │  Combine consecutive tool results
└──────┬──────┘
       │
       ▼
┌─────────────┐
│  Priority   │  Drop lowest-scored until target ratio
│    Drop     │  Preserve CRITICAL messages always
└──────┬──────┘
       │
       ▼
┌─────────────┐
│  Summarize  │  Compress long messages instead of dropping
└──────┬──────┘
       │
       ▼
Compressed Context + Metadata
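The Priority Drop stage can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the numeric importance levels, the per-message token counts, and the `Scored` and `priority_drop` names are hypothetical. The parameter is named `keep_ratio` (fraction of original tokens to keep) because the exact meaning of the library's `target_ratio` is not specified here.

```python
from typing import NamedTuple

CRITICAL = 4  # illustrative level; messages at this level are never dropped


class Scored(NamedTuple):
    message: dict
    importance: int  # higher means more important
    tokens: int      # illustrative token count


def priority_drop(items: list[Scored], keep_ratio: float) -> list[dict]:
    """Drop least-important messages until the token budget is met."""
    total = sum(item.tokens for item in items)
    budget = keep_ratio * total
    # Non-critical messages only, least important first, oldest first on ties.
    candidates = sorted(
        (i for i, item in enumerate(items) if item.importance < CRITICAL),
        key=lambda i: (items[i].importance, i),
    )
    dropped: set[int] = set()
    for i in candidates:
        if total <= budget:
            break
        dropped.add(i)
        total -= items[i].tokens
    # Survivors keep their original conversation order.
    return [item.message for i, item in enumerate(items) if i not in dropped]


items = [
    Scored({"role": "system", "content": "You are a helpful assistant."}, 4, 8),
    Scored({"role": "user", "content": "yo"}, 0, 1),
    Scored({"role": "assistant", "content": "Hey! How can I help?"}, 1, 6),
    Scored({"role": "user", "content": "deploy the app to production"}, 3, 6),
    Scored({"role": "assistant", "content": "Deployed! Status: running"}, 3, 5),
    Scored({"role": "user", "content": "thanks"}, 0, 1),
]
kept = priority_drop(items, keep_ratio=0.75)
print([m["content"] for m in kept])
```

Because critical messages are excluded from the candidate list, the result can exceed the budget when critical content alone is larger than the target. That matches the stated guarantee that critical messages are always preserved.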
Scoring Categories
| Category | Importance | Droppable | Examples |
|---|---|---|---|
| system | CRITICAL | ❌ | System prompts |
| decision | CRITICAL | ❌ | "Let's use approach A" |
| error | CRITICAL | ❌ | Error messages, fixes |
| preference | CRITICAL | ❌ | "I prefer dark mode" |
| code | HIGH | ⚠️ | Code blocks, scripts |
| tool_result | HIGH | ⚠️ | API responses, outputs |
| user_query | HIGH | ⚠️ | User questions |
| structured_response | MEDIUM | ✅ | Lists, explanations |
| explanation | MEDIUM | ✅ | Long explanations |
| brief_response | LOW | ✅ | Short replies |
| noise | NOISE | ✅ | "yo", "sip", "👍" |
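The table can be expressed as a lookup from category to importance level. The sketch below is an assumption about how such a mapping might look, not the library's data model: the `Importance` enum values, `CATEGORY_IMPORTANCE`, and `drop_policy` are hypothetical, and reading ⚠️ as "drop only as a last resort" is an interpretation of the table.

```python
from enum import IntEnum


class Importance(IntEnum):
    NOISE = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


# Category names and levels copied from the table above.
CATEGORY_IMPORTANCE = {
    "system": Importance.CRITICAL,
    "decision": Importance.CRITICAL,
    "error": Importance.CRITICAL,
    "preference": Importance.CRITICAL,
    "code": Importance.HIGH,
    "tool_result": Importance.HIGH,
    "user_query": Importance.HIGH,
    "structured_response": Importance.MEDIUM,
    "explanation": Importance.MEDIUM,
    "brief_response": Importance.LOW,
    "noise": Importance.NOISE,
}


def drop_policy(category: str) -> str:
    """Map a category to 'never', 'last_resort', or 'ok' (the Droppable column)."""
    level = CATEGORY_IMPORTANCE[category]
    if level == Importance.CRITICAL:
        return "never"
    if level == Importance.HIGH:
        return "last_resort"
    return "ok"


print(drop_policy("decision"), drop_policy("code"), drop_policy("noise"))
# never last_resort ok
```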
Use Cases
- Agent context management — Keep conversations within token limits
- Cost optimization — Reduce API costs by 40-80%
- Session handoff — Compress before switching models
- Memory systems — Store compressed conversation summaries
- Multi-agent — Share compressed context between agents
License
MIT
File details
Details for the file agent_ctx_compress-0.1.0.tar.gz.
File metadata
- Download URL: agent_ctx_compress-0.1.0.tar.gz
- Upload date:
- Size: 14.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.15
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | eb5e36d6096bcfe9d10bc137923694d97bd43ffcfd74084df63499f0b3d6b2df |
| MD5 | 3b4cd5103e8d978b148aef3ee0ea1389 |
| BLAKE2b-256 | 7fb6bb221d3c474fb436a768608f4163a1cd1c466e2a0e609d2908845e5edf4f |
File details
Details for the file agent_ctx_compress-0.1.0-py3-none-any.whl.
File metadata
- Download URL: agent_ctx_compress-0.1.0-py3-none-any.whl
- Upload date:
- Size: 14.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.15
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f97f4860db4d76131cc47d60ef7b858333ea509b95dcb93a67c99fa4e73b076b |
| MD5 | c3b987188644aece1998f7b1153df263 |
| BLAKE2b-256 | b599892e28f6ba3da80f5e80e557faede2783cfbfbcb44a139c03d0ff8f2dae6 |