Incremental context compression for LLMs with anchored summaries
Project description
context-compressor
Intro
A simple but effective context compressor, supports incremental context compression for LLMs with persistent anchored summaries.
Based on the algorithm from Factory.ai, this library efficiently manages finite context windows in extended conversations and multi-step workflows.
Features:
- Incremental Updates: Only summarize newly dropped messages
- Anchor Points: Each summary is linked to a specific message turn
- Efficient Compression: Dramatically reduces computation and cost
Diagram
Installation
# Install from PyPI
pip install context-compressor-llm
# Install from source
git clone https://github.com/LaguePesikin/context-compressor
cd context-compressor
pip install -e .
Quick Start
from context_compressor import ContextCompressor, TokenCounter
# Define your summarizer function
def simple_summarizer(messages_list, previous_summary=None):
"""
Args:
messages_list: List of dicts like [{"role": "user", "content": "..."}]
previous_summary: Optional previous summary to build upon
Returns:
A summary string
"""
summary_parts = []
if previous_summary:
summary_parts.append(f"[Previous: {previous_summary}]")
for msg in messages_list:
role = msg["role"]
content = msg["content"]
# Take first 50 chars of each message
snippet = content[:50].replace("\n", " ")
summary_parts.append(f"{role.upper()}: {snippet}...")
return "\n".join(summary_parts)
# Initialize compressor
compressor = ContextCompressor(
summarizer=simple_summarizer,
t_max=8000, # Max tokens before compression
t_retained=6000, # Tokens to keep after compression
t_summary=500, # Reserved tokens for summary
tokenizer=TokenCounter(
model_name="gpt-4o",
use_transformers=False # Will use default tiktoken encoding
)
)
# Add messages to your conversation
for _ in range(30):
compressor.add_message("Hello, how are you?", role="user")
compressor.add_message("I'm doing well, thanks!", role="assistant")
# Get compressed context (auto-compresses if needed)
context = compressor.get_current_context()
# View statistics
stats = compressor.get_stats()
print(f"Compressions: {stats['compression_count']}")
print(f"Tokens saved: {stats['total_tokens_saved']}")
Expected Output
Warning: Summary is too long (2813 tokens).
Compressions: 1
Tokens saved: 291
Built-In Summarizers
The library provides three common summarization strategies out of the box:
1. Truncate Summarizer
Keeps the first N characters of each message with ellipsis:
from context_compressor import ContextCompressor, TruncateSummarizer
summarizer = TruncateSummarizer(
max_chars=50, # Characters to keep per message
ellipsis="...", # Suffix for truncated messages
include_previous=True # Include previous summaries
)
compressor = ContextCompressor(summarizer=summarizer)
2. Head-Tail Summarizer
Keeps first N and last M messages, omitting the middle:
from context_compressor import HeadTailSummarizer
summarizer = HeadTailSummarizer(
head_count=3, # Keep first 3 messages
tail_count=2, # Keep last 2 messages
middle_placeholder="[... {count} messages omitted ...]"
)
compressor = ContextCompressor(summarizer=summarizer)
3. LLM Summarizer
Uses an LLM API for intelligent summarization:
from openai import OpenAI
from context_compressor import LLMSummarizer
client = OpenAI()
summarizer = LLMSummarizer(
client=client,
model="gpt-4o-mini",
max_tokens=200,
api_type="openai"
)
compressor = ContextCompressor(summarizer=summarizer)
Core Functionality
ContextCompressor
Parameters:
summarizer: Custom text summarization function that takes message text and optional previous summary, returns a new summary. Viewexamples/basic_usage.llm_summarizer_examplefor a fundamental implementation.t_max: Maximum token threshold. Context compression is triggered when this limit is exceededt_retained: Expected token count to retain after compression. The ratiot_retained/t_maxdetermines the compression ratet_summary: Length of the context summary. This parameter takes effect through prompt engineering in your summarizer (if using LLM) and the_compressmethodtokenizer: Custom Tokenizer (you can settiktokenortransformers.AutoTokenizerhere). Seecontext_compressor.tokenizer.TokenCounterfor more details.
Citation
Based on the approach described in: Factory.ai: Compressing Context
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file context_compressor_llm-0.1.2.tar.gz.
File metadata
- Download URL: context_compressor_llm-0.1.2.tar.gz
- Upload date:
- Size: 11.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
98b1f0e2491911a229d9e2499eac145d79609ffb94d2740b11db4c1600e67c2c
|
|
| MD5 |
48d7b4314b90ad29198408cb46b33cc2
|
|
| BLAKE2b-256 |
33d2592634d44b4f9bc371649610efbc9f07f8ff25b4459c840302a4382c969a
|
File details
Details for the file context_compressor_llm-0.1.2-py3-none-any.whl.
File metadata
- Download URL: context_compressor_llm-0.1.2-py3-none-any.whl
- Upload date:
- Size: 10.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
af85f2eb5e17c4719212193d49546261f315e74fa8be362e879856bcae4fed24
|
|
| MD5 |
96d3c405c508c9a32d01e4fefd2cda29
|
|
| BLAKE2b-256 |
3285b7cc3bbae952f4323962bb70d3ad7efbb03612aa514aeb17b6abd975fa4f
|