Skip to main content

Incremental context compression for LLMs with anchored summaries

Project description

context-compressor

Intro

A simple but effective context compressor, supports incremental context compression for LLMs with persistent anchored summaries.

Based on the algorithm from Factory.ai, this library efficiently manages finite context windows in extended conversations and multi-step workflows.

Features:

  • Incremental Updates: Only summarize newly dropped messages
  • Anchor Points: Each summary is linked to a specific message turn
  • Efficient Compression: Dramatically reduces computation and cost

Diagram

Installation

# Install from PyPI
pip install context-compressor-llm

# Install from source
git clone https://github.com/LaguePesikin/context-compressor
cd context-compressor
pip install -e .

Quick Start

from context_compressor import ContextCompressor, TokenCounter

# Define your summarizer function
def simple_summarizer(messages_list, previous_summary=None):
    """
    Args:
        messages_list: List of dicts like [{"role": "user", "content": "..."}]
        previous_summary: Optional previous summary to build upon
    Returns:
        A summary string
    """
    summary_parts = []
    
    if previous_summary:
        summary_parts.append(f"[Previous: {previous_summary}]")
    for msg in messages_list:
        role = msg["role"]
        content = msg["content"]
        # Take first 50 chars of each message
        snippet = content[:50].replace("\n", " ")
        summary_parts.append(f"{role.upper()}: {snippet}...")
    return "\n".join(summary_parts)

# Initialize compressor
compressor = ContextCompressor(
    summarizer=simple_summarizer,
    t_max=8000,      # Max tokens before compression
    t_retained=6000, # Tokens to keep after compression
    t_summary=500,   # Reserved tokens for summary
    tokenizer=TokenCounter(
        model_name="gpt-4o",
        use_transformers=False   # Will use default tiktoken encoding
    )
)

# Add messages to your conversation
for _ in range(30):
    compressor.add_message("Hello, how are you?", role="user")
    compressor.add_message("I'm doing well, thanks!", role="assistant")

# Get compressed context (auto-compresses if needed)
context = compressor.get_current_context()

# View statistics
stats = compressor.get_stats()
print(f"Compressions: {stats['compression_count']}")
print(f"Tokens saved: {stats['total_tokens_saved']}")

Expected Output

Warning: Summary is too long (2813 tokens).
Compressions: 1
Tokens saved: 291

Built-In Summarizers

The library provides three common summarization strategies out of the box:

1. Truncate Summarizer

Keeps the first N characters of each message with ellipsis:

from context_compressor import ContextCompressor, TruncateSummarizer

summarizer = TruncateSummarizer(
    max_chars=50,           # Characters to keep per message
    ellipsis="...",         # Suffix for truncated messages
    include_previous=True   # Include previous summaries
)

compressor = ContextCompressor(summarizer=summarizer)

2. Head-Tail Summarizer

Keeps first N and last M messages, omitting the middle:

from context_compressor import HeadTailSummarizer

summarizer = HeadTailSummarizer(
    head_count=3,    # Keep first 3 messages
    tail_count=2,    # Keep last 2 messages
    middle_placeholder="[... {count} messages omitted ...]"
)

compressor = ContextCompressor(summarizer=summarizer)

3. LLM Summarizer

Uses an LLM API for intelligent summarization:

from openai import OpenAI
from context_compressor import LLMSummarizer

client = OpenAI()
summarizer = LLMSummarizer(
    client=client,
    model="gpt-4o-mini",
    max_tokens=200,
    api_type="openai"
)

compressor = ContextCompressor(summarizer=summarizer)

Core Functionality

ContextCompressor

Parameters:

  • summarizer: Custom text summarization function that takes message text and optional previous summary, returns a new summary. View examples/basic_usage.llm_summarizer_example for a fundamental implementation.
  • t_max: Maximum token threshold. Context compression is triggered when this limit is exceeded
  • t_retained: Expected token count to retain after compression. The ratio t_retained/t_max determines the compression rate
  • t_summary: Length of the context summary. This parameter takes effect through prompt engineering in your summarizer (if using LLM) and the _compress method
  • tokenizer: Custom Tokenizer (you can settiktoken or transformers.AutoTokenizer here). See context_compressor.tokenizer.TokenCounter for more details.

Citation

Based on the approach described in: Factory.ai: Compressing Context

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

context_compressor_llm-0.1.2.tar.gz (11.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

context_compressor_llm-0.1.2-py3-none-any.whl (10.8 kB view details)

Uploaded Python 3

File details

Details for the file context_compressor_llm-0.1.2.tar.gz.

File metadata

  • Download URL: context_compressor_llm-0.1.2.tar.gz
  • Upload date:
  • Size: 11.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for context_compressor_llm-0.1.2.tar.gz
Algorithm Hash digest
SHA256 98b1f0e2491911a229d9e2499eac145d79609ffb94d2740b11db4c1600e67c2c
MD5 48d7b4314b90ad29198408cb46b33cc2
BLAKE2b-256 33d2592634d44b4f9bc371649610efbc9f07f8ff25b4459c840302a4382c969a

See more details on using hashes here.

File details

Details for the file context_compressor_llm-0.1.2-py3-none-any.whl.

File metadata

File hashes

Hashes for context_compressor_llm-0.1.2-py3-none-any.whl
Algorithm Hash digest
SHA256 af85f2eb5e17c4719212193d49546261f315e74fa8be362e879856bcae4fed24
MD5 96d3c405c508c9a32d01e4fefd2cda29
BLAKE2b-256 3285b7cc3bbae952f4323962bb70d3ad7efbb03612aa514aeb17b6abd975fa4f

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page