Incremental context compression for LLMs with anchored summaries

context-compressor

Intro

A simple but effective context compressor that supports incremental context compression for LLMs using persistent, anchored summaries.

Based on the algorithm from Factory.ai, this library efficiently manages finite context windows in extended conversations and multi-step workflows.

Features:

  • Incremental Updates: Only summarize newly dropped messages
  • Anchor Points: Each summary is linked to a specific message turn
  • Efficient Compression: Reduces computation and cost, because each compression summarizes only the newly dropped messages instead of the full history
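
The three features above can be illustrated with a minimal, self-contained sketch. This is not the library's implementation: the class name AnchoredHistory, the word-count token estimate, and the choice to emit the summary as a system message are all assumptions made for illustration. Only the summarizer signature and the t_max / t_retained names follow the Quick Start.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Message = Dict[str, str]
Summarizer = Callable[[List[Message], Optional[str]], str]


def rough_tokens(text: str) -> int:
    """Crude token estimate (whitespace-separated words); an assumption."""
    return len(text.split())


@dataclass
class AnchoredHistory:
    """Hypothetical illustration of incremental, anchored summarization."""
    summarizer: Summarizer
    t_max: int
    t_retained: int
    messages: List[Message] = field(default_factory=list)
    summary: Optional[str] = None
    anchor: int = 0  # index of the first message NOT covered by the summary

    def add(self, role: str, content: str) -> None:
        self.messages.append({"role": role, "content": content})
        if self._live_tokens() > self.t_max:
            self._compress()

    def _live_tokens(self) -> int:
        # Only messages after the anchor count toward the live context.
        return sum(rough_tokens(m["content"]) for m in self.messages[self.anchor:])

    def _compress(self) -> None:
        # Walk back from the newest message, keeping what fits in t_retained.
        kept = 0
        new_anchor = len(self.messages)
        while new_anchor > self.anchor:
            cost = rough_tokens(self.messages[new_anchor - 1]["content"])
            if kept + cost > self.t_retained:
                break
            kept += cost
            new_anchor -= 1
        # Summarize only the newly dropped span, building on the old summary.
        dropped = self.messages[self.anchor:new_anchor]
        if dropped:
            self.summary = self.summarizer(dropped, self.summary)
            self.anchor = new_anchor

    def context(self) -> List[Message]:
        head = [{"role": "system", "content": self.summary}] if self.summary else []
        return head + self.messages[self.anchor:]
```

In this sketch each message is passed to the summarizer at most once, since the anchor only moves forward and earlier spans are represented by the previous summary.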

Installation

# Install from PyPI
pip install context-compressor-llm

# Install from source
git clone https://github.com/LaguePesikin/context-compressor
cd context-compressor
pip install -e .

Quick Start

from context_compressor import ContextCompressor, TokenCounter

# Define your summarizer function
def simple_summarizer(messages_list, previous_summary=None):
    """
    Args:
        messages_list: List of dicts like [{"role": "user", "content": "..."}]
        previous_summary: Optional previous summary to build upon
    Returns:
        A summary string
    """
    summary_parts = []
    
    if previous_summary:
        summary_parts.append(f"[Previous: {previous_summary}]")
    for msg in messages_list:
        role = msg["role"]
        content = msg["content"]
        # Take first 50 chars of each message
        snippet = content[:50].replace("\n", " ")
        summary_parts.append(f"{role.upper()}: {snippet}...")
    return "\n".join(summary_parts)

# Initialize compressor
compressor = ContextCompressor(
    summarizer=simple_summarizer,
    t_max=8000,      # Max tokens before compression
    t_retained=6000, # Tokens to keep after compression
    t_summary=500,   # Reserved tokens for summary
    tokenizer=TokenCounter(
        model_name="gpt-4o",
        use_transformers=False   # Will use default tiktoken encoding
    )
)

# Add messages to your conversation
for _ in range(30):
    compressor.add_message("Hello, how are you?", role="user")
    compressor.add_message("I'm doing well, thanks!", role="assistant")

# Get compressed context (auto-compresses if needed)
context = compressor.get_current_context()

# View statistics
stats = compressor.get_stats()
print(f"Compressions: {stats['compression_count']}")
print(f"Tokens saved: {stats['total_tokens_saved']}")

Expected Output

Warning: Summary is too long (2813 tokens).
Compressions: 1
Tokens saved: 291
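
The warning in the output above suggests that the summary returned by simple_summarizer exceeded the summary budget (presumably t_summary, though that is an inference). One way to avoid this is to cap the summary length inside the summarizer. The variant below is a sketch under stated assumptions: capped_summarizer, its max_tokens parameter, and the 4-characters-per-token heuristic are hypothetical and not part of the library.

```python
from typing import Dict, List, Optional

CHARS_PER_TOKEN = 4  # rough heuristic (assumption); use a real tokenizer for accuracy


def capped_summarizer(
    messages_list: List[Dict[str, str]],
    previous_summary: Optional[str] = None,
    max_tokens: int = 500,
) -> str:
    """Same idea as simple_summarizer, but truncated to an approximate budget."""
    budget = max(max_tokens, 1) * CHARS_PER_TOKEN
    lines = []
    if previous_summary:
        lines.append(f"[Previous: {previous_summary}]")
    for msg in messages_list:
        snippet = msg["content"][:50].replace("\n", " ")
        lines.append(f"{msg['role'].upper()}: {snippet}...")
    text = "\n".join(lines)
    if len(text) <= budget:
        return text
    # Over budget: keep the most recent tail and mark the cut with "...".
    return "..." + text[-(budget - 3):]
```

To pass it where a two-argument summarizer is expected, the budget can be bound with functools.partial(capped_summarizer, max_tokens=500). Keeping the tail favors recent messages over the oldest summarized content; keeping the head instead is an equally valid choice.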

Core Functionality

ContextCompressor

Parameters:

  • summarizer: Custom summarization function that takes a list of message dicts and an optional previous summary, and returns a new summary string. See examples/basic_usage.llm_summarizer_example for a basic implementation.
  • t_max: Maximum token threshold. Compression is triggered when the context exceeds this limit.
  • t_retained: Expected token count to retain after compression. The ratio t_retained/t_max determines the compression rate; with the Quick Start values, 6000/8000 = 0.75.
  • t_summary: Target length of the context summary, in tokens. This parameter takes effect through prompt engineering in your summarizer (if using an LLM) and through the _compress method.
  • tokenizer: Custom tokenizer (you can set tiktoken or transformers.AutoTokenizer here). See context_compressor.tokenizer.TokenCounter for more details.
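
As a rough illustration of how t_summary can take effect through prompt engineering, the sketch below builds an LLM-backed summarizer around an injected callable. It is not the example shipped in examples/basic_usage: make_llm_summarizer, the complete callable (prompt string in, completion text out), and the prompt wording are all hypothetical. Only the summarizer signature follows the Quick Start.

```python
from typing import Callable, Dict, List, Optional


def make_llm_summarizer(complete: Callable[[str], str], t_summary: int = 500):
    """Build a summarizer; `complete` is a hypothetical prompt -> text callable."""

    def summarizer(
        messages_list: List[Dict[str, str]],
        previous_summary: Optional[str] = None,
    ) -> str:
        transcript = "\n".join(f"{m['role']}: {m['content']}" for m in messages_list)
        # State the length budget explicitly so the model can aim for it.
        prompt = (
            f"Summarize the conversation below in at most {t_summary} tokens.\n"
            "Preserve decisions, open questions, and named entities.\n"
        )
        if previous_summary:
            # Incremental update: extend the existing summary, do not restart.
            prompt += f"\nExisting summary to extend:\n{previous_summary}\n"
        prompt += f"\nNew messages:\n{transcript}\n"
        return complete(prompt).strip()

    return summarizer
```

Wrap whatever client you use in a function that accepts a prompt and returns text, then pass the result of make_llm_summarizer(...) as the summarizer argument. A prompt-level limit is only a request to the model, so the returned summary can still exceed the budget.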

Citation

Based on the approach described in: Factory.ai: Compressing Context
