Incremental context compression for LLMs with anchored summaries

context-compressor

Intro

A simple but effective context compressor that supports incremental context compression for LLMs with persistent, anchored summaries.

Based on the algorithm from Factory.ai, this library efficiently manages finite context windows in extended conversations and multi-step workflows.

Features:

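  • Incremental Updates: Only newly dropped messages are summarized; the existing summary is updated rather than rebuilt
  • Anchor Points: Each summary is linked to a specific message turn
  • Efficient Compression: Avoids re-summarizing the full history on every compression, which reduces computation and cost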
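
The features above can be illustrated with a minimal, self-contained sketch. This is not the library's implementation: the class `AnchoredBuffer`, its fields, the whitespace-based token count, and the stub summarizer are all hypothetical names chosen for illustration. It shows only the idea: once the token count exceeds `t_max`, the oldest turns are dropped until about `t_retained` tokens remain, only those dropped turns are merged into the existing summary, and an anchor records how many turns the summary covers.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words stand in for real tokens.
    return len(text.split())


@dataclass
class AnchoredBuffer:
    summarizer: Callable[[str, Optional[str]], str]
    t_max: int
    t_retained: int
    messages: List[str] = field(default_factory=list)  # turns still kept verbatim
    summary: Optional[str] = None
    anchor: int = 0  # number of turns, counted from the start, covered by the summary
    summarizer_calls: int = 0

    def total_tokens(self) -> int:
        return sum(count_tokens(m) for m in self.messages)

    def add(self, text: str) -> None:
        self.messages.append(text)
        if self.total_tokens() > self.t_max:
            self._compress()

    def _compress(self) -> None:
        dropped = []
        # Drop the oldest turns until the retained budget is met,
        # always keeping at least the latest turn.
        while len(self.messages) > 1 and self.total_tokens() > self.t_retained:
            dropped.append(self.messages.pop(0))
        if not dropped:
            return
        # Incremental update: only the newly dropped turns are summarized,
        # together with the previous summary.
        self.summary = self.summarizer("\n".join(dropped), self.summary)
        self.summarizer_calls += 1
        self.anchor += len(dropped)


def stub_summarizer(messages_text, previous_summary=None):
    # Stand-in for an LLM call: keep the first word of each dropped message.
    heads = " ".join(
        line.split()[0] for line in messages_text.splitlines() if line.split()
    )
    return f"{previous_summary} {heads}" if previous_summary else heads


buf = AnchoredBuffer(stub_summarizer, t_max=10, t_retained=6)
for i in range(8):
    buf.add(f"turn{i} alpha beta")

print(buf.anchor, buf.summary, buf.messages)
```

In this sketch each compression passes only the turns dropped since the previous one to the summarizer, so the cost of a compression depends on the size of the newly dropped span, not on the length of the whole conversation.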

Installation

git clone https://github.com/LaguePesikin/context-compressor
cd context-compressor
pip install -e .
# for developers run: pip install -e ".[dev]"

or

pip install context-compressor-llm

Quick Start

from context_compressor import ContextCompressor

# Define your summarizer function
def my_summarizer(messages_text, previous_summary=None):
    # Use your LLM of choice (OpenAI, Anthropic, etc.)
    if previous_summary:
        prompt = f"Update this summary with new info:\n{previous_summary}\n\nNew: {messages_text}"
    else:
        prompt = f"Summarize: {messages_text}"
    
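    # Call your LLM here. `your_llm_call` is a placeholder for your own
    # client function; it should take a prompt string and return the summary text.
    return your_llm_call(prompt)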

# Initialize compressor
compressor = ContextCompressor(
    summarizer=my_summarizer,
    t_max=8000,      # Max tokens before compression
    t_retained=6000, # Tokens to keep after compression
    t_summary=500,   # Reserved tokens for summary
)

# Add messages to your conversation
compressor.add_message("Hello, how are you?", role="user")
compressor.add_message("I'm doing well, thanks!", role="assistant")

# Get compressed context (auto-compresses if needed)
context = compressor.get_current_context()

# View statistics
stats = compressor.get_stats()
print(f"Compressions: {stats['compression_count']}")
print(f"Tokens saved: {stats['total_tokens_saved']}")

Core Functionality

ContextCompressor

Parameters:

  • summarizer: Custom summarization function that takes the message text and an optional previous summary and returns a new summary. See examples/basic_usage.llm_summarizer_example for a basic implementation.
  • t_max: Maximum token threshold. Context compression is triggered when this limit is exceeded.
  • t_retained: Target token count to retain after compression. The ratio t_retained/t_max determines the compression rate; for example, t_retained=6000 with t_max=8000 gives a ratio of 0.75.
  • t_summary: Token budget for the context summary. This parameter takes effect through prompt engineering in your summarizer (if using an LLM) and through the _compress method.
  • tokenizer: Custom tokenizer (you can set tiktoken or transformers.AutoTokenizer here). See context_compressor.tokenizer.TokenCounter for more details.
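
To make the summarizer contract and the role of t_summary concrete, here is a deterministic stand-in that can be useful in tests where no LLM is available. It follows the `(messages_text, previous_summary=None)` signature used in the Quick Start. The helper name `make_truncating_summarizer` and the whitespace-based word budget are assumptions for illustration and are not part of the library.

```python
from typing import Callable, Optional


def make_truncating_summarizer(t_summary: int) -> Callable[..., str]:
    """Build a summarizer that keeps at most t_summary whitespace-separated words."""
    if t_summary <= 0:
        raise ValueError("t_summary must be positive")

    def summarizer(messages_text: str, previous_summary: Optional[str] = None) -> str:
        # Merge the earlier summary (if any) with the newly dropped text.
        if previous_summary:
            merged = f"{previous_summary} {messages_text}"
        else:
            merged = messages_text
        words = merged.split()
        # When over budget, keep only the most recent words.
        return " ".join(words[-t_summary:])

    return summarizer


summarize = make_truncating_summarizer(t_summary=5)
first = summarize("a b c d")
second = summarize("e f g", previous_summary=first)
print(first)
print(second)
```

A function built this way could be passed as the summarizer argument shown in the Quick Start. With a real LLM, the same budget would more likely be stated in the prompt than enforced by truncation.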

Citation

Based on the approach described in: Factory.ai: Compressing Context
