Lightweight context window management for AI agents
Project description
agent-context-manager
A lightweight Python library for managing LLM context windows in AI agents. Prevents context overflow, reduces token costs, and maintains conversation coherence.
The Problem
AI agents face a critical challenge: context windows fill up fast. When they overflow:
- Costs explode - every request resends the growing history, so token usage rises with each turn
- Performance degrades - LLMs struggle with long contexts ("lost in the middle")
- Coherence breaks - agents forget important context while keeping noise
Current solutions are either too complex (they require LLM calls for summarization) or too naive (they simply truncate old messages).
The Solution
agent-context-manager provides intelligent context compression without requiring additional LLM calls:
- Token-aware management - Track usage, warn before overflow
- Multiple compression strategies - Choose what fits your use case
- Framework agnostic - Works with any LLM provider
- Zero LLM dependencies - No API calls needed for compression
Installation
pip install agent-context-manager
Quick Start
from agent_context_manager import ContextManager, SlidingWindowStrategy
# Create a context manager with 8K token limit
manager = ContextManager(
    max_tokens=8000,
    strategy=SlidingWindowStrategy(keep_system=True, keep_recent=10)
)
# Add messages as your agent works
manager.add_message({"role": "system", "content": "You are a helpful assistant."})
manager.add_message({"role": "user", "content": "Hello!"})
manager.add_message({"role": "assistant", "content": "Hi there!"})
# Get compressed context when needed
context = manager.get_context()
# Check token usage
print(f"Tokens used: {manager.token_count}/{manager.max_tokens}")
Compression Strategies
1. Sliding Window (Default)
Keeps the most recent N messages, always preserving system messages.
from agent_context_manager import SlidingWindowStrategy
strategy = SlidingWindowStrategy(
    keep_system=True,     # Always keep system messages
    keep_recent=20,       # Keep the last 20 messages
    keep_first_user=True  # Keep the original user request
)
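To make the behavior concrete, here is a minimal sketch of the sliding-window idea. This is illustrative only, not the library's internals; the helper name and the assumption that messages are role/content dicts are ours:

# Hypothetical sketch of the sliding-window idea, not the library's code.
def sliding_window(messages, keep_recent=20, keep_system=True):
    # System messages survive compression unconditionally.
    system = [m for m in messages if m["role"] == "system"] if keep_system else []
    rest = [m for m in messages if m["role"] != "system"]
    # Keep only the most recent N non-system messages.
    return system + rest[-keep_recent:]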
2. Importance Scoring
Scores messages by relevance and keeps the most important ones.
from agent_context_manager import ImportanceStrategy
strategy = ImportanceStrategy(
    system_weight=1.0,     # System messages always kept
    user_weight=0.8,       # User messages high priority
    assistant_weight=0.6,  # Assistant messages medium priority
    tool_weight=0.4,       # Tool results lower priority
    recency_decay=0.95     # Recent messages score higher
)
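One plausible reading of these parameters: a message's score is its role weight multiplied by recency_decay raised to the message's age, so older messages fade. The exact formula is an assumption, not documented behavior:

# Assumed scoring formula, for illustration: role weight x decay^age.
def importance_score(role_weight, recency_decay, index, total):
    age = total - 1 - index  # 0 for the newest message
    return role_weight * recency_decay ** age

# With user_weight=0.8 and recency_decay=0.95, the newest user message
# scores 0.8; one nine turns older scores about 0.50.
print(importance_score(0.8, 0.95, index=9, total=10))  # 0.8
print(importance_score(0.8, 0.95, index=0, total=10))  # ~0.504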
3. Semantic Deduplication
Removes near-duplicate messages to reduce redundancy.
from agent_context_manager import DeduplicationStrategy
strategy = DeduplicationStrategy(
    similarity_threshold=0.85,  # Remove if >85% similar
    keep_latest=True            # Keep the most recent of duplicates
)
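The similarity measure itself isn't documented, so treat the threshold as implementation-defined. For intuition, here is a stand-in check using Python's difflib; the real library may well use embeddings rather than character overlap:

from difflib import SequenceMatcher

# Stand-in similarity check, for illustration only.
def is_near_duplicate(a, b, threshold=0.85):
    return SequenceMatcher(None, a, b).ratio() > threshold

print(is_near_duplicate("Run the tests again", "Run the tests again!"))  # True
print(is_near_duplicate("Run the tests", "Deploy the service"))          # False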
4. Hybrid (Recommended for Production)
Combines multiple strategies for best results.
from agent_context_manager import HybridStrategy
strategy = HybridStrategy([
    DeduplicationStrategy(similarity_threshold=0.9),
    ImportanceStrategy(recency_decay=0.95),
    SlidingWindowStrategy(keep_recent=50)
])
Token Counting
Built-in token counting for popular models:
from agent_context_manager import ContextManager
# Auto-detect tokenizer based on model
manager = ContextManager(max_tokens=8000, model="gpt-4")
manager = ContextManager(max_tokens=100000, model="claude-3")
# Or use a custom tokenizer
manager = ContextManager(
    max_tokens=8000,
    tokenizer=my_custom_tokenizer
)
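A custom tokenizer is presumably any callable that maps text to a token count (an assumption based on the signature above). For OpenAI models, tiktoken is a natural fit:

import tiktoken
from agent_context_manager import ContextManager

enc = tiktoken.encoding_for_model("gpt-4")

def my_custom_tokenizer(text):
    # Token count under the same encoding GPT-4 uses.
    return len(enc.encode(text))

manager = ContextManager(max_tokens=8000, tokenizer=my_custom_tokenizer)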
Overflow Handling
from agent_context_manager import ContextManager, OverflowPolicy
manager = ContextManager(
    max_tokens=8000,
    overflow_policy=OverflowPolicy.COMPRESS,  # Auto-compress when near limit
    overflow_threshold=0.9                    # Compress at 90% capacity
)
# Or get warnings instead
manager = ContextManager(
    max_tokens=8000,
    overflow_policy=OverflowPolicy.WARN
)
# Check status
if manager.is_near_overflow():
    print(f"Warning: {manager.usage_percent}% of context used")
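With OverflowPolicy.WARN, compression is your call; one simple pattern, using compress() from the API reference below:

if manager.is_near_overflow():
    manager.compress()  # Apply the configured strategy on demand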
Memory Blocks (Structured Context)
Organize context into logical blocks with size limits:
from agent_context_manager import ContextManager, MemoryBlock, SlidingWindowStrategy
manager = ContextManager(max_tokens=8000)
# Define memory blocks
manager.add_block(MemoryBlock(
    name="system",
    max_tokens=500,
    priority=1.0,  # Highest priority, never compressed
    content="You are a helpful coding assistant."
))

manager.add_block(MemoryBlock(
    name="user_profile",
    max_tokens=200,
    priority=0.9,
    content="User prefers Python, uses VS Code."
))

manager.add_block(MemoryBlock(
    name="conversation",
    max_tokens=7000,
    priority=0.5,  # Can be compressed if needed
    strategy=SlidingWindowStrategy(keep_recent=30)
))
# Update blocks as needed
manager.update_block("user_profile", "User prefers Python, uses VS Code, timezone: PST")
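The idea behind block budgets is priority-ordered allocation: hand tokens to the highest-priority blocks first until the window is spent. A hypothetical sketch, not the library's allocator (names and structure are assumptions):

# Hypothetical priority-ordered budget allocation, for illustration.
def allocate(blocks, budget):
    plan = {}
    for block in sorted(blocks, key=lambda b: -b["priority"]):
        grant = min(block["max_tokens"], budget)
        plan[block["name"]] = grant
        budget -= grant
    return plan

blocks = [
    {"name": "system", "max_tokens": 500, "priority": 1.0},
    {"name": "user_profile", "max_tokens": 200, "priority": 0.9},
    {"name": "conversation", "max_tokens": 7000, "priority": 0.5},
]
print(allocate(blocks, 8000))
# {'system': 500, 'user_profile': 200, 'conversation': 7000}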
Integration Examples
With OpenAI
from openai import OpenAI
from agent_context_manager import ContextManager
client = OpenAI()
manager = ContextManager(max_tokens=8000, model="gpt-4")
manager.add_message({"role": "system", "content": "You are helpful."})
while True:
    user_input = input("You: ")
    manager.add_message({"role": "user", "content": user_input})

    response = client.chat.completions.create(
        model="gpt-4",
        messages=manager.get_context()  # Auto-compressed if needed
    )

    assistant_message = response.choices[0].message.content
    manager.add_message({"role": "assistant", "content": assistant_message})
    print(f"Assistant: {assistant_message}")
With Anthropic
from anthropic import Anthropic
from agent_context_manager import ContextManager
client = Anthropic()
manager = ContextManager(max_tokens=100000, model="claude-3")
# Same pattern works with any provider
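For completeness, a hedged sketch of the call itself. The model name is illustrative, and Anthropic's API takes the system prompt as a separate system parameter, so keep system text out of the message list here:

manager.add_message({"role": "user", "content": "Hello!"})

response = client.messages.create(
    model="claude-3-5-sonnet-latest",  # Illustrative model name
    max_tokens=1024,
    messages=manager.get_context()
)
print(response.content[0].text)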
With LangChain
from agent_context_manager import ContextManager, LangChainAdapter

manager = ContextManager(max_tokens=8000)
memory = LangChainAdapter(manager)  # Drop-in replacement for ConversationBufferMemory
API Reference
ContextManager
| Method / Property | Description |
|---|---|
| `add_message(msg)` | Add a message to the context |
| `get_context()` | Get the compressed context as a message list |
| `token_count` | Current token count |
| `usage_percent` | Percentage of the context window used |
| `is_near_overflow()` | Check whether the context is approaching its limit |
| `compress()` | Manually trigger compression |
| `clear()` | Clear all messages |
Strategies
| Strategy | Best For |
|---|---|
| `SlidingWindowStrategy` | Simple agents, chatbots |
| `ImportanceStrategy` | Complex agents with tool use |
| `DeduplicationStrategy` | Repetitive workflows |
| `HybridStrategy` | Production systems |
Contributing
Contributions welcome! Please read CONTRIBUTING.md first.
License
MIT License - see LICENSE file.
Download files
File details
Details for the file agent_context_manager-0.1.0.tar.gz.
File metadata
- Download URL: agent_context_manager-0.1.0.tar.gz
- Upload date:
- Size: 19.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0bde8c10b47d97624f12f4358e86dd031cb83131b952a41e83fb8b4fd6dbdb83 |
| MD5 | a12b73f41f5fc4029b212f522d418831 |
| BLAKE2b-256 | 4cae4d5f5bc7b6a74b81685c4e40c3c386ce7fb226fce9eeca57109148317df2 |
File details
Details for the file agent_context_manager-0.1.0-py3-none-any.whl.
File metadata
- Download URL: agent_context_manager-0.1.0-py3-none-any.whl
- Upload date:
- Size: 14.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c294b946365ca13d401ebcd8f1eccd257e3589c6b5f91cebe8f520469d77ad57 |
| MD5 | f5e2274d725d5f8f02d978cc1ff6c30f |
| BLAKE2b-256 | 077384a6609abe0608d5dcfd8510d298909e8a1c9d2e52c9b6f865c5336c8422 |