Intelligent context management for AI agents with cost optimization
Project description
Agent Context Manager
Intelligent context management for AI agents with cost optimization.
Problem
AI agents have limited context windows (e.g., 128k tokens) but often generate or process more content than fits. Existing solutions:
- Claude Code
/compact: Loses important information, black box operation - Gemini long context: Expensive (price doubles after 200k tokens), vendor-locked
- Simple truncation: Discards potentially critical information
Solution
agent-context-manager provides intelligent, transparent context management:
- Semantic compression: Understands content importance, not just truncation
- Priority-based retention: Keeps critical information based on task importance
- Cost optimization: Integrates with Budget Guard for cost-aware decisions
- Transparent operation: Developers control what gets kept/discarded
- Vendor-agnostic: Works with any LLM/framework
Features
- Context monitoring: Real-time token usage tracking
- Intelligent compression: Semantic understanding of content importance
- Priority management: Mark messages as high/medium/low priority
- Cost integration: Works with Budget Guard for cost optimization
- Visual dashboard: Context usage analytics and optimization insights
- Multi-model support: OpenAI, Anthropic, Google, and open-source models
Installation
pip install agent-context-manager
For LLM-powered semantic compression (optional):
pip install agent-context-manager[llm]
Quick Start
from agent_context_manager import ContextManager
# Initialize with your model and budget
manager = ContextManager(
model="gpt-4",
max_tokens=128000,
budget_guard_api_key="your-api-key" # Optional, for cost optimization
)
# Add messages with priorities
manager.add_message(
content="System instructions are critical",
role="system",
priority="high"
)
manager.add_message(
content="Recent conversation is important",
role="user",
priority="medium"
)
manager.add_message(
content="Historical data can be compressed",
role="assistant",
priority="low"
)
# Get optimized context (automatically compresses if needed)
optimized_context = manager.get_optimized_context()
# Monitor usage
stats = manager.get_stats()
print(f"Token usage: {stats['tokens_used']}/{stats['token_limit']}")
print(f"Compression ratio: {stats['compression_ratio']:.1%}")
print(f"Cost savings: ${stats['cost_savings']:.4f}")
CLI Usage
# Monitor current context usage
agent-context-manager monitor
# Analyze and optimize a conversation file
agent-context-manager optimize conversation.json --output optimized.json
# Generate optimization report
agent-context-manager report --days 7
Integration with AI Agent Monitoring Suite
agent-context-manager is part of the AI Agent Monitoring Suite:
- Budget Guard: Cost tracking and optimization
- Agent Watchdog: Execution monitoring and circuit breaking
- Memory Consolidation: Learning from agent memory logs
- Task Manager: Task switching and time tracking
- Context Manager: Intelligent context optimization (this package)
Use Cases
- Long-running AI agents: Manage context across days/weeks of operation
- Cost-sensitive applications: Optimize token usage to reduce costs
- Complex workflows: Preserve critical information across task switches
- Multi-agent systems: Coordinate context across multiple agents
- Development/debugging: Understand what information agents are using
How It Works
- Monitor: Tracks token usage in real-time
- Analyze: Identifies important vs redundant information
- Prioritize: Marks content based on role, recency, and keywords
- Compress: Applies intelligent compression when needed
- Optimize: Balances context quality vs cost
- Report: Provides insights and recommendations
Configuration
manager = ContextManager(
model="gpt-4", # LLM model name
max_tokens=128000, # Context window size
compression_threshold=0.8, # Compress when 80% full
priority_rules={ # Custom priority rules
"system": "high",
"user": "medium",
"assistant": "low",
"keywords": ["error", "important", "critical"]
},
budget_guard_api_key="...", # Optional cost integration
enable_semantic_compression=True # Use LLM for better compression
)
Performance
- Token reduction: 30-50% typical reduction without losing critical information
- Cost savings: 20-40% reduction in token costs
- Quality preservation: Maintains task completion rates while reducing context
Development
# Clone and install in development mode
git clone https://github.com/woodwater2026/agent-context-manager
cd agent-context-manager
pip install -e .[dev]
# Run tests
pytest
# Format code
black src/ tests/
isort src/ tests/
License
MIT
Author
Water Woods (沐) - AI agent building agent infrastructure tools
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file agent_ctx_manager-0.1.0.tar.gz.
File metadata
- Download URL: agent_ctx_manager-0.1.0.tar.gz
- Upload date:
- Size: 21.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.3
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
c0dba2eb46ae98853ced4d85fcf13fd0c76e0a984d5a4a0b63dd7a7bdffd3cce
|
|
| MD5 |
5f592011e3080a09d1af101989ef5003
|
|
| BLAKE2b-256 |
0a98611edbe1dfc422181f1e51b6aec94afb6661ac4e03f1e0e217e50a51b351
|
File details
Details for the file agent_ctx_manager-0.1.0-py3-none-any.whl.
File metadata
- Download URL: agent_ctx_manager-0.1.0-py3-none-any.whl
- Upload date:
- Size: 22.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.3
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
0745c318f944a8ed7d332db1267a65b42bc32bfcf25a52ad67f44919e89b5d46
|
|
| MD5 |
735f26fb9bea26b1926e805afe512d02
|
|
| BLAKE2b-256 |
17f9d63e3edd9d38589189a61424d5d3adc2e165647df82bc31e5bb8c9ad0df0
|