Context Reference Store
Efficient Large Context Window Management for AI Agents and Frameworks
Context Reference Store is a high-performance Python library for managing large context windows in agentic AI applications. It provides intelligent caching, compression, and retrieval mechanisms that significantly reduce memory usage and improve response times for AI agents and frameworks.
Key Features
Core Capabilities
- Intelligent Context Caching: LRU, LFU, and TTL-based eviction policies
- Advanced Compression: 625x faster serialization with 99.99% storage reduction
- Async/Await Support: Non-blocking operations for modern applications
- Multimodal Content: Handle text, images, audio, and video efficiently
- High Performance: Sub-100ms retrieval times for large contexts
Framework Integrations
- 🦜 LangChain: Seamless integration with chat and retrieval chains
- 🕸️ LangGraph: Native support for graph-based agent workflows
- 🦙 LlamaIndex: Vector store and query engine implementations
- 🔧 Composio: Tool integration with secure authentication
Advanced Features
- Performance Monitoring: Real-time metrics and dashboard
- Semantic Analysis: Content similarity and clustering
- Token Optimization: Intelligent context window management
- Persistent Storage: Disk-based caching for large datasets
Quick Start
Installation
# Basic installation
pip install context-reference-store
# With framework integrations
pip install context-reference-store[langchain,langgraph,llamaindex]
# Full installation with all features
pip install context-reference-store[full]
Basic Usage
from context_store import ContextReferenceStore
# Initialize the store
store = ContextReferenceStore(cache_size=100)
# Store context content
context_id = store.store("Your long context content here...")
# Retrieve when needed
content = store.retrieve(context_id)
# Get performance statistics
stats = store.get_stats()
print(f"Hit rate: {stats['hit_rate']:.2%}")
Async Operations
import asyncio
from context_store import AsyncContextReferenceStore

async def main():
    async with AsyncContextReferenceStore() as store:
        # Store multiple contexts concurrently
        context_ids = await store.batch_store_async([
            "Context 1", "Context 2", "Context 3"
        ])
        # Retrieve all at once
        contents = await store.batch_retrieve_async(context_ids)

asyncio.run(main())
Multimodal Content
from context_store import MultimodalContent, MultimodalPart
# Create multimodal content
text_part = MultimodalPart.from_text("Describe this image:")
image_part = MultimodalPart.from_file("path/to/image.jpg")
content = MultimodalContent(parts=[text_part, image_part])
# Store and retrieve
context_id = store.store_multimodal_content(content)
retrieved = store.retrieve_multimodal_content(context_id)
Framework Integration Examples
LangChain Integration
from context_store.adapters import LangChainAdapter
from langchain.schema import HumanMessage, AIMessage
adapter = LangChainAdapter()
# Store conversation
messages = [
    HumanMessage(content="Hello!"),
    AIMessage(content="Hi there!")
]
session_id = adapter.store_messages(messages, session_id="chat_1")
# Retrieve conversation
retrieved_messages = adapter.retrieve_messages(session_id)
LangGraph Integration
from context_store.adapters import LangGraphAdapter
adapter = LangGraphAdapter()
# Store graph state
state = {"current_step": "analysis", "data": {...}}
state_id = adapter.store_graph_state(state, graph_id="workflow_1")
# Retrieve and continue workflow
restored_state = adapter.retrieve_graph_state(state_id)
LlamaIndex Integration
from context_store.adapters import LlamaIndexAdapter
from llama_index import Document
adapter = LlamaIndexAdapter()
# Store documents with embeddings
docs = [Document(text="Document content...")]
adapter.store_documents(docs, collection="my_docs")
# Query with vector similarity
results = adapter.query("Find similar content", collection="my_docs")
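Composio Integration
Composio is listed among the framework integrations but no example appears above. The snippet below is a minimal sketch only: it assumes a ComposioAdapter that follows the same store/retrieve pattern as the other adapters, and the store_tool_result / retrieve_tool_result names are illustrative rather than confirmed API, so check the adapters module before relying on them.
from context_store.adapters import ComposioAdapter
adapter = ComposioAdapter()
# Hypothetical: cache a tool call result so later agent turns can reuse it
result_id = adapter.store_tool_result(
    tool_name="github_create_issue",
    result={"issue_url": "https://github.com/org/repo/issues/1"},
    session_id="agent_run_1"
)
# Hypothetical: fetch the cached tool result when the agent needs it again
cached_result = adapter.retrieve_tool_result(result_id)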
Performance Benchmarks
Our benchmarks show significant improvements over standard approaches:
| Metric | Standard | Context Store | Improvement |
|---|---|---|---|
| Serialization Speed | 2.5s | 4ms | 625x faster |
| Memory Usage | 1.2GB | 24MB | 49x reduction |
| Storage Size | 450MB | 900KB | 99.8% smaller |
| Retrieval Time | 250ms | 15ms | 16x faster |
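Results like these depend on content size and hardware. As a rough sanity check on your own data, you can time a store/retrieve round trip with the public API shown in the Quick Start; the sketch below uses only store(), retrieve(), and get_stats() as introduced earlier.
import time
from context_store import ContextReferenceStore

store = ContextReferenceStore(cache_size=100)
payload = "x" * 1_000_000  # roughly 1 MB of text to exercise the store

start = time.perf_counter()
context_id = store.store(payload)
store_ms = (time.perf_counter() - start) * 1000

start = time.perf_counter()
content = store.retrieve(context_id)
retrieve_ms = (time.perf_counter() - start) * 1000

print(f"store: {store_ms:.2f} ms, retrieve: {retrieve_ms:.2f} ms")
print(store.get_stats())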
Configuration Options
Cache Policies
from context_store import CacheEvictionPolicy
# LRU (Least Recently Used)
store = ContextReferenceStore(
    cache_size=100,
    eviction_policy=CacheEvictionPolicy.LRU
)

# LFU (Least Frequently Used)
store = ContextReferenceStore(
    eviction_policy=CacheEvictionPolicy.LFU
)

# TTL (Time To Live)
store = ContextReferenceStore(
    eviction_policy=CacheEvictionPolicy.TTL,
    ttl_seconds=3600  # 1 hour
)
Compression Settings
# Enable compression for better storage efficiency
store = ContextReferenceStore(
    use_compression=True,
    compression_algorithm="lz4",  # or "zstd"
    compression_level=3
)
Storage Configuration
# Configure disk storage for large datasets
store = ContextReferenceStore(
    use_disk_storage=True,
    disk_cache_dir="/path/to/cache",
    memory_threshold_mb=500
)
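These options can be combined on a single store. The example below is a sketch that simply merges the parameters shown above into one constructor call; whether every combination is supported should be confirmed against the package documentation.
from context_store import ContextReferenceStore, CacheEvictionPolicy

# Sketch: combine the cache, compression, and disk options from the
# examples above; parameter names are taken from those examples.
store = ContextReferenceStore(
    cache_size=500,
    eviction_policy=CacheEvictionPolicy.LRU,
    use_compression=True,
    compression_algorithm="zstd",
    compression_level=3,
    use_disk_storage=True,
    disk_cache_dir="/path/to/cache",
    memory_threshold_mb=500
)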
Monitoring and Analytics
Real-time Dashboard
from context_store.monitoring import TUIDashboard
# Launch interactive dashboard
dashboard = TUIDashboard(store)
dashboard.run() # Opens in terminal
Performance Metrics
# Get detailed statistics
stats = store.get_detailed_stats()
print(f"""
Performance Metrics:
- Cache Hit Rate: {stats['hit_rate']:.2%}
- Average Retrieval Time: {stats['avg_retrieval_time_ms']}ms
- Memory Usage: {stats['memory_usage_mb']}MB
- Compression Ratio: {stats['compression_ratio']:.2f}x
""")
Custom Monitoring
from context_store.monitoring import PerformanceMonitor
monitor = PerformanceMonitor()
store.add_monitor(monitor)
# Access real-time metrics
print(monitor.get_current_metrics())
Advanced Features
Semantic Analysis
from context_store.semantic import SemanticAnalyzer
analyzer = SemanticAnalyzer(store)
# Find similar contexts
similar = analyzer.find_similar_contexts(
    "query text",
    threshold=0.8,
    limit=5
)
# Cluster related contexts
clusters = analyzer.cluster_contexts(method="kmeans", n_clusters=5)
Token Optimization
from context_store.optimization import TokenManager
token_manager = TokenManager(store)
# Optimize context for token limits
optimized = token_manager.optimize_context(
    context_id,
    max_tokens=4000,
    strategy="importance_ranking"
)
Development
Installation for Development
git clone https://github.com/adewaleadenle/context-reference-store.git
cd context-reference-store
pip install -e ".[dev]"
Running Tests
# Run all tests
pytest
# Run with coverage
pytest --cov=context_store
# Run performance benchmarks
pytest -m benchmark
Code Quality
# Format code
black .
isort .
# Lint code
flake8 context_store/
mypy context_store/
Optional Dependencies
The library supports various optional dependencies for enhanced functionality:
# Framework integrations
pip install context-reference-store[langchain] # LangChain support
pip install context-reference-store[langgraph] # LangGraph support
pip install context-reference-store[llamaindex] # LlamaIndex support
pip install context-reference-store[composio] # Composio support
# Performance enhancements
pip install context-reference-store[compression] # Advanced compression
pip install context-reference-store[async] # Async optimizations
# Development tools
pip install context-reference-store[dev] # Testing and linting
pip install context-reference-store[docs] # Documentation tools
# Everything included
pip install context-reference-store[full] # All features
🤝 Contributing
We welcome contributions! Please see our Contributing Guide for details.
Quick Contribution Steps
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
📄 License
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
🙏 Acknowledgments
- Built for Google Summer of Code 2025 with Google DeepMind
- Inspired by the need for efficient context management in modern AI applications
- Thanks to the open-source AI community for feedback and contributions
📞 Support
- Documentation: https://context-reference-store.readthedocs.io/
- Issues: GitHub Issues
- Discussions: GitHub Discussions
Made with ❤️ for the AI community