Context window optimization for AI agents. Zero dependencies.

These details have not been verified by PyPI

Project links

Project description

antaris-context

Zero-dependency context window optimization for AI agents.

Manage context windows, token budgets, and message compression without external dependencies. Built for production AI agent systems that need deterministic, configurable context management.

Install

pip install antaris-context

Requirements: Python 3.9+, no dependencies.

Quick Start

from antaris_context import ContextManager

# Initialize with 8K token budget
manager = ContextManager(total_budget=8000)

# Set section budgets
manager.set_section_budget('system', 1000)
manager.set_section_budget('memory', 2000) 
manager.set_section_budget('conversation', 4000)
manager.set_section_budget('tools', 1000)

# Add content with priorities
manager.add_content('system', "You are a helpful assistant.", priority='critical')
manager.add_content('memory', "User prefers concise responses.", priority='important')

# Add conversation messages with automatic compression and selection
messages = [
    {'role': 'user', 'content': 'What is Python?'},
    {'role': 'assistant', 'content': 'Python is a programming language...'},
    # ... more messages
]
manager.add_content('conversation', messages, priority='normal')

# Check usage
report = manager.get_usage_report()
print(f"Used: {report['total_used']}/{report['total_budget']} tokens")
print(f"Utilization: {report['utilization']:.1%}")

# Optimize context window
optimization = manager.optimize_context(target_utilization=0.85)
print(f"Optimization successful: {optimization['success']}")

Core Components

ContextManager

The main orchestrator. Manages budgets, applies strategies, and coordinates all optimization.

from antaris_context import ContextManager

# Initialize with configuration file
manager = ContextManager(
    total_budget=8000,
    config_file='context_config.json'
)

# Add content with automatic strategy selection
manager.add_content('conversation', messages, priority='normal')

# Analyze and optimize
analysis = manager.analyze_context()
print(f"Efficiency score: {analysis['efficiency_score']:.2f}")

# Get optimization suggestions
for suggestion in analysis['optimization_suggestions']:
    print(f"- {suggestion['description']}")

Content Selection Strategies

Choose what content to include when space is limited:

# Recency strategy - newest first
manager.set_strategy('recency', prefer_high_priority=True)

# Relevance strategy - keyword matching
manager.set_strategy('relevance', min_score=0.2, case_sensitive=False)

# Hybrid strategy - combine recency and relevance
manager.set_strategy('hybrid', recency_weight=0.4, relevance_weight=0.6)

# Budget strategy - maximize value per token
manager.set_strategy('budget', approach='balanced')

# Use with query context
selected = manager.add_content('conversation', messages, query="Tell me about Python")

Message Compression

Reduce token usage while preserving meaning:

from antaris_context import MessageCompressor

# Configure compression levels
compressor = MessageCompressor('moderate')  # light, moderate, aggressive

# Compress individual messages
compressed = compressor.compress("This    has   lots  of\n\n\nwhitespace")
# Result: "This has lots of whitespace"

# Compress message lists
messages = [
    {'role': 'user', 'content': 'Very long message...'},
    {'role': 'tool', 'content': '500 lines of output...'}
]
compressed_msgs = compressor.compress_message_list(messages, max_content_length=500)

# Tool output compression (keep first/last N lines)
output = compressor.compress_tool_output(long_output, max_lines=20, keep_first=10, keep_last=10)

# Get compression stats
stats = compressor.get_compression_stats()
print(f"Saved {stats['bytes_saved']} bytes ({stats['compression_ratio']:.1%})")

Context Analysis

Understand usage patterns and get optimization advice:

from antaris_context import ContextProfiler

profiler = ContextProfiler(log_file='context_analysis.jsonl')
analysis = profiler.analyze_window(manager.window)

# Section analysis
for section, data in analysis['section_analysis'].items():
    print(f"{section}: {data['utilization']:.1%} utilized, {data['status']}")

# Waste detection
waste = analysis['waste_detection']
print(f"Found {len(waste['waste_items'])} waste sources")
print(f"Total waste: {waste['total_waste_tokens']} tokens")

# Budget reallocation suggestions
suggestions = profiler.suggest_budget_reallocation(manager.window)
for section, budget in suggestions['suggested_budgets'].items():
    print(f"{section}: {budget} tokens (was {suggestions['current_budgets'][section]})")

# Historical trends
trends = profiler.get_historical_trends(days=7)
print(f"Efficiency trend: {trends['efficiency_trend']['current']:.2f}")

Configuration

Use JSON files for persistent configuration:

{
  "compression_level": "moderate",
  "strategy": "hybrid",
  "strategy_params": {
    "recency_weight": 0.4,
    "relevance_weight": 0.6
  },
  "section_budgets": {
    "system": 1000,
    "memory": 2000,
    "conversation": 4000,
    "tools": 1000
  },
  "truncation_strategy": "oldest_first",
  "auto_compress": true,
  "profiler_log_file": "profiler.jsonl"
}

# Load configuration
manager = ContextManager(config_file='config.json')

# Modify and save
manager.set_compression_level('aggressive')
manager.save_config('updated_config.json')

Priority System

Content is prioritized for inclusion:

critical: Never truncated, always included (system prompts, safety constraints)
important: High priority, removed only when necessary (recent context, user preferences)
normal: Standard priority, balanced selection (conversation history)
optional: First to be removed when space is needed (old messages, verbose outputs)

# Add content with priorities
manager.add_content('system', 'Safety: Never generate harmful content', priority='critical')
manager.add_content('memory', 'User likes Python examples', priority='important')
manager.add_content('conversation', 'How do I use decorators?', priority='normal')
manager.add_content('tools', 'Debug output: verbose trace...', priority='optional')

# During truncation, optional content is removed first, critical content never

Truncation Strategies

When content exceeds budget, different strategies decide what to remove:

# Oldest first (default)
manager.config['truncation_strategy'] = 'oldest_first'

# Lowest priority first
manager.config['truncation_strategy'] = 'lowest_priority'

# Smart summary preservation
manager.config['truncation_strategy'] = 'smart_summary_markers'

Token Estimation

Uses character-based approximation (~4 characters per token):

from antaris_context import ContextWindow

window = ContextWindow()
tokens = window._estimate_tokens("Hello world")  # ~3 tokens

# This is an approximation for efficiency
# Real token counts vary by model and tokenizer

Real-World Example

Complete agent context management:

import json
from antaris_context import ContextManager

# Initialize agent context
manager = ContextManager(total_budget=8000)
manager.set_section_budget('system', 800)
manager.set_section_budget('memory', 1200) 
manager.set_section_budget('conversation', 5000)
manager.set_section_budget('tools', 1000)

# Set hybrid strategy for balanced selection
manager.set_strategy('hybrid', recency_weight=0.3, relevance_weight=0.7)

# Add system prompt
system_prompt = """You are a coding assistant. 
Rules:
- Always provide working examples
- Explain complex concepts simply
- Ask clarifying questions when needed"""

manager.add_content('system', system_prompt, priority='critical')

# Add user memory/preferences
memories = [
    "User is learning Python",
    "Prefers concise explanations", 
    "Working on web development project"
]
for memory in memories:
    manager.add_content('memory', memory, priority='important')

# Add conversation history (will be selected by strategy)
conversation = [
    {'role': 'user', 'content': 'How do I create a web API in Python?'},
    {'role': 'assistant', 'content': 'You can use Flask or FastAPI. Here\'s a Flask example:\n\n```python\nfrom flask import Flask\napp = Flask(__name__)\n\n@app.route("/api/hello")\ndef hello():\n    return {"message": "Hello World"}\n\nif __name__ == "__main__":\n    app.run(debug=True)\n```'},
    {'role': 'user', 'content': 'What about authentication?'},
    # ... more messages
]

# Add with query context for relevance scoring
current_query = "How do I add JWT authentication to my Flask API?"
manager.add_content('conversation', conversation, query=current_query)

# Add tool outputs
tool_output = """
Flask-JWT-Extended installed successfully
Dependencies: PyJWT, Flask, Werkzeug
Configuration options:
- JWT_SECRET_KEY: Required
- JWT_ACCESS_TOKEN_EXPIRES: Optional, defaults to 15 minutes  
- JWT_REFRESH_TOKEN_EXPIRES: Optional, defaults to 30 days
"""

manager.add_content('tools', tool_output, priority='normal')

# Optimize for target utilization
optimization = manager.optimize_context(
    query=current_query, 
    target_utilization=0.85
)

print(f"Optimization successful: {optimization['success']}")
print(f"Actions taken: {optimization['actions_taken']}")

# Get final usage report
report = manager.get_usage_report()
print(f"\nFinal utilization: {report['utilization']:.1%}")
print(f"Sections:")
for section, data in report['sections'].items():
    print(f"  {section}: {data['used']}/{data['budget']} tokens ({data['utilization']:.1%})")

# Analyze for insights
analysis = manager.analyze_context()
print(f"\nEfficiency score: {analysis['efficiency_score']:.2f}")
print("Optimization suggestions:")
for suggestion in analysis['optimization_suggestions']:
    print(f"  - {suggestion['description']}")

# Export state for persistence
state = manager.export_state()
with open('agent_context_state.json', 'w') as f:
    f.write(state)

What It Doesn't Do

antaris-context is focused and honest about its limitations:

No actual tokenization: Uses character-based approximation (~4 chars/token). For exact counts, integrate with your model's tokenizer.
No LLM calls: Purely deterministic processing. Relevance scoring uses simple keyword matching, not semantic similarity.
No content generation: Won't summarize or rewrite content. It selects, compresses, and truncates existing content only.
No model-specific optimization: Token estimates work generally but aren't tuned for specific models (GPT-4, Claude, etc).
No automatic learning: Doesn't adapt strategies based on usage patterns. Configuration is explicit and static.
No distributed contexts: Manages single context windows. For multi-agent or distributed scenarios, use multiple managers.
Limited compression: Focuses on whitespace and structural compression, not semantic compression or paraphrasing.

This is intentional. The library does one thing well: deterministic context window management with configurable strategies.

Design Philosophy

Built on principles proven by the antaris-* suite:

Zero dependencies: Only Python stdlib. No version conflicts, minimal security surface.
File-based config: JSON configuration for reproducible behavior across environments.
Deterministic: Same inputs always produce same outputs. No randomness, no API calls.
Honest limitations: Clear about what it does and doesn't do. No overselling.
Production-ready: Designed for real agent systems, not demos or experiments.

Performance

Rough benchmarks on modern hardware:

Token estimation: ~100K characters/second
Message compression: ~50K characters/second
Strategy selection: ~10K messages/second
Context analysis: ~1K content items/second

Memory usage scales linearly with content size. No significant overhead for large contexts.

Comparison

Similar libraries and how antaris-context differs:

Library	Dependencies	Config	Deterministic	Token Counting	Strategies
antaris-context	✅ None	✅ JSON files	✅ Yes	⚠️ Approximation	✅ Pluggable
tiktoken	✅ Minimal	❌ Code only	✅ Yes	✅ Exact	❌ None
langchain	❌ Heavy	❌ Code only	❌ No	✅ Via tiktoken	⚠️ Limited
guidance	❌ Heavy	❌ Code only	⚠️ Partial	✅ Via transformers	❌ None

Choose antaris-context when you need:

Zero-dependency deployment
File-based configuration
Deterministic behavior
Production-ready context management
Pluggable selection strategies

Choose alternatives when you need:

Exact tokenization for specific models
Semantic similarity (use embeddings)
Content summarization (use LLMs)
Complex multi-modal contexts

Contributing

This library is part of the antaris-* suite. Contributions welcome:

Keep zero dependencies
Maintain deterministic behavior
Add tests for new features
Update documentation
Follow existing code style

License

Apache 2.0 - see LICENSE file.

Changelog

See CHANGELOG.md for version history.

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

5.0.1

Mar 10, 2026

4.9.20

Mar 8, 2026

4.9.18

Mar 7, 2026

4.9.17

Mar 7, 2026

4.9.16

Mar 6, 2026

4.9.15

Mar 6, 2026

4.9.14

Mar 5, 2026

4.9.13

Mar 5, 2026

4.9.12

Mar 5, 2026

4.9.11

Mar 5, 2026

4.9.10

Mar 4, 2026

4.9.5

Mar 3, 2026

4.9.4

Mar 3, 2026

4.9.3

Mar 3, 2026

4.9.2

Mar 3, 2026

4.9.1

Mar 3, 2026

4.9.0

Mar 3, 2026

4.8.0

Mar 3, 2026

4.7.1

Mar 3, 2026

4.7.0

Mar 3, 2026

4.6.8

Mar 2, 2026

4.6.6

Mar 2, 2026

4.6.5

Mar 2, 2026

4.6.0

Mar 2, 2026

4.5.3

Mar 1, 2026

4.5.2

Mar 1, 2026

4.2.0

Feb 27, 2026

4.1.0

Feb 26, 2026

4.0.0

Feb 23, 2026

3.1.0

Feb 21, 2026

3.0.0

Feb 21, 2026

2.2.0

Feb 21, 2026

2.1.1

Feb 20, 2026

2.0.0

Feb 19, 2026

1.1.0

Feb 17, 2026

This version

1.0.1

Feb 17, 2026

1.0.0

Feb 23, 2026

0.1.0

Feb 16, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

antaris_context-1.0.1.tar.gz (41.9 kB view details)

Uploaded Feb 17, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

antaris_context-1.0.1-py3-none-any.whl (33.5 kB view details)

Uploaded Feb 17, 2026 Python 3

File details

Details for the file antaris_context-1.0.1.tar.gz.

File metadata

Download URL: antaris_context-1.0.1.tar.gz
Upload date: Feb 17, 2026
Size: 41.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for antaris_context-1.0.1.tar.gz
Algorithm	Hash digest
SHA256	`483830405919acba89c3c645980da6db0aeac7f37a27103e0a0ed8f699ddcf35`
MD5	`248b04f452e730128418cff756927330`
BLAKE2b-256	`0d256d3c303ce32174d78ec2ba452401c16cb9a3212c78ef06028ab1ba2c9704`

See more details on using hashes here.

File details

Details for the file antaris_context-1.0.1-py3-none-any.whl.

File metadata

Download URL: antaris_context-1.0.1-py3-none-any.whl
Upload date: Feb 17, 2026
Size: 33.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for antaris_context-1.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`3e0a7cfb7d11677d1027ad21755b95d9b5e6f87d89614f45a3d18f866ecc6bbe`
MD5	`dc2b9252740e294cf3f09027a47db4ea`
BLAKE2b-256	`67dd1a0ef6c7a38ee04f8ccdbb9ace7eb638b2e202f8ae924dcaf597a3d2ad38`

See more details on using hashes here.

antaris-context 1.0.1

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

antaris-context

Install

Quick Start

Core Components

ContextManager

Content Selection Strategies

Message Compression

Context Analysis

Configuration

Priority System

Truncation Strategies

Token Estimation

Real-World Example

What It Doesn't Do

Design Philosophy

Performance

Comparison

Contributing

License

Changelog

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes