# ContextPilot
Python middleware that compresses and optimizes LLM context before each API call — lower token costs for apps and agents, with quality scoring and safe fallback. Wraps OpenAI, Anthropic, and Google SDKs with minimal code changes.
## Integration Surfaces

ContextPilot deploys wherever you write LLM-powered code. All surfaces share the same compression engine.

| Surface | Entry point | Works with |
|---|---|---|
| Python library | `pip install contextpilot` | Any Python backend |
| Local proxy | `contextpilot proxy --port 8432` | Claude Code, GPT Codex, Aider — any tool with a custom base URL |
| MCP server | `contextpilot mcp` | Claude Desktop, Claude Code |
| CLI migration | `contextpilot migrate ./src/` | Existing codebases — wraps all LLM calls automatically |
## Quick Start

```shell
pip install contextpilot
```

```python
# OpenAI
import contextpilot
from openai import OpenAI

client = OpenAI()
pilot = contextpilot.wrap(client)

messages = [{"role": "user", "content": "Summarize this thread."}]  # your conversation history
response = pilot.chat.completions.create(
    model="gpt-4o",
    messages=messages,  # compressed transparently
)
```

```python
# Anthropic
import contextpilot
from anthropic import Anthropic

client = Anthropic()
pilot = contextpilot.wrap(client)

messages = [{"role": "user", "content": "Summarize this thread."}]  # your conversation history
response = pilot.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=messages,
)
```
## Proxy for AI Coding Tools

Route Claude Code, GPT Codex, or Aider through the compression middleware by setting one environment variable:

```shell
contextpilot proxy --port 8432
export ANTHROPIC_BASE_URL=http://localhost:8432
```

Every subsequent AI coding prompt is now compressed. The coding assistant behaves identically — just uses fewer tokens.
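Conceptually, a proxy like this rewrites each request body before relaying it upstream. A minimal sketch of that interception step using only the standard library; the `compress_messages` heuristic is illustrative, not ContextPilot's actual algorithm, and this sketch echoes the rewritten body back rather than forwarding it:

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

def compress_messages(messages, keep_last=6):
    """Toy compression: keep system messages plus the last N turns."""
    head = [m for m in messages if m.get("role") == "system"]
    tail = [m for m in messages if m.get("role") != "system"][-keep_last:]
    return head + tail

class CompressingProxy(BaseHTTPRequestHandler):
    def do_POST(self):
        body = json.loads(self.rfile.read(int(self.headers["Content-Length"])))
        if "messages" in body:
            body["messages"] = compress_messages(body["messages"])
        # A real proxy would forward the rewritten body to the upstream
        # API here; this sketch just returns it to the caller.
        payload = json.dumps(body).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

# To run: ThreadingHTTPServer(("localhost", 8432), CompressingProxy).serve_forever()
```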
## MCP Server

```shell
contextpilot mcp
```

Connects to Claude Desktop and Claude Code. Exposes the `optimize_context`, `get_savings`, and `suggest_config` tools. Claude automatically applies compression when context is large — no workflow changes required.
## CLI Migration

```shell
contextpilot migrate ./src/ --dry-run   # preview changes
contextpilot migrate ./src/ --apply     # refactor in place
```

Uses AST parsing (not regex) to safely find and wrap all LLM API calls in an existing codebase. Designed for codebases with 50+ LLM calls, where manual refactoring is prohibitive.
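AST-based call discovery of this kind can be sketched with the standard library's `ast` module. The constructor list and function below are illustrative assumptions, not ContextPilot's internals:

```python
import ast

# Constructor names to look for; illustrative, not ContextPilot's actual list.
LLM_CLIENTS = {"OpenAI", "Anthropic"}

def find_llm_clients(source: str):
    """Return (line, name) pairs for LLM client instantiations, found by
    walking the syntax tree instead of pattern-matching raw text."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            fn = node.func
            name = fn.id if isinstance(fn, ast.Name) else getattr(fn, "attr", None)
            if name in LLM_CLIENTS:
                hits.append((node.lineno, name))
    return hits

src = "from openai import OpenAI\nclient = OpenAI()\n"
print(find_llm_clients(src))  # [(2, 'OpenAI')]
```

Unlike a regex, the syntax tree distinguishes a real call from, say, the same name appearing in a string or a comment.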
## Configuration

```yaml
# contextpilot.yaml
compression:
  level: balanced           # conservative | balanced | aggressive
  quality_threshold: 85     # fall back to uncompressed if the score drops below this
  history_window: 6         # keep the last N conversation turns verbatim
  rag_relevance_min: 0.15   # drop RAG chunks below this relevance score

shadow_testing:
  enabled: true
  sample_rate: 0.05         # 5% of calls sent both compressed and uncompressed

telemetry:
  enabled: true
  endpoint: https://api.contextpilot.dev/v1/telemetry
  api_key: ${CONTEXTPILOT_API_KEY}
```
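The `quality_threshold` and `sample_rate` settings imply simple runtime decisions, which can be sketched as follows; the function names are hypothetical, not ContextPilot's API:

```python
import random

def select_context(compressed, original, quality_score, quality_threshold=85):
    """Fall back to the uncompressed context when the quality score
    drops below the configured threshold."""
    return compressed if quality_score >= quality_threshold else original

def should_shadow_test(sample_rate=0.05, rng=random.random):
    """Decide whether this call is also sent uncompressed for comparison."""
    return rng() < sample_rate

select_context("short ctx", "full ctx", quality_score=91)  # -> "short ctx"
select_context("short ctx", "full ctx", quality_score=72)  # -> "full ctx"
```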
## Agent Memory Middleware

For multi-agent workflows (LangChain, CrewAI, AutoGen), compress inter-agent handoffs that would otherwise multiply tokens 5–30x:

```python
from contextpilot.middleware import AgentMemory

memory = AgentMemory(
    compression_level="aggressive",
    preserve_keys=["final_answer", "tool_outputs"],
)

agent_a_output = agent_a.run(task)
compressed = memory.compress_handoff(agent_a_output)
agent_b_output = agent_b.run(task, context=compressed)
```
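The `preserve_keys` semantics can be pictured with a toy version of the handoff step: preserved keys pass through verbatim while everything else is shortened. This stand-in is hypothetical, not the library's actual logic:

```python
def compress_handoff(payload: dict, preserve_keys, max_chars=400):
    """Illustrative: keep preserved keys verbatim, truncate the rest."""
    out = {}
    for key, value in payload.items():
        if key in preserve_keys:
            out[key] = value  # preserved exactly, never truncated
        else:
            text = str(value)
            out[key] = text[:max_chars] + ("…" if len(text) > max_chars else "")
    return out

payload = {"final_answer": "42", "scratch_pad": "x" * 1000}
out = compress_handoff(payload, preserve_keys={"final_answer"}, max_chars=100)
print(out["final_answer"], len(out["scratch_pad"]))  # 42 101
```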
## Privacy
Telemetry transmits numerical metadata only — token counts, latency, quality scores, model IDs, timestamps. No prompt content, no response content, no PII ever leaves your environment.
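The shape of such a record can be illustrated with a hypothetical example; the field names are assumptions, but the categories match those listed above (counts, scores, identifiers, timestamps):

```python
# Hypothetical telemetry record: numerical metadata and identifiers only.
record = {
    "model": "gpt-4o",
    "tokens_before": 12840,
    "tokens_after": 5390,
    "quality_score": 93,
    "latency_ms": 412,
    "timestamp": "2025-01-01T00:00:00Z",
}

# Nothing resembling prompt or response text is present.
assert not any(k in record for k in ("prompt", "messages", "response"))
```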
## License

MIT
## File details

### Source distribution: contextpilot_ai-0.1.0.tar.gz

- Size: 89.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.3

| Algorithm | Hash digest |
|---|---|
| SHA256 | `e6cf34f8a44ce6719221838415cb0561c6925538a9f4b1aceb04f62d7dc9039f` |
| MD5 | `00e87ec943d3953be0fe73d6480671e8` |
| BLAKE2b-256 | `a4e9a03fe7694b984a2945270bde044e2c7a61824afe72dbebcaa5d009131653` |

### Built distribution: contextpilot_ai-0.1.0-py3-none-any.whl

- Size: 28.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.3

| Algorithm | Hash digest |
|---|---|
| SHA256 | `ba9a56a5a96c488132f29eb80f21c001982f6c2d16d649cef03d95369d421cc3` |
| MD5 | `ba12aeefa1e3ce4997517b12ca1e905f` |
| BLAKE2b-256 | `90b8b9ba6a102761087fe8215d93bcc3c87be7a65f32f2a32c24b8d6a211732b` |