
ContextPilot

Python middleware that compresses and optimizes LLM context before each API call — lower token costs for apps and agents, with quality scoring and safe fallback. Wraps OpenAI, Anthropic, and Google SDKs with minimal code changes.

Integration Surfaces

ContextPilot deploys wherever you write LLM-powered code. All surfaces share the same compression engine.

Surface        | Entry point                    | Works with
Python library | pip install contextpilot       | Any Python backend
Local proxy    | contextpilot proxy --port 8432 | Claude Code, GPT Codex, Aider — any tool with a custom base URL
MCP server     | contextpilot mcp               | Claude Desktop, Claude Code
CLI migration  | contextpilot migrate ./src/    | Existing codebases — wraps all LLM calls automatically

Quick Start

pip install contextpilot

# OpenAI
import contextpilot
from openai import OpenAI

client = OpenAI()
pilot = contextpilot.wrap(client)

messages = [{"role": "user", "content": "Summarize the discussion so far."}]
response = pilot.chat.completions.create(
    model="gpt-4o",
    messages=messages,  # compressed transparently
)

# Anthropic
from anthropic import Anthropic

client = Anthropic()
pilot = contextpilot.wrap(client)

response = pilot.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=messages,  # same messages, compressed the same way
)
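
Because wrap returns a drop-in proxy over the underlying SDK client, existing call sites don't change; only the construction line does. A minimal sketch of that property, assuming nothing beyond the documented wrap call (the ask helper is illustrative, not part of the package):

import contextpilot
from openai import OpenAI

def ask(client, prompt: str) -> str:
    # Accepts either a raw OpenAI client or a wrapped one --
    # both expose the same chat.completions interface.
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

plain = ask(OpenAI(), "ping")                       # uncompressed
piloted = ask(contextpilot.wrap(OpenAI()), "ping")  # compressed transparently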

Proxy for AI Coding Tools

Route Claude Code, GPT Codex, or Aider through the compression middleware by starting the local proxy and pointing the tool at it with a single environment variable:

contextpilot proxy --port 8432
export ANTHROPIC_BASE_URL=http://localhost:8432

Every subsequent AI coding prompt is now compressed. The coding assistant behaves identically — just uses fewer tokens.
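
The export above covers Anthropic-based tools such as Claude Code. For tools that speak the OpenAI API instead (Aider, for example), the analogous variable would be OPENAI_BASE_URL; note this assumes the proxy also exposes OpenAI-compatible routes on the same port, which the surfaces table implies but these docs don't state outright:

export OPENAI_BASE_URL=http://localhost:8432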

MCP Server

contextpilot mcp

Connects to Claude Desktop and Claude Code. Exposes optimize_context, get_savings, and suggest_config tools. Claude automatically applies compression when context is large — no workflow changes required.
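
For Claude Desktop, registration follows the standard MCP server format in claude_desktop_config.json. The entry below is a sketch of that generic format rather than something quoted from the package docs; the "contextpilot" key is an arbitrary server name:

{
  "mcpServers": {
    "contextpilot": {
      "command": "contextpilot",
      "args": ["mcp"]
    }
  }
}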

CLI Migration

contextpilot migrate ./src/ --dry-run   # preview changes
contextpilot migrate ./src/ --apply     # refactor in place

Uses AST parsing (not regex) to safely find and wrap all LLM API calls in an existing codebase. Designed for codebases with 50+ LLM calls where manual refactoring is prohibitive.
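
The rewrite itself is mechanical. As an illustration of what a migrated call site plausibly looks like, extrapolated from the wrap API above (an assumption, not captured tool output):

# before
from openai import OpenAI
client = OpenAI()

# after: every call through this client is now compressed
import contextpilot
from openai import OpenAI
client = contextpilot.wrap(OpenAI())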

Configuration

# contextpilot.yaml
compression:
  level: balanced          # conservative | balanced | aggressive
  quality_threshold: 85    # fall back to uncompressed if score drops below this
  history_window: 6        # keep last N conversation turns verbatim
  rag_relevance_min: 0.15  # drop RAG chunks below this relevance score

shadow_testing:
  enabled: true
  sample_rate: 0.05        # 5% of calls sent both compressed and uncompressed

telemetry:
  enabled: true
  endpoint: https://api.contextpilot.dev/v1/telemetry
  api_key: ${CONTEXTPILOT_API_KEY}
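
Conceptually, quality_threshold gates every call: if the engine's quality score for a compressed candidate falls below the threshold, the original context is sent instead. A sketch of that decision in plain Python, with compress and score_quality as hypothetical stand-ins for internals these docs don't show:

QUALITY_THRESHOLD = 85  # mirrors contextpilot.yaml above

def choose_context(messages, compress, score_quality):
    """Send the compressed context only if it clears the quality bar."""
    candidate = compress(messages)
    if score_quality(messages, candidate) >= QUALITY_THRESHOLD:
        return candidate  # smaller context, quality preserved
    return messages       # safe fallback: uncompressed original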

Agent Memory Middleware

For multi-agent workflows (LangChain, CrewAI, AutoGen), compress inter-agent handoffs that would otherwise multiply tokens 5–30x:

from contextpilot.middleware import AgentMemory

memory = AgentMemory(
    compression_level="aggressive",
    preserve_keys=["final_answer", "tool_outputs"],  # fields passed through verbatim
)

# agent_a, agent_b, and task come from your agent framework
# (LangChain, CrewAI, AutoGen, ...)
agent_a_output = agent_a.run(task)
compressed = memory.compress_handoff(agent_a_output)  # shrink everything not preserved
agent_b_output = agent_b.run(task, context=compressed)

Privacy

Telemetry transmits numerical metadata only — token counts, latency, quality scores, model IDs, timestamps. No prompt content, no response content, no PII ever leaves your environment.
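
Concretely, a telemetry record is limited to fields like the following. The key names are illustrative; the field list comes from the sentence above, the schema does not:

# Illustrative telemetry record -- numeric metadata and IDs only.
record = {
    "tokens_in": 8123,
    "tokens_out": 2047,
    "latency_ms": 412,
    "quality_score": 91,
    "model": "gpt-4o",
    "timestamp": 1766140800,
}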

License

MIT

Download files


Source Distribution

contextpilot_ai-0.1.0.tar.gz (89.8 kB)

Built Distribution

contextpilot_ai-0.1.0-py3-none-any.whl (28.0 kB)

File details

Details for the file contextpilot_ai-0.1.0.tar.gz.

File metadata

  • Download URL: contextpilot_ai-0.1.0.tar.gz
  • Upload date:
  • Size: 89.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for contextpilot_ai-0.1.0.tar.gz:

Algorithm   | Hash digest
SHA256      | e6cf34f8a44ce6719221838415cb0561c6925538a9f4b1aceb04f62d7dc9039f
MD5         | 00e87ec943d3953be0fe73d6480671e8
BLAKE2b-256 | a4e9a03fe7694b984a2945270bde044e2c7a61824afe72dbebcaa5d009131653

File details

Details for the file contextpilot_ai-0.1.0-py3-none-any.whl.

File hashes

Hashes for contextpilot_ai-0.1.0-py3-none-any.whl:

Algorithm   | Hash digest
SHA256      | ba9a56a5a96c488132f29eb80f21c001982f6c2d16d649cef03d95369d421cc3
MD5         | ba12aeefa1e3ce4997517b12ca1e905f
BLAKE2b-256 | 90b8b9ba6a102761087fe8215d93bcc3c87be7a65f32f2a32c24b8d6a211732b
