
ContextPilot

Python middleware that compresses and optimizes LLM context before each API call — lower token costs for apps and agents, with quality scoring and safe fallback. Wraps OpenAI, Anthropic, and Google SDKs with minimal code changes.

Integration Surfaces

ContextPilot deploys wherever you write LLM-powered code. All surfaces share the same compression engine.

Surface        | Entry point                    | Works with
Python library | pip install contextpilot       | Any Python backend
Local proxy    | contextpilot proxy --port 8432 | Claude Code, GPT Codex, Aider — any tool with a custom base URL
MCP server     | contextpilot mcp               | Claude Desktop, Claude Code
CLI migration  | contextpilot migrate ./src/    | Existing codebases — wraps all LLM calls automatically

Quick Start

pip install contextpilot

# OpenAI
import contextpilot
from openai import OpenAI

client = OpenAI()
pilot = contextpilot.wrap(client)

messages = [{"role": "user", "content": "Summarize the discussion so far."}]
response = pilot.chat.completions.create(
    model="gpt-4o",
    messages=messages,  # compressed transparently
)

# Anthropic
from anthropic import Anthropic

client = Anthropic()
pilot = contextpilot.wrap(client)

response = pilot.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=messages,  # same messages, compressed the same way
)
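
Because wrap returns a drop-in proxy over the underlying SDK client, existing call sites don't change; only the construction line does. A minimal sketch of that property, assuming nothing beyond the documented wrap call (the ask helper is illustrative, not part of the package):

import contextpilot
from openai import OpenAI

def ask(client, prompt: str) -> str:
    # Accepts either a raw OpenAI client or a wrapped one --
    # both expose the same chat.completions interface.
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

plain = ask(OpenAI(), "ping")                       # uncompressed
piloted = ask(contextpilot.wrap(OpenAI()), "ping")  # compressed transparently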

Proxy for AI Coding Tools

Route Claude Code, GPT Codex, or Aider through the compression middleware by starting the local proxy and pointing the tool at it with a single environment variable:

contextpilot proxy --port 8432
export ANTHROPIC_BASE_URL=http://localhost:8432

Every subsequent AI coding prompt is now compressed. The coding assistant behaves identically — just uses fewer tokens.
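
The export above covers Anthropic-based tools such as Claude Code. For tools that speak the OpenAI API instead (Aider, for example), the analogous variable would be OPENAI_BASE_URL; note this assumes the proxy also exposes OpenAI-compatible routes on the same port, which the surfaces table implies but these docs don't state outright:

export OPENAI_BASE_URL=http://localhost:8432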

MCP Server

contextpilot mcp

Connects to Claude Desktop and Claude Code. Exposes optimize_context, get_savings, and suggest_config tools. Claude automatically applies compression when context is large — no workflow changes required.
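
For Claude Desktop, registration follows the standard MCP server format in claude_desktop_config.json. The entry below is a sketch of that generic format rather than something quoted from the package docs; the "contextpilot" key is an arbitrary server name:

{
  "mcpServers": {
    "contextpilot": {
      "command": "contextpilot",
      "args": ["mcp"]
    }
  }
}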

CLI Migration

contextpilot migrate ./src/ --dry-run   # preview changes
contextpilot migrate ./src/ --apply     # refactor in place

Uses AST parsing (not regex) to safely find and wrap all LLM API calls in an existing codebase. Designed for codebases with 50+ LLM calls where manual refactoring is prohibitive.
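
The rewrite itself is mechanical. As an illustration of what a migrated call site plausibly looks like, extrapolated from the wrap API above (an assumption, not captured tool output):

# before
from openai import OpenAI
client = OpenAI()

# after: every call through this client is now compressed
import contextpilot
from openai import OpenAI
client = contextpilot.wrap(OpenAI())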

Configuration

# contextpilot.yaml
compression:
  level: balanced          # conservative | balanced | aggressive
  quality_threshold: 85    # fall back to uncompressed if score drops below this
  history_window: 6        # keep last N conversation turns verbatim
  rag_relevance_min: 0.15  # drop RAG chunks below this relevance score

shadow_testing:
  enabled: true
  sample_rate: 0.05        # 5% of calls sent both compressed and uncompressed

telemetry:
  enabled: true
  endpoint: https://api.contextpilot.dev/v1/telemetry
  api_key: ${CONTEXTPILOT_API_KEY}
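
Conceptually, quality_threshold gates every call: if the engine's quality score for a compressed candidate falls below the threshold, the original context is sent instead. A sketch of that decision in plain Python, with compress and score_quality as hypothetical stand-ins for internals these docs don't show:

QUALITY_THRESHOLD = 85  # mirrors contextpilot.yaml above

def choose_context(messages, compress, score_quality):
    """Send the compressed context only if it clears the quality bar."""
    candidate = compress(messages)
    if score_quality(messages, candidate) >= QUALITY_THRESHOLD:
        return candidate  # smaller context, quality preserved
    return messages       # safe fallback: uncompressed original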

Agent Memory Middleware

For multi-agent workflows (LangChain, CrewAI, AutoGen), compress inter-agent handoffs that would otherwise multiply tokens 5–30x:

from contextpilot.middleware import AgentMemory

memory = AgentMemory(
    compression_level="aggressive",
    preserve_keys=["final_answer", "tool_outputs"],  # fields passed through verbatim
)

# agent_a, agent_b, and task come from your agent framework
# (LangChain, CrewAI, AutoGen, ...)
agent_a_output = agent_a.run(task)
compressed = memory.compress_handoff(agent_a_output)  # shrink everything not preserved
agent_b_output = agent_b.run(task, context=compressed)

Privacy

Telemetry transmits numerical metadata only — token counts, latency, quality scores, model IDs, timestamps. No prompt content, no response content, no PII ever leaves your environment.
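
Concretely, a telemetry record is limited to fields like the following. The key names are illustrative; the field list comes from the sentence above, the schema does not:

# Illustrative telemetry record -- numeric metadata and IDs only.
record = {
    "tokens_in": 8123,
    "tokens_out": 2047,
    "latency_ms": 412,
    "quality_score": 91,
    "model": "gpt-4o",
    "timestamp": 1766140800,
}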

License

MIT

Download files


Source Distribution

contextpilot_ai-0.1.0.tar.gz (89.8 kB)

Built Distribution

contextpilot_ai-0.1.0-py3-none-any.whl (28.0 kB)

File details

Details for the file contextpilot_ai-0.1.0.tar.gz.

File metadata

  • Download URL: contextpilot_ai-0.1.0.tar.gz
  • Upload date:
  • Size: 89.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for contextpilot_ai-0.1.0.tar.gz:

Algorithm   | Hash digest
SHA256      | e6cf34f8a44ce6719221838415cb0561c6925538a9f4b1aceb04f62d7dc9039f
MD5         | 00e87ec943d3953be0fe73d6480671e8
BLAKE2b-256 | a4e9a03fe7694b984a2945270bde044e2c7a61824afe72dbebcaa5d009131653

File details

Details for the file contextpilot_ai-0.1.0-py3-none-any.whl.

File hashes

Hashes for contextpilot_ai-0.1.0-py3-none-any.whl:

Algorithm   | Hash digest
SHA256      | ba9a56a5a96c488132f29eb80f21c001982f6c2d16d649cef03d95369d421cc3
MD5         | ba12aeefa1e3ce4997517b12ca1e905f
BLAKE2b-256 | 90b8b9ba6a102761087fe8215d93bcc3c87be7a65f32f2a32c24b8d6a211732b
