Efficient Retrieval-Augmented Generation with Accuracy-Preserving Context Reuse
News
- [2026/01] ContextPilot has been accepted to MLSys 2026 🎉! See you in Bellevue, WA, USA.
- [2026/01] Code is released!
About
ContextPilot is a fast optimization system at the context-engineering layer for agentic workloads:
- High Throughput: Boosts prefill throughput and prefix cache hit ratio through intelligent context reuse.
- Accuracy Preserved: Accuracy loss is negligible, and in some cases accuracy improves.
- Strong Compatibility: Works with popular RAG libraries (PageIndex), agentic memory layers (Mem0), KV cache optimization engines (LMCache), and inference engines (vLLM and SGLang), in both single-node and multi-node deployments.
- Widely Tested: Tested with a wide range of RAG and Agentic AI applications.
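
To make "prefix cache hit ratio" concrete, here is a minimal toy sketch. It is not ContextPilot's algorithm: it assumes a prefix cache that can only reuse the leading run of documents a request shares with an earlier request, and it uses plain sorting as a stand-in reordering heuristic. The function names `shared_prefix_len` and `prefix_hit_ratio` are hypothetical and exist only for this illustration.

```python
def shared_prefix_len(a, b):
    """Number of leading items that two sequences have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def prefix_hit_ratio(requests):
    """Fraction of documents served from a toy prefix cache.

    Assumption: each request can reuse the longest leading run of documents
    it shares with any earlier request; everything after that is recomputed.
    """
    seen, hits, total = [], 0, 0
    for docs in requests:
        hits += max((shared_prefix_len(docs, prev) for prev in seen), default=0)
        total += len(docs)
        seen.append(docs)
    return hits / total


# Three requests that retrieve overlapping documents in different orders.
retrieved = [
    ["doc_a", "doc_b", "doc_c"],
    ["doc_b", "doc_a", "doc_d"],
    ["doc_c", "doc_a", "doc_b"],
]

# Stand-in heuristic (an assumption): sort each context so shared documents line up.
reordered = [sorted(docs) for docs in retrieved]

print("as retrieved:", prefix_hit_ratio(retrieved))
print("reordered:   ", prefix_hit_ratio(reordered))
```

In this toy setup the as-retrieved order shares no prefix at all, while the reordered contexts reuse 5 of 9 documents. Naive reordering can change model output, though, which is the trade-off the "Accuracy Preserved" point refers to and which this sketch does not model.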
Target Workloads
- Trending Topic QA with Retrieval — Search and generation for breaking news and hot topics beyond model knowledge
- Closed-Domain Long-Context QA — Retrieval-augmented QA over specialized corpora (novels, financial reports, legal documents)
- Multi-Turn Conversations with Long-Term Memory — Persistent context across sessions (e.g. Mem0)
Benchmark and Performance
System Performance
On DeepSeek-R1, ContextPilot matches or slightly exceeds the accuracy of SGLang: 64.68% vs 64.15% F1 on MultihopRAG and 41.08% vs 40.20% F1 on NarrativeQA.
Accuracy on MT-RAG Benchmark
| Method | Qwen3-4B | Llama3.1-8B | Qwen3-30B-A3B |
|---|---|---|---|
| LMCache | 62.56 | 68.46 | 75.12 |
| CacheBlend | 50.33 | 56.52 | X |
| RadixCache | 62.56 | 68.46 | 75.12 |
| ContextPilot | 64.27 | 68.12 | 75.81 |
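
For readers who want the per-model gaps rather than raw scores, the short sketch below computes them from the numbers in the table above. It assumes the `X` in the CacheBlend row means "no result reported" and represents it as `None`; the helper name `delta_vs` is hypothetical.

```python
# MT-RAG accuracy copied from the table above.
# Assumption: the "X" entry for CacheBlend means no result was reported.
scores = {
    "LMCache":      {"Qwen3-4B": 62.56, "Llama3.1-8B": 68.46, "Qwen3-30B-A3B": 75.12},
    "CacheBlend":   {"Qwen3-4B": 50.33, "Llama3.1-8B": 56.52, "Qwen3-30B-A3B": None},
    "RadixCache":   {"Qwen3-4B": 62.56, "Llama3.1-8B": 68.46, "Qwen3-30B-A3B": 75.12},
    "ContextPilot": {"Qwen3-4B": 64.27, "Llama3.1-8B": 68.12, "Qwen3-30B-A3B": 75.81},
}


def delta_vs(baseline, method="ContextPilot"):
    """Per-model difference in points (method minus baseline), skipping gaps."""
    out = {}
    for model, base in scores[baseline].items():
        if base is None:
            continue  # no baseline number to compare against
        out[model] = round(scores[method][model] - base, 2)
    return out


for baseline in ("LMCache", "CacheBlend", "RadixCache"):
    print(baseline, delta_vs(baseline))
```

Against RadixCache and LMCache, this gives +1.71 points on Qwen3-4B, -0.34 on Llama3.1-8B, and +0.69 on Qwen3-30B-A3B.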
ContextPilot delivers 4-13x improvements in cache hit rates and 1.5-3.5x reductions in prefill latency for large-batch RAG workloads, while maintaining or improving accuracy.
In testing with GPT-5.2, ContextPilot also reduced input token costs by around 36%.
See Benchmarks in the documentation for GPU vs CPU performance analysis and detailed benchmark methodology.
Getting Started
Installation
Requirements: Python >= 3.10

    pip install contextpilot

Or install from source:

    git clone https://github.com/SecretSettler/ContextPilot.git
    cd ContextPilot
    pip install -e .

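
After installing, a quick sanity check can confirm that the interpreter meets the Python >= 3.10 requirement and that the `contextpilot` distribution is visible to it. The sketch below uses only the standard library; the helper names `installed_version` and `python_ok` are hypothetical, and it deliberately avoids importing the package because the import name is not stated here.

```python
import sys
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name="contextpilot"):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def python_ok(minimum=(3, 10)):
    """True if the running interpreter is at least the given (major, minor)."""
    return sys.version_info >= minimum


print("Python >= 3.10:", python_ok())
print("contextpilot version:", installed_version())
```

If the second line prints `None`, the package was most likely installed into a different environment than the one running the script.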
More detailed installation instructions are available in the docs.
Documentation
Check out the ContextPilot documentation for comprehensive guides.
Examples
Go hands-on with our examples, demonstrating how to address different use cases with ContextPilot.
Contributing
We welcome and value all contributions! Please feel free to submit issues and pull requests.
Citation
We will include the paper citation soon!