Python SDK for Compresr - Intelligent prompt compression service
Compresr Python SDK
Compresr is a context compression service that shrinks prompts and retrieved documents before they reach an LLM. The project states that this reduces LLM API costs by 30-70%.
Installation
pip install compresr
Quick Start
API Key Setup
Get your API key from compresr.ai:
- Create an account at compresr.ai
- Navigate to Dashboard → API Keys
- Click "Create New Key" and copy it (shown only once!)
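Because the key is shown only once, it helps to store it outside your source code. The following is a minimal sketch that reads the key from an environment variable. The variable name COMPRESR_API_KEY and the helper load_api_key are assumptions made for illustration; the SDK is not documented to read any environment variable on its own.

```python
import os


def load_api_key(env_var: str = "COMPRESR_API_KEY") -> str:
    """Return the API key stored in an environment variable.

    COMPRESR_API_KEY is a hypothetical naming convention, not an SDK feature.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Fail early with a clear message instead of sending an empty key.
        raise RuntimeError(f"Set {env_var} to your Compresr API key")
    return key
```

The returned value can then be passed to either client, for example CompressionClient(api_key=load_api_key()).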
Two Types of Compression
1. Agnostic Compression (No Question Needed)
Use CompressionClient for general-purpose compression:
from compresr import CompressionClient
client = CompressionClient(api_key="cmp_your_api_key")
result = client.compress(
    context="Your very long context that needs compression...",
    compression_model_name="A_CMPRSR_V1"  # or "A_CMPRSR_V1_FLASH" for speed
)
print(f"Original: {result.data.original_tokens} tokens")
print(f"Compressed: {result.data.compressed_tokens} tokens")
print(f"Saved: {result.data.tokens_saved} tokens")
2. Question-Specific Compression
Use QSCompressionClient to compress based on a specific question:
from compresr import QSCompressionClient
client = QSCompressionClient(api_key="cmp_your_api_key")
result = client.compress(
    context="Python was created in 1991. JavaScript in 1995. Java in 1995.",
    question="When was Python created?",
    compression_model_name="QS_CMPRSR_V1"
)
print(f"Compressed (question-relevant): {result.data.compressed_context}")
print(f"Saved: {result.data.tokens_saved} tokens")
Integration with OpenAI
Agnostic compression:
from compresr import CompressionClient
from openai import OpenAI
compresr = CompressionClient(api_key="cmp_xxx")
openai_client = OpenAI(api_key="sk-xxx")
compressed = compresr.compress(
    context="Your long system prompt or document...",
    compression_model_name="A_CMPRSR_V1"
)
response = openai_client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": compressed.data.compressed_context},
        {"role": "user", "content": "Analyze this data..."}
    ]
)
print(f"Saved {compressed.data.tokens_saved} tokens!")
Question-specific compression (RAG/QA):
from compresr import QSCompressionClient
from openai import OpenAI
compresr = QSCompressionClient(api_key="cmp_xxx")
openai_client = OpenAI(api_key="sk-xxx")
user_question = "What is machine learning?"
# Compress retrieval results based on the question
compressed = compresr.compress(
    context="Retrieved documents with lots of information...",
    question=user_question,
    compression_model_name="QS_CMPRSR_V1"
)
response = openai_client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": compressed.data.compressed_context},
        {"role": "user", "content": user_question}
    ]
)
Streaming Support
Both clients support real-time streaming:
from compresr import CompressionClient, QSCompressionClient
# Agnostic streaming
client = CompressionClient(api_key="cmp_your_api_key")
for chunk in client.compress_stream(
    context="Your long context...",
    compression_model_name="A_CMPRSR_V1"
):
    print(chunk.content, end="", flush=True)
# Question-specific streaming
qs_client = QSCompressionClient(api_key="cmp_your_api_key")
for chunk in qs_client.compress_stream(
    context="Your long context...",
    question="What is important here?",
    compression_model_name="QS_CMPRSR_V1"
):
    print(chunk.content, end="", flush=True)
Async Support
Both clients support async/await:
import asyncio
from compresr import CompressionClient, QSCompressionClient
async def main():
    # Agnostic async
    client = CompressionClient(api_key="cmp_your_api_key")
    result = await client.compress_async(
        context="Your context...",
        compression_model_name="A_CMPRSR_V1"
    )
    # Question-specific async
    qs_client = QSCompressionClient(api_key="cmp_your_api_key")
    qs_result = await qs_client.compress_async(
        context="Your context...",
        question="What matters here?",
        compression_model_name="QS_CMPRSR_V1"
    )
    await client.close()
    await qs_client.close()
asyncio.run(main())
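The async method is most useful when several contexts are compressed at once. The sketch below fans out calls with asyncio.gather and caps the number of in-flight requests with a semaphore. The helper compress_many and its max_concurrency parameter are hypothetical and not part of the SDK; the only SDK call assumed is compress_async with the keyword arguments shown above.

```python
import asyncio
from typing import Any, List


async def compress_many(client: Any, contexts: List[str], model: str,
                        max_concurrency: int = 5) -> List[Any]:
    """Compress several contexts concurrently; results keep input order."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def compress_one(context: str) -> Any:
        # The semaphore keeps at most max_concurrency requests in flight.
        async with semaphore:
            return await client.compress_async(
                context=context,
                compression_model_name=model,
            )

    # gather returns results in the same order as the awaitables passed in.
    return await asyncio.gather(*(compress_one(c) for c in contexts))
```

A conservative max_concurrency is a reasonable default when the service's rate limits are unknown.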
Batch Processing
Both clients support batch processing:
from compresr import CompressionClient, QSCompressionClient
# Agnostic batch
client = CompressionClient(api_key="cmp_your_api_key")
results = client.compress_batch(
    contexts=["First context...", "Second context..."],
    compression_model_name="A_CMPRSR_V1"
)
# Question-specific batch
qs_client = QSCompressionClient(api_key="cmp_your_api_key")
qs_results = qs_client.compress_batch(
    contexts=["Context 1...", "Context 2..."],
    questions=["Question 1?", "Question 2?"],
    compression_model_name="QS_CMPRSR_V1"
)
print(f"Total tokens saved: {results.data.total_tokens_saved}")
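For large inputs it can help to split the list of contexts into smaller groups before calling compress_batch. This is a minimal sketch of such a splitter; the helper name chunked and any particular batch size are assumptions, since this README does not state a maximum batch size.

```python
from typing import Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` items, preserving order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])
```

Each group can then be passed as the contexts argument of a separate compress_batch call. For question-specific batches, the questions list would need to be split with the same size so the two lists stay aligned.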
API Reference
Client Initialization
from compresr import CompressionClient, QSCompressionClient
# Agnostic compression
client = CompressionClient(
    api_key="cmp_your_api_key",  # Required
    timeout=30  # Optional: request timeout in seconds
)
# Question-specific compression
qs_client = QSCompressionClient(
    api_key="cmp_your_api_key",  # Required
    timeout=30  # Optional: request timeout in seconds
)
Note: BASE_URL is fixed to https://api.compresr.ai and cannot be changed.
Methods
Both CompressionClient and QSCompressionClient support:
| Method | Description |
|---|---|
| compress() | Compress single context (QS requires question param) |
| compress_async() | Async compress |
| compress_batch() | Batch compress (QS requires questions list) |
| compress_stream() | Stream compression |
Response Structure
# CompressionResult
result.data.original_context # Original input
result.data.compressed_context # Compressed output
result.data.original_tokens # Token count before
result.data.compressed_tokens # Token count after
result.data.actual_compression_ratio # Achieved ratio (0-1)
result.data.tokens_saved # Tokens saved
result.data.duration_ms # Processing time
# BatchResult
results.data.total_original_tokens
results.data.total_compressed_tokens
results.data.total_tokens_saved
results.data.average_compression_ratio
results.data.count
results.data.results # List of CompressionResult
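The token counts in the response can be turned into a rough cost estimate. The sketch below works on any object exposing original_tokens and compressed_tokens, as result.data does above. The helper summarize_savings, the price of 2.50 USD per million input tokens, and the sample token counts are all made up for illustration; substitute your model's actual pricing.

```python
from types import SimpleNamespace


def summarize_savings(data, usd_per_million_input_tokens: float) -> dict:
    """Derive token and cost savings from before/after token counts."""
    saved = data.original_tokens - data.compressed_tokens
    # Guard against division by zero for empty inputs.
    percent = 100.0 * saved / data.original_tokens if data.original_tokens else 0.0
    return {
        "tokens_saved": saved,
        "percent_saved": round(percent, 1),
        "usd_saved": saved / 1_000_000 * usd_per_million_input_tokens,
    }


# Stand-in for result.data, with made-up numbers for illustration only.
example = SimpleNamespace(original_tokens=1200, compressed_tokens=480)
summary = summarize_savings(example, usd_per_million_input_tokens=2.50)
print(summary)
```

With a real response, pass result.data in place of the stand-in object.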
Available Models
Agnostic Models (CompressionClient)
| Model | Description | Use Case |
|---|---|---|
| A_CMPRSR_V1 | LLM-based abstractive compression (default) | General purpose, best quality |
| A_CMPRSR_V1_FLASH | Fast extractive compression | Speed-critical applications |
Question-Specific Models (QSCompressionClient)
| Model | Description | Use Case |
|---|---|---|
| QS_CMPRSR_V1 | Question-specific abstractive compression (default) | General purpose |
| QSR_CMPRSR_V1 | Question-specific extractive compression | General purpose |
Error Handling
Both clients raise the same exception types:
from compresr import CompressionClient, QSCompressionClient
from compresr.exceptions import (
    CompresrError,
    AuthenticationError,
    RateLimitError,
    ValidationError,
)
client = CompressionClient(api_key="cmp_your_api_key")
try:
    result = client.compress(
        context="Your context...",
        compression_model_name="A_CMPRSR_V1"
    )
except AuthenticationError:
    print("Invalid API key")
except RateLimitError:
    print("Rate limit exceeded")
except ValidationError as e:
    print(f"Invalid request: {e}")
except CompresrError as e:
    print(f"API error: {e}")
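Rate limit errors are usually transient, so retrying after a pause is a common response. The sketch below is a generic retry helper with exponential backoff. The name call_with_retry and its parameters are hypothetical and not part of the SDK; whether the service recommends a particular backoff schedule is not stated in this README.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(func: Callable[[], T],
                    retry_on: Tuple[Type[BaseException], ...],
                    max_attempts: int = 4,
                    base_delay: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Call func, retrying on the given exceptions with doubling delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error to the caller.
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

With the SDK, this could wrap a call such as call_with_retry(lambda: client.compress(context="...", compression_model_name="A_CMPRSR_V1"), retry_on=(RateLimitError,)). Authentication and validation errors are left out of retry_on because retrying would not change their outcome.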
Requirements
- Python 3.9+
- httpx >= 0.27.0
- pydantic >= 2.10.0
License
Proprietary License
Support
- Documentation: compresr.ai/docs/overview
- Support: support@compresr.ai
- Issues: GitHub Issues
- Website: compresr.ai
File details
Details for the file compresr-1.1.1.tar.gz.
File metadata
- Download URL: compresr-1.1.1.tar.gz
- Upload date:
- Size: 15.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 48e1bbf16da91c0bf58ea312107283612d812cc99f055b9c6e8bcafaffe25e12 |
| MD5 | 343645b7039e82389054350c26dbd4b9 |
| BLAKE2b-256 | 7c8807bca40253c276cef998c67abe0951edf4087125021cafb074da4690093a |
File details
Details for the file compresr-1.1.1-py3-none-any.whl.
File metadata
- Download URL: compresr-1.1.1-py3-none-any.whl
- Upload date:
- Size: 16.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 387788d719f66bdcb3793a39c88af1ad1e6c376b80019af43216ecdda11751ac |
| MD5 | e89683534dbc928a9df29aabdeeb8d33 |
| BLAKE2b-256 | cfd273b4e30184d2b53c491d2102305043564abd9b693b6e5ba24814a04973b2 |