Python SDK for Compresr - Intelligent prompt compression service

Compresr Python SDK

Compresr is a context compression service for LLM applications. Compressing prompts and context before they reach the model can reduce LLM API costs by 30-70% and shorten the input the model has to process.

Installation

pip install compresr

Quick Start

API Key Setup

Get your API key from compresr.ai:

  1. Create an account at compresr.ai
  2. Navigate to Dashboard → API Keys
  3. Click "Create New Key" and copy the key right away; it is shown only once
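
Because the key is shown only once, it is usually safer to keep it out of source code. The following is a minimal sketch that reads the key from an environment variable. The variable name COMPRESR_API_KEY and the helper load_api_key are assumptions made for illustration, not part of the SDK.

```python
import os


def load_api_key(env_var: str = "COMPRESR_API_KEY") -> str:
    """Return the API key stored in an environment variable.

    The variable name is a hypothetical convention, not one defined by the SDK.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Compresr API key"
        )
    return key
```

The returned value can then be passed to a client, for example CompressionClient(api_key=load_api_key()).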

Two Types of Compression

1. Agnostic Compression (No Question Needed)

Use CompressionClient for general-purpose compression:

from compresr import CompressionClient

client = CompressionClient(api_key="cmp_your_api_key")

result = client.compress(
    context="Your very long context that needs compression...",
    compression_model_name="A_CMPRSR_V1"  # or "A_CMPRSR_V1_FLASH" for speed
)

print(f"Original: {result.data.original_tokens} tokens")
print(f"Compressed: {result.data.compressed_tokens} tokens")
print(f"Saved: {result.data.tokens_saved} tokens")
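
To relate these token counts to the 30-70% figure above, the saving can be expressed as a percentage of the original size. This is a small arithmetic sketch; the helper names and the per-million-token price are hypothetical and should be replaced with your own model's pricing.

```python
def savings_percent(original_tokens: int, compressed_tokens: int) -> float:
    """Percentage of the original tokens removed by compression."""
    if original_tokens <= 0:
        raise ValueError("original_tokens must be positive")
    return 100.0 * (original_tokens - compressed_tokens) / original_tokens


def estimated_cost_saved(tokens_saved: int, usd_per_million_tokens: float) -> float:
    """Rough input-cost saving in USD for a given (assumed) token price."""
    return tokens_saved / 1_000_000 * usd_per_million_tokens


# Example: a 1000-token context compressed to 400 tokens.
print(f"{savings_percent(1000, 400):.1f}% saved")
```

With a real result, the inputs would be result.data.original_tokens, result.data.compressed_tokens and result.data.tokens_saved.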

2. Question-Specific Compression

Use QSCompressionClient to compress based on a specific question:

from compresr import QSCompressionClient

client = QSCompressionClient(api_key="cmp_your_api_key")

result = client.compress(
    context="Python was created in 1991. JavaScript in 1995. Java in 1995.",
    question="When was Python created?",
    compression_model_name="QS_CMPRSR_V1"
)

print(f"Compressed (question-relevant): {result.data.compressed_context}")
print(f"Saved: {result.data.tokens_saved} tokens")

Integration with OpenAI

Agnostic compression:

from compresr import CompressionClient
from openai import OpenAI

compresr = CompressionClient(api_key="cmp_xxx")
openai_client = OpenAI(api_key="sk-xxx")

compressed = compresr.compress(
    context="Your long system prompt or document...",
    compression_model_name="A_CMPRSR_V1"
)

response = openai_client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": compressed.data.compressed_context},
        {"role": "user", "content": "Analyze this data..."}
    ]
)

print(f"Saved {compressed.data.tokens_saved} tokens!")

Question-specific compression (RAG/QA):

from compresr import QSCompressionClient
from openai import OpenAI

compresr = QSCompressionClient(api_key="cmp_xxx")
openai_client = OpenAI(api_key="sk-xxx")

user_question = "What is machine learning?"

# Compress retrieval results based on the question
compressed = compresr.compress(
    context="Retrieved documents with lots of information...",
    question=user_question,
    compression_model_name="QS_CMPRSR_V1"
)

response = openai_client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": compressed.data.compressed_context},
        {"role": "user", "content": user_question}
    ]
)
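
In a RAG pipeline the context is typically assembled from several retrieved documents before it is compressed. One possible way to do that is sketched below. The build_context helper and its numbering scheme are illustrative assumptions; the SDK only requires that context be a single string.

```python
from typing import Iterable


def build_context(documents: Iterable[str], separator: str = "\n\n") -> str:
    """Join retrieved documents into one numbered context string.

    Empty or whitespace-only documents are skipped.
    """
    cleaned = [doc.strip() for doc in documents if doc and doc.strip()]
    return separator.join(
        f"[{index}] {doc}" for index, doc in enumerate(cleaned, start=1)
    )


print(build_context(["First retrieved passage.", "", "Second retrieved passage."]))
```

The resulting string would be passed as the context argument of compress(), alongside the user's question.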

Streaming Support

Both clients support real-time streaming:

from compresr import CompressionClient, QSCompressionClient

# Agnostic streaming
client = CompressionClient(api_key="cmp_your_api_key")
for chunk in client.compress_stream(
    context="Your long context...",
    compression_model_name="A_CMPRSR_V1"
):
    print(chunk.content, end="", flush=True)

# Question-specific streaming
qs_client = QSCompressionClient(api_key="cmp_your_api_key")
for chunk in qs_client.compress_stream(
    context="Your long context...",
    question="What is important here?",
    compression_model_name="QS_CMPRSR_V1"
):
    print(chunk.content, end="", flush=True)
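
If you need the full compressed text after streaming finishes, the chunks can be accumulated as they arrive. The sketch below assumes each chunk exposes a string content attribute, as in the examples above; the collect_stream helper and the stand-in chunks are hypothetical and only there so the snippet runs without an API key.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream(chunks: Iterable) -> str:
    """Concatenate the content of streamed chunks into one string."""
    parts = []
    for chunk in chunks:
        # Skip chunks with no text (assumption: content may be empty).
        if chunk.content:
            parts.append(chunk.content)
    return "".join(parts)


# Stand-in chunks; a real call would pass client.compress_stream(...) instead.
fake_chunks = [SimpleNamespace(content="Hello, "), SimpleNamespace(content="world")]
print(collect_stream(fake_chunks))
```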

Async Support

Both clients support async/await:

import asyncio
from compresr import CompressionClient, QSCompressionClient

async def main():
    # Agnostic async
    client = CompressionClient(api_key="cmp_your_api_key")
    result = await client.compress_async(
        context="Your context...",
        compression_model_name="A_CMPRSR_V1"
    )
    
    # Question-specific async
    qs_client = QSCompressionClient(api_key="cmp_your_api_key")
    qs_result = await qs_client.compress_async(
        context="Your context...",
        question="What matters here?",
        compression_model_name="QS_CMPRSR_V1"
    )
    
    await client.close()
    await qs_client.close()

asyncio.run(main())
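
The async methods make it possible to compress several contexts concurrently. The following sketch limits the number of requests in flight with a semaphore. The compress_all helper, the max_concurrency default and the fake_compress stand-in are assumptions for illustration; the README does not state any concurrency limit.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence


async def compress_all(
    contexts: Sequence[str],
    compress_one: Callable[[str], Awaitable[object]],
    max_concurrency: int = 5,
) -> List[object]:
    """Run compress_one over all contexts with bounded concurrency.

    Results are returned in the same order as the input contexts.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(context: str) -> object:
        async with semaphore:
            return await compress_one(context)

    return await asyncio.gather(*(guarded(context) for context in contexts))


async def fake_compress(context: str) -> str:
    """Stand-in for a real async compression call; keeps the first 5 characters."""
    await asyncio.sleep(0)
    return context[:5]


demo_results = asyncio.run(
    compress_all(["first context", "second context"], fake_compress)
)
print(demo_results)
```

With a real client, compress_one could be a small wrapper such as lambda c: client.compress_async(context=c, compression_model_name="A_CMPRSR_V1").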

Batch Processing

Both clients support batch processing:

from compresr import CompressionClient, QSCompressionClient

# Agnostic batch
client = CompressionClient(api_key="cmp_your_api_key")
results = client.compress_batch(
    contexts=["First context...", "Second context..."],
    compression_model_name="A_CMPRSR_V1"
)

# Question-specific batch
qs_client = QSCompressionClient(api_key="cmp_your_api_key")
qs_results = qs_client.compress_batch(
    contexts=["Context 1...", "Context 2..."],
    questions=["Question 1?", "Question 2?"],
    compression_model_name="QS_CMPRSR_V1"
)

print(f"Total tokens saved: {results.data.total_tokens_saved}")
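
For question-specific batches, the example above passes one question per context. The sketch below checks that the two lists line up and splits them into smaller batches. It assumes contexts and questions are paired by position, and the batch_size cap is hypothetical; the README does not document a batch size limit.

```python
from typing import Iterator, List, Sequence, Tuple


def paired_batches(
    contexts: Sequence[str],
    questions: Sequence[str],
    batch_size: int = 16,
) -> Iterator[Tuple[List[str], List[str]]]:
    """Yield (contexts, questions) slices of at most batch_size items each."""
    if len(contexts) != len(questions):
        raise ValueError("contexts and questions must have the same length")
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(contexts), batch_size):
        end = start + batch_size
        yield list(contexts[start:end]), list(questions[start:end])


for ctx_batch, q_batch in paired_batches(["a", "b", "c"], ["q1", "q2", "q3"], 2):
    print(ctx_batch, q_batch)
```

Each yielded pair could then be passed to qs_client.compress_batch(contexts=..., questions=..., compression_model_name="QS_CMPRSR_V1").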

API Reference

Client Initialization

from compresr import CompressionClient, QSCompressionClient

# Agnostic compression
client = CompressionClient(
    api_key="cmp_your_api_key",  # Required
    timeout=30                    # Optional: request timeout in seconds
)

# Question-specific compression
qs_client = QSCompressionClient(
    api_key="cmp_your_api_key",  # Required
    timeout=30                    # Optional: request timeout in seconds
)

Note: BASE_URL is fixed to https://api.compresr.ai and cannot be changed.

Methods

Both CompressionClient and QSCompressionClient support:

Method             Description
compress()         Compress a single context (QS requires the question parameter)
compress_async()   Async compress
compress_batch()   Batch compress (QS requires the questions list)
compress_stream()  Stream compression

Response Structure

# CompressionResult
result.data.original_context      # Original input
result.data.compressed_context    # Compressed output
result.data.original_tokens       # Token count before
result.data.compressed_tokens     # Token count after
result.data.actual_compression_ratio  # Achieved ratio (0-1)
result.data.tokens_saved          # Tokens saved
result.data.duration_ms           # Processing time

# BatchResult
results.data.total_original_tokens
results.data.total_compressed_tokens
results.data.total_tokens_saved
results.data.average_compression_ratio
results.data.count
results.data.results              # List of CompressionResult
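
As a small illustration of how these fields might be used, the sketch below formats a one-line summary for logging. It relies only on the field names listed above; the summarize helper and the stand-in object are hypothetical and exist only so the snippet runs without calling the API.

```python
from types import SimpleNamespace


def summarize(data) -> str:
    """Format a one-line summary from the fields of result.data."""
    return (
        f"{data.original_tokens} -> {data.compressed_tokens} tokens "
        f"({data.tokens_saved} saved, {data.duration_ms} ms)"
    )


# Stand-in for result.data; a real call would pass result.data directly.
fake_data = SimpleNamespace(
    original_tokens=1000,
    compressed_tokens=400,
    tokens_saved=600,
    duration_ms=12,
)
print(summarize(fake_data))
```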

Available Models

Agnostic Models (CompressionClient)

Model              Description                                   Use Case
A_CMPRSR_V1        LLM-based abstractive compression (default)   General purpose, best quality
A_CMPRSR_V1_FLASH  Fast extractive compression                   Speed-critical applications

Question-Specific Models (QSCompressionClient)

Model          Description                                            Use Case
QS_CMPRSR_V1   Question-specific abstractive compression (default)    General purpose
QSR_CMPRSR_V1  Question-specific extractive compression               General purpose
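
The choice between the four models can be kept in one place in application code. The lookup below is a sketch built from the two tables above; the pick_model helper and its parameters are hypothetical and not part of the SDK.

```python
# Keys are (question_specific, extractive); values are model names from the tables.
MODELS = {
    (False, False): "A_CMPRSR_V1",
    (False, True): "A_CMPRSR_V1_FLASH",
    (True, False): "QS_CMPRSR_V1",
    (True, True): "QSR_CMPRSR_V1",
}


def pick_model(question_specific: bool, extractive: bool = False) -> str:
    """Return the model name for the requested compression style."""
    return MODELS[(bool(question_specific), bool(extractive))]


print(pick_model(question_specific=True))
```

The returned name is what you would pass as compression_model_name to the matching client.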

Error Handling

Both clients use the same exception handling:

from compresr import CompressionClient, QSCompressionClient
from compresr.exceptions import (
    CompresrError,
    AuthenticationError,
    RateLimitError,
    ValidationError,
)

client = CompressionClient(api_key="cmp_your_api_key")

try:
    result = client.compress(
        context="Your context...",
        compression_model_name="A_CMPRSR_V1"
    )
except AuthenticationError:
    print("Invalid API key")
except RateLimitError:
    print("Rate limit exceeded")
except ValidationError as e:
    print(f"Invalid request: {e}")
except CompresrError as e:
    print(f"API error: {e}")
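
Rate-limit errors are often transient, so one common pattern is to retry with exponential backoff. The helper below is a generic sketch, not SDK behavior: call_with_retry, its defaults and the injectable sleep function are assumptions, and the README does not say whether the client retries on its own.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    fn: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on the given exceptions with exponential backoff.

    Delays are base_delay, 2 * base_delay, 4 * base_delay, and so on.
    The last failure is re-raised once max_attempts is reached.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("unreachable")  # the loop always returns or raises
```

With the client above, a call might look like call_with_retry(lambda: client.compress(context="Your context...", compression_model_name="A_CMPRSR_V1"), retry_on=(RateLimitError,)).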

Requirements

  • Python 3.9+
  • httpx >= 0.27.0
  • pydantic >= 2.10.0

License

Proprietary License
