Keep callm and process thousands of requests without (rate) limits.

callm

😌 Why callm?

Building LLM-powered applications often means processing thousands of API requests. You've probably experienced:

Problem	Without callm	With callm
Rate limit errors	Constant 429 errors, manual sleep/retry	Automatic RPM & TPM throttling
Retry logic	Write custom backoff for each project	Built-in exponential backoff with jitter
Token tracking	No visibility into usage	Real-time token consumption metrics
Boilerplate code	Copy-paste the same async code everywhere	One function call, any provider
Waiting for batch APIs	Provider batch APIs take up to 24 hours	Results in minutes, not hours
Multiple SDKs	Install openai, anthropic, cohere, ...	One library, all providers

Stop rewriting the same parallel processing code. callm handles the infrastructure so you can focus on your application.

Testing multiple providers? Just swap the provider class—no new dependencies, no code changes. Find what works best for your use case.

Installation

pip install callm-py

From source:

git clone https://github.com/milistu/callm.git
cd callm
pip install -e .

Quick Start

Process 1,000 product descriptions to extract structured data—in under a minute:

import asyncio
from callm import process_requests, RateLimitConfig
from callm.providers import OpenAIProvider

# Configure your provider
provider = OpenAIProvider(
    api_key="sk-...",
    model="gpt-5-mini",
    request_url="https://api.openai.com/v1/responses",
)

# Your data processing requests
products = [
    {"id": 1, "description": "Nike Air Max 90 - Classic sneakers in white/black, size 10"},
    {"id": 2, "description": "Sony WH-1000XM5 Wireless Headphones - Noise cancelling, 30hr battery"},
    # ... thousands more
]

requests = [
    {
        "input": f"Extract brand, category, and key features from: {p['description']}",
        "metadata": {"product_id": p["id"]},
    }
    for p in products
]

async def main():
    results = await process_requests(
        provider=provider,
        requests=requests,
        rate_limit=RateLimitConfig(
            max_requests_per_minute=5_000,    # Stay under your tier limit
            max_tokens_per_minute=2_000_000,
        ),
    )

    print(f"Processed {results.stats.successful} requests in {results.stats.duration_seconds:.1f}s")
    print(f"Tokens used: {results.stats.total_input_tokens + results.stats.total_output_tokens:,}")

    # Access results
    for result in results.successes:
        print(f"Product {result.metadata['product_id']}: {result.response}")

asyncio.run(main())
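
The "under a minute" figure can be sanity-checked with simple arithmetic. The sketch below computes only the floor that the RPM and TPM limits impose; it ignores network latency and any burst allowance. The helper `min_minutes` and the 500-tokens-per-request average are hypothetical and are not part of callm.

```python
def min_minutes(num_requests, avg_tokens_per_request, rpm, tpm):
    """Lower bound on wall-clock minutes imposed by the RPM and TPM limits alone."""
    by_requests = num_requests / rpm
    by_tokens = num_requests * avg_tokens_per_request / tpm
    # Whichever limit is tighter determines the floor
    return max(by_requests, by_tokens)


# 1,000 requests at an assumed 500 tokens each, with the Quick Start limits
estimate = min_minutes(1_000, 500, rpm=5_000, tpm=2_000_000)
print(f"At least {estimate * 60:.0f} seconds")  # prints "At least 15 seconds"
```

With these assumed numbers the token limit is the tighter one (0.25 minutes against 0.2 minutes for the request limit), so raising only the RPM setting would not shorten the run.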

Features

Precise Rate Limiting — Token buckets for RPM and TPM, respects provider limits
Smart Retries — Exponential backoff with jitter, automatic 429/5xx handling
Usage Tracking — Metrics for input tokens and output tokens
Flexible I/O — Process from Python lists or JSONL files, output to memory or disk
Structured Outputs — Support for Pydantic models and JSON schemas
Provider Agnostic — Same API across OpenAI, Anthropic, Gemini, DeepSeek, and more
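
To illustrate the token-bucket idea named in the first feature, here is a minimal, generic sketch. It is not callm's implementation: the `TokenBucket` class, the `can_send` helper, and the injectable `clock` parameter are all hypothetical names chosen for this example. One bucket counts requests (RPM) and a second counts tokens (TPM); a request is released only when both have room.

```python
import time


class TokenBucket:
    """Bucket refilled continuously at `rate_per_minute`, holding at most `capacity` units."""

    def __init__(self, rate_per_minute, capacity=None, clock=time.monotonic):
        self.rate_per_second = rate_per_minute / 60.0
        self.capacity = float(capacity if capacity is not None else rate_per_minute)
        self.available = self.capacity
        self.clock = clock
        self.last_refill = clock()

    def refill(self):
        """Add the units earned since the last refill, capped at capacity."""
        now = self.clock()
        elapsed = now - self.last_refill
        self.available = min(self.capacity, self.available + elapsed * self.rate_per_second)
        self.last_refill = now


def can_send(request_bucket, token_bucket, estimated_tokens):
    """Deduct from both buckets only if the RPM and TPM budgets both have room."""
    request_bucket.refill()
    token_bucket.refill()
    if request_bucket.available >= 1 and token_bucket.available >= estimated_tokens:
        request_bucket.available -= 1
        token_bucket.available -= estimated_tokens
        return True
    return False


requests_bucket = TokenBucket(rate_per_minute=5_000)
tokens_bucket = TokenBucket(rate_per_minute=2_000_000)
print(can_send(requests_bucket, tokens_bucket, estimated_tokens=800))
```

Checking both buckets before deducting from either avoids spending request budget on a call that the token budget would then block.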

Supported Providers

OpenAI (Chat, Responses, Embeddings)	Anthropic (Messages API)	Gemini (Generate, Embeddings)
DeepSeek (Chat Completions)	Cohere (Embed API)	Voyage AI (Embeddings)

Examples

Explore real-world use cases in the examples/ directory:

Use Case	Description
Data Extraction	Extract structured data from product listings, invoices
Embeddings	Generate embeddings for RAG and semantic search
Evaluation	Multi-judge consensus evaluation
Synthetic Data	Generate training data and evaluation sets
Classification	Sentiment analysis, content moderation
Translation	Dataset translation for multilingual evaluation

Processing Modes

callm supports four processing modes depending on your input source and output destination:

Input	Output	Best For
Python list	In-memory	Small batches, interactive use
Python list	JSONL file	Medium batches, need persistence
JSONL file	JSONL file	Large batches, low memory
JSONL file	In-memory	Loading saved requests, testing

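# These snippets use `await`, so run them inside an async function such as main() in the Quick Start.
# `provider`, `my_list`, and `rate_limit` are assumed to be defined beforehand.

# 1. List → Memory (small batches)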
results = await process_requests(
    provider=provider,
    requests=my_list,
    rate_limit=rate_limit,
)
# Access: results.successes, results.failures

# 2. List → File (persist results)
results = await process_requests(
    provider=provider,
    requests=my_list,
    rate_limit=rate_limit,
    output_path="results.jsonl",
)

# 3. File → File (large batches, low memory)
results = await process_requests(
    provider=provider,
    requests="input.jsonl",
    rate_limit=rate_limit,
    output_path="results.jsonl",
)

# 4. File → Memory (reload saved requests)
results = await process_requests(
    provider=provider,
    requests="input.jsonl",
    rate_limit=rate_limit,
)
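
For the file-based modes, the requests have to be on disk first. The sketch below writes and reads JSONL with the standard library only. The request fields are copied from the Quick Start; whether callm expects exactly this layout inside the file is an assumption, and `write_jsonl` and `read_jsonl` are hypothetical helpers, not part of the library.

```python
import json
from pathlib import Path


def write_jsonl(records, path):
    """Write one JSON object per line, which is what the JSONL format prescribes."""
    with Path(path).open("w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Yield records one at a time so a large file is never fully loaded into memory."""
    with Path(path).open("r", encoding="utf-8") as f:
        for line in f:
            if line.strip():  # skip blank lines
                yield json.loads(line)


requests = [
    {"input": "Extract brand from: Nike Air Max 90", "metadata": {"product_id": 1}},
    {"input": "Extract brand from: Sony WH-1000XM5", "metadata": {"product_id": 2}},
]
write_jsonl(requests, "input.jsonl")
print(sum(1 for _ in read_jsonl("input.jsonl")), "requests written")
```

Because `read_jsonl` is a generator, it can also be used to stream through a large results file without loading it into memory at once.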

Configuration

from callm import RateLimitConfig, RetryConfig

# Rate limiting (required)
rate_limit = RateLimitConfig(
    max_requests_per_minute=1000,
    max_tokens_per_minute=100_000,
)

# Retry behavior (optional, sensible defaults)
retry = RetryConfig(
    max_attempts=5,
    base_delay_seconds=0.5,
    max_delay_seconds=15.0,
    jitter=0.1,
)

# Run inside an async function; `provider` and `requests` are defined as in the Quick Start
results = await process_requests(
    provider=provider,
    requests=requests,
    rate_limit=rate_limit,
    retry=retry,
)
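
To get a feel for what the retry settings above could mean in practice, the following sketch computes a delay schedule from the same four values. The formula is an assumption: it doubles the delay on each retry, caps it at `max_delay_seconds`, and treats `jitter` as a fractional spread of plus or minus 10%. The source does not state callm's actual formula, and `backoff_delays` is a hypothetical helper.

```python
import random


def backoff_delays(max_attempts=5, base_delay_seconds=0.5, max_delay_seconds=15.0,
                   jitter=0.1, rng=random.random):
    """Return the wait before each retry (max_attempts - 1 retries in total)."""
    delays = []
    for retry_index in range(max_attempts - 1):
        # Exponential growth, capped at the configured maximum
        delay = min(max_delay_seconds, base_delay_seconds * 2 ** retry_index)
        # Spread the delay by up to +/- `jitter` (as a fraction) so clients desynchronize
        delay *= 1 + jitter * (2 * rng() - 1)
        delays.append(delay)
    return delays


print(backoff_delays(jitter=0.0))  # prints [0.5, 1.0, 2.0, 4.0]
```

Under this assumed formula, the 15-second cap in the example configuration is only reached from the sixth retry onward, so with `max_attempts=5` it never applies.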

API Reference

`process_requests()`

Main function for parallel API request processing.

Parameter	Type	Description
`provider`	`BaseProvider`	Provider instance (OpenAI, Anthropic, etc.)
`requests`	`list[dict] | str`	List of request dicts or path to JSONL file
`rate_limit`	`RateLimitConfig`	RPM and TPM limits
`retry`	`RetryConfig`	Optional retry configuration
`output_path`	`str`	Optional path for output JSONL (enables streaming)
`errors_path`	`str`	Optional path for error JSONL
`logging_level`	`int`	Logging verbosity (default: 20/INFO)

Returns: ProcessingResults with successes, failures, and stats.
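
As a rough illustration of working with the returned object, this sketch builds a one-line summary using only the attributes that appear in the Quick Start (`stats.successful`, `stats.duration_seconds`, the two token totals) plus `failures`. It assumes `failures` is a sized collection. `summarize` is a hypothetical helper, and the `SimpleNamespace` object is a stand-in for demonstration, not a real `ProcessingResults`.

```python
from types import SimpleNamespace


def summarize(results):
    """Build a one-line summary from the stats fields used in the Quick Start."""
    stats = results.stats
    total_tokens = stats.total_input_tokens + stats.total_output_tokens
    per_request = total_tokens / stats.successful if stats.successful else 0.0
    return (f"{stats.successful} ok, {len(results.failures)} failed, "
            f"{total_tokens:,} tokens ({per_request:.0f} per request), "
            f"{stats.duration_seconds:.1f}s")


# Stand-in object for illustration only; real code would pass the value
# returned by process_requests().
fake = SimpleNamespace(
    successes=[object()] * 3,
    failures=[object()],
    stats=SimpleNamespace(successful=3, duration_seconds=2.5,
                          total_input_tokens=900, total_output_tokens=300),
)
print(summarize(fake))  # prints "3 ok, 1 failed, 1,200 tokens (400 per request), 2.5s"
```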

Contributing

Contributions are welcome! See CONTRIBUTING.md for guidelines.

# Setup development environment
git clone https://github.com/milistu/callm.git
cd callm
uv sync --dev
uv run pre-commit install

# Run tests
uv run nox

License

MIT License - see LICENSE for details.

Built with 🧡 for engineers who process data at scale

Project details

Development Status: 3 - Alpha
Intended Audience: Developers
Version: 0.1.0 (released Dec 17, 2025)
