Multi-provider rate limiter for LLM API pipelines

Project description

ratelimiter

Multi-provider rate limiter for LLM API pipelines. Drop-in module that keeps you out of 429 Too Many Requests trouble across all the providers you use.

Per (provider, model) limits: RPM, TPM, and RPD over a sliding window
Auto 429 backoff: exponential with jitter, honors Retry-After when present
Thread-safe and async-safe: the same RateLimiter serves both sync and async code
YAML config: plans.yaml with glob wildcards, easy to edit
One runtime dependency: pyyaml
Empirically tested: 300 RPM on tokenrouter/MiniMax-M3 verified via burst test
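
To make the sliding-window idea concrete, here is a minimal sketch of a per-minute counter. It is not the package's implementation: the class name `SlidingWindowCounter`, its methods, and the injectable `clock` argument are assumptions made for illustration, and the sketch is not thread-safe.

```python
import time
from collections import deque


class SlidingWindowCounter:
    """Allow at most `limit` events in any trailing `window` seconds.

    Hypothetical illustration only; not the ratelimiter package's code.
    """

    def __init__(self, limit, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self._stamps = deque()  # timestamps of admitted events, oldest first

    def _evict(self, now):
        # Drop timestamps that have slid out of the trailing window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()

    def try_acquire(self):
        """Admit one event if there is room; return True on success."""
        now = self.clock()
        self._evict(now)
        if len(self._stamps) < self.limit:
            self._stamps.append(now)
            return True
        return False

    def wait_time(self):
        """Seconds until the next slot frees up (0.0 if one is free now)."""
        now = self.clock()
        self._evict(now)
        if len(self._stamps) < self.limit:
            return 0.0
        return self.window - (now - self._stamps[0])
```

Unlike a fixed per-minute bucket, this window moves with each request, so a burst at the end of one minute still counts against the start of the next.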

Install

The PyPI package is published as agamenox-ratelimiter (the plain ratelimiter name has been taken on PyPI since 2013). The Python import stays as ratelimiter: the names differ, but the module is the same.

pip install agamenox-ratelimiter
# or, from source:
git clone https://github.com/Agamenox/ratelimiter.git
cd ratelimiter
pip install -e .

Quick start

from ratelimiter import RateLimiter, call_with_retry

limiter = RateLimiter.from_yaml("plans.yaml")

def api_call_fn(prompt: str) -> str:
    """Placeholder: replace with your real provider call."""
    raise NotImplementedError

# Wrap any function: 429s are retried automatically
def call_m3(prompt: str) -> str:
    return call_with_retry(
        limiter, "tokenrouter", "MiniMax-M3",
        api_call_fn, prompt,
        max_retries=5, base_backoff=1.0, max_backoff=60.0,
    )

# Or acquire manually before making the call yourself
if limiter.acquire("tokenrouter", "MiniMax-M3"):
    response = api_call_fn("Hello")

# Monitor usage
print(limiter.status("tokenrouter", "MiniMax-M3"))
# {'rpm_used': 5, 'rpm_limit': 300, 'tpm_used': 0, ...}

Async:

ok = await limiter.acquire_async("openrouter", "minimax/MiniMax-M2.5-highspeed")
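
As a rough sketch of how several coroutines might share one limiter, the example below uses a stand-in class so it runs without the package installed. `StubLimiter`, `fetch`, and `main` are hypothetical names, and the assumption that a falsy `ok` means "no slot was granted" is mine, not something the README states.

```python
import asyncio


class StubLimiter:
    """Hypothetical stand-in exposing an acquire_async-style coroutine."""

    def __init__(self, max_in_window):
        self._remaining = max_in_window
        self._lock = asyncio.Lock()

    async def acquire_async(self, provider, model):
        # Hand out slots until the budget is used up, then refuse.
        async with self._lock:
            if self._remaining > 0:
                self._remaining -= 1
                return True
            return False


async def fetch(limiter, prompt):
    ok = await limiter.acquire_async("openrouter", "minimax/MiniMax-M2.5-highspeed")
    if not ok:
        return None  # no slot granted: skip the call
    await asyncio.sleep(0)  # placeholder for the real API call
    return prompt.upper()


async def main():
    # Three concurrent requests compete for two slots.
    limiter = StubLimiter(max_in_window=2)
    return await asyncio.gather(*(fetch(limiter, p) for p in ["a", "b", "c"]))
```

Running `asyncio.run(main())` drives all three coroutines against the same limiter object; with the real RateLimiter the pattern would be the same, with the stub swapped out.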

Why this exists

I was running a batch pipeline through tokenrouter's free MiniMax-M3 model and getting mysterious 429s. After a sustained-rate test I learned that the limit was 300 requests per minute over a sliding 60-second window, with no rate-limit headers, and that the error told me nothing until I had already hit the wall.

So I built a small limiter, measured more providers, and packaged it up. Now my pipelines throttle themselves before the wall, and when they do hit it they back off cleanly using exponential backoff with jitter.

The key insight: different providers have different limits, different header conventions, and different retry semantics. Hardcoding any of it is a maintenance trap. Hence the YAML registry — you measure once, write it down, and the limiter does the right thing for every provider.

Plans (registry format)

tokenrouter:
  MiniMax-M3:
    rpm: 300
    tier: free
    notes: "Empirically 300 RPM, no headers, sliding window."

openrouter:
  "*:free":
    rpm: 20
    rpd: 200
    tier: free

Fields: rpm (required), tpm, rpd, burst, tier (free|paid|enterprise|local), notes. Wildcards (*, ?) supported in the model field.
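
One way the wildcard lookup could work is sketched below with the standard library's `fnmatch` module. The function `resolve_plan` and the "exact match first, then first matching pattern" precedence are assumptions for illustration; the package may resolve overlapping patterns differently.

```python
from fnmatch import fnmatchcase

# The registry from the YAML above, as it might look once loaded into a dict.
PLANS = {
    "tokenrouter": {"MiniMax-M3": {"rpm": 300, "tier": "free"}},
    "openrouter": {"*:free": {"rpm": 20, "rpd": 200, "tier": "free"}},
}


def resolve_plan(plans, provider, model):
    """Return the plan for (provider, model), or None if nothing matches.

    Hypothetical helper: exact model names win over glob patterns.
    """
    models = plans.get(provider, {})
    if model in models:
        return models[model]
    for pattern, plan in models.items():
        # fnmatchcase supports * and ? and is case-sensitive on every platform.
        if fnmatchcase(model, pattern):
            return plan
    return None
```

For example, `resolve_plan(PLANS, "openrouter", "meta/llama:free")` falls through to the `*:free` entry, while a model without the `:free` suffix gets no plan.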

Detected limits (2026-06-14)

Provider / Model	RPM	TPM	RPD	Source
`tokenrouter/MiniMax-M3` (free)	300	—	—	Empirical burst test
`openrouter/*:free`	20	—	200	OpenRouter docs
`nvidia/*` (NIM)	40	800K	—	Conservative default
`zai/glm-5-turbo`	10	500K	—	User report
`minimax/M2.5-highspeed`	60	1M	—	Conservative
`opencode-go/*`	60	500K	—	Conservative
`lmstudio/*`	∞	—	—	Local

Naming note: PyPI package is agamenox-ratelimiter; Python import is ratelimiter (the module directory name). Use pip install agamenox-ratelimiter to install; from ratelimiter import ... to use.

If you've measured a different limit, open a rate-limit data issue so the registry stays honest.

API

lim = RateLimiter.from_yaml("plans.yaml")      # or from_dict({...})

# Sync
ok = lim.acquire(provider, model, estimated_tokens=0, timeout=300)
lim.release(provider, model, estimated_tokens=0)        # refund a slot
status = lim.status(provider, model)                    # snapshot dict

# Async
ok = await lim.acquire_async(provider, model, estimated_tokens=0)

# Auto-429 wrapper
result = call_with_retry(lim, provider, model, fn, *args,
                         max_retries=5, base_backoff=1.0, max_backoff=60.0,
                         estimated_tokens_fn=None)

Detects 429s from urllib, requests, httpx, and any object with .status_code == 429. Reads Retry-After and X-RateLimit-* headers when present.
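
The retry delay could be computed along these lines. This is a sketch, not the package's algorithm: `backoff_delay` is a hypothetical helper, "full jitter" (a uniform draw between zero and the exponential ceiling) is one common variant chosen for illustration, and capping Retry-After at `max_backoff` is also an assumption. Only the `base_backoff` and `max_backoff` parameter names come from the API above.

```python
import random


def backoff_delay(attempt, base_backoff=1.0, max_backoff=60.0,
                  retry_after=None, rng=random.random):
    """Seconds to sleep before retry number `attempt` (0-based).

    Hypothetical helper. A numeric Retry-After value takes priority;
    otherwise the delay is exponential with full jitter.
    """
    if retry_after is not None:
        try:
            return min(max(float(retry_after), 0.0), max_backoff)
        except (TypeError, ValueError):
            pass  # Retry-After may be an HTTP date; fall back to backoff
    # The ceiling doubles with each attempt, capped at max_backoff.
    ceiling = min(max_backoff, base_backoff * (2 ** attempt))
    # Full jitter: spread retries uniformly over [0, ceiling).
    return rng() * ceiling
```

The jitter matters when many workers hit the same 429 at once: without it they would all retry at the same instant and collide again.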

Documentation

API reference — full method signatures, algorithm notes
Tokenrouter specifics — empirical test methodology
Contributing
Changelog

Tests

python tests/test_limiter.py        # 20 unit tests, < 0.5s, no network
python examples/integration_test.py # 5 real API calls to tokenrouter

The unit tests use a FakeClock so the time-dependent ones run in milliseconds. The integration test requires pyyaml and a tokenrouter API key (loaded from F:\dev\ratelimiter\examples\integration_test.py config).
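
The fake-clock technique can be sketched as follows. The project's own FakeClock lives in its test suite and may look different; the class shape here and the `wait_until` helper are assumptions for illustration.

```python
class FakeClock:
    """Manually advanced clock: sleeping moves time forward instantly."""

    def __init__(self, start=0.0):
        self.now = start

    def __call__(self):
        # Stands in for time.monotonic().
        return self.now

    def sleep(self, seconds):
        # Stands in for time.sleep(); no real waiting happens.
        self.now += seconds


def wait_until(deadline, clock, sleep, step=1.0):
    """Poll until the clock reaches `deadline`; return the number of sleeps."""
    sleeps = 0
    while clock() < deadline:
        sleep(step)
        sleeps += 1
    return sleeps
```

Code that takes its clock and sleep function as arguments can be given `time.monotonic` and `time.sleep` in production and a `FakeClock` in tests, so a simulated 60-second wait finishes immediately.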

CI

GitHub Actions runs on every push and PR:

Unit tests on Python 3.9–3.13, Ubuntu + Windows + macOS
Lint + syntax check on every .py file
YAML round-trip — verify plans.yaml is loadable
CodeQL — security analysis, weekly schedule
Publish to PyPI — auto-triggered on GitHub release (trusted publishing)

Roadmap

Per-key / per-project quotas (multi-tenant)
Prometheus metrics export
Redis backend for distributed pipelines
Async context manager: async with limiter.guard(...) as ok:

License

MIT — see LICENSE.

Project details

Maintainers

eligodoy

Release history

This version

0.1.0

Jun 16, 2026

Download files

Source Distribution

agamenox_ratelimiter-0.1.0.tar.gz (13.8 kB view details)

Uploaded Jun 16, 2026 Source

Built Distribution

agamenox_ratelimiter-0.1.0-py3-none-any.whl (13.1 kB view details)

Uploaded Jun 16, 2026 Python 3

File details

Details for the file agamenox_ratelimiter-0.1.0.tar.gz.

File metadata

Download URL: agamenox_ratelimiter-0.1.0.tar.gz
Upload date: Jun 16, 2026
Size: 13.8 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for agamenox_ratelimiter-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`1773b63de7e4b9480a33134ba8378df30ab8bddf4b3ef51483fef4c785df642c`
MD5	`8bedc8b2e04478724ab2dec0f8bf14b1`
BLAKE2b-256	`cabe65d05af445b832a2cacfd2aa2245370403fd9bf08b2f4ec02af80fdd9966`

Provenance

The following attestation bundles were made for agamenox_ratelimiter-0.1.0.tar.gz:

Publisher: publish.yml on Agamenox/ratelimiter

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: agamenox_ratelimiter-0.1.0.tar.gz
- Subject digest: 1773b63de7e4b9480a33134ba8378df30ab8bddf4b3ef51483fef4c785df642c
- Sigstore transparency entry: 1833772497
- Sigstore integration time: Jun 16, 2026
Source repository:
- Permalink: Agamenox/ratelimiter@3eb8b1adbc648e9acb2f2c3d8e6a20f99ee50adc
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/Agamenox
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@3eb8b1adbc648e9acb2f2c3d8e6a20f99ee50adc
- Trigger Event: release

File details

Details for the file agamenox_ratelimiter-0.1.0-py3-none-any.whl.

File metadata

Download URL: agamenox_ratelimiter-0.1.0-py3-none-any.whl
Upload date: Jun 16, 2026
Size: 13.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for agamenox_ratelimiter-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`bf5d278b4f821c1ee8cb1acb965f7118262e3709978ed4b18c3e1b2878711a53`
MD5	`b29f826b9c3f3026588872f18e5bdd64`
BLAKE2b-256	`07c966be5a7cac7d724d599df22855f1d9d39e388c3cb30bc348aedd2d4c7bf9`

Provenance

The following attestation bundles were made for agamenox_ratelimiter-0.1.0-py3-none-any.whl:

Publisher: publish.yml on Agamenox/ratelimiter

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: agamenox_ratelimiter-0.1.0-py3-none-any.whl
- Subject digest: bf5d278b4f821c1ee8cb1acb965f7118262e3709978ed4b18c3e1b2878711a53
- Sigstore transparency entry: 1833772722
- Sigstore integration time: Jun 16, 2026
Source repository:
- Permalink: Agamenox/ratelimiter@3eb8b1adbc648e9acb2f2c3d8e6a20f99ee50adc
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/Agamenox
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@3eb8b1adbc648e9acb2f2c3d8e6a20f99ee50adc
- Trigger Event: release
