Async-native rate limiting for Python — token bucket, fixed window, and sliding window algorithms with in-memory and Redis backends

rate-limiter

A pluggable, async-native rate-limiting library for Python. Supports token bucket, fixed window, and sliding window log algorithms across in-memory and Redis backends, with a drop-in ASGI middleware for FastAPI/Starlette.

Features

3 algorithms — fixed window, sliding window log, token bucket — selectable at construction time
2 backends — in-memory (single-process, ~70k ops/s) and Redis (distributed, ~27k ops/s via atomic Lua scripts)
ASGI middleware — drop into any FastAPI/Starlette app with X-RateLimit-* headers, 429 responses, and Retry-After
Async-native — built on asyncio and redis-py async; no blocking calls
Bring-your-own Redis — the caller owns the redis.asyncio.Redis instance and its lifecycle
Configurable failure policy — fail-open (favour availability) or fail-closed (favour safety)
Fully typed — Pydantic models, mypy-clean

Algorithm trade-offs

Algorithm	Accuracy	Redis memory	Redis ops	Best for
Fixed window	Can allow 2x burst at window boundaries	O(1) — 64 B/key	`INCR` + `EXPIRE`	Simple rate caps where boundary bursts are acceptable
Sliding window log	Exact — no boundary bursts	O(N) — grows with `max_requests`	`ZADD` + `ZREMRANGEBYSCORE` + `ZCARD`	Strict per-client fairness
Token bucket	Smooth — allows controlled bursts up to bucket capacity	O(1) — 141 B/key	`HGET` + `HSET` + `TIME`	APIs that want to allow short bursts while enforcing an average rate
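The boundary-burst row in the table is easiest to see with a toy simulation. The sketch below is not the library's code: `FixedWindow` and `SlidingWindowLog` are hypothetical, synchronous stand-ins that take explicit timestamps, so the behaviour can be reproduced without a clock or Redis.

```python
from collections import deque


class FixedWindow:
    """Toy fixed-window counter: one counter per aligned window."""

    def __init__(self, max_requests: int, window: float) -> None:
        self.max_requests = max_requests
        self.window = window
        self.current_window = None
        self.count = 0

    def allow(self, now: float) -> bool:
        window_id = int(now // self.window)
        if window_id != self.current_window:
            # New aligned window: the counter starts from zero again.
            self.current_window = window_id
            self.count = 0
        if self.count < self.max_requests:
            self.count += 1
            return True
        return False


class SlidingWindowLog:
    """Toy sliding-window log: keeps one timestamp per accepted request."""

    def __init__(self, max_requests: int, window: float) -> None:
        self.max_requests = max_requests
        self.window = window
        self.log = deque()

    def allow(self, now: float) -> bool:
        # Drop timestamps that have fallen out of the trailing window.
        while self.log and self.log[0] <= now - self.window:
            self.log.popleft()
        if len(self.log) < self.max_requests:
            self.log.append(now)
            return True
        return False


def count_allowed(limiter, timestamps) -> int:
    return sum(limiter.allow(t) for t in timestamps)


# 10 requests just before the 60 s boundary, 10 more just after it.
burst = [59.0 + i * 0.01 for i in range(10)] + [60.0 + i * 0.01 for i in range(10)]

fixed_allowed = count_allowed(FixedWindow(max_requests=10, window=60), burst)
sliding_allowed = count_allowed(SlidingWindowLog(max_requests=10, window=60), burst)
print(fixed_allowed, sliding_allowed)  # 20 10
```

With a limit of 10 per 60 seconds, the fixed window admits all 20 requests within about one second because the counter resets at the boundary, while the log admits only the first 10.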

Quick start

Installation

pip install async-ratelimit

In-memory rate limiting

import asyncio
from rate_limiter import RateLimiterOrchestrator, AlgorithmType, RateLimiterType

limiter = RateLimiterOrchestrator(
    rate_limiter_type=RateLimiterType.IN_MEMORY,
    algorithm_type=AlgorithmType.TOKEN_BUCKET,
    max_requests=100,       # 100 requests
    time_window=60,         # per 60-second window
)

async def main():
    response = await limiter.get_response(uId="user-123")
    print(response.allowed)             # True
    print(response.remaining_requests)  # 99
    print(response.reset_time)          # seconds until the bucket refills

asyncio.run(main())

Distributed rate limiting with Redis

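import asyncio

from redis.asyncio import Redis
from rate_limiter import RateLimiterOrchestrator, AlgorithmType, RateLimiterType

async def main():
    # Bring your own client: the caller creates it and closes it
    redis_client = Redis(host="localhost", port=6379, decode_responses=True)

    limiter = RateLimiterOrchestrator(
        rate_limiter_type=RateLimiterType.REDIS,
        algorithm_type=AlgorithmType.SLIDING_WINDOW,
        max_requests=1000,      # 1000 requests
        time_window=3600,       # per hour
        redis_client=redis_client,
    )

    try:
        # Same call as the in-memory example; only the backend differs
        response = await limiter.get_response(uId="user-123")
        print(response.allowed)
    finally:
        # Clean up when done
        await redis_client.aclose()

asyncio.run(main())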

FastAPI middleware

from fastapi import FastAPI
from redis.asyncio import Redis
from rate_limiter import RateLimiterOrchestrator, RateLimiterMiddleware, AlgorithmType, RateLimiterType

app = FastAPI()

redis_client = Redis(host="localhost", port=6379, decode_responses=True)
limiter = RateLimiterOrchestrator(
    rate_limiter_type=RateLimiterType.REDIS,
    algorithm_type=AlgorithmType.TOKEN_BUCKET,
    max_requests=10,
    time_window=60,
    redis_client=redis_client,
)

app.add_middleware(
    RateLimiterMiddleware,
    limiter=limiter,
    key_func=lambda r: r.headers.get("X-API-Key") or r.client.host,
    exclude_routes=["/health", "/docs"],
    fail_open=True,     # let requests through if the limiter errors
)

@app.get("/ping")
async def ping():
    return {"message": "pong"}

Every response includes the conventional X-RateLimit-* headers:

X-RateLimit-Limit: 10
X-RateLimit-Remaining: 7
X-RateLimit-Reset: 45

When the limit is exceeded, the middleware returns 429 Too Many Requests with a Retry-After header — the downstream handler is never invoked.
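To make the header and 429 behaviour concrete, here is a minimal pure-ASGI sketch of the same idea. It is an illustration under assumptions, not the library's RateLimiterMiddleware: `Decision`, `CountingLimiter`, `SketchRateLimitMiddleware`, `pong_app`, and `call_once` are all hypothetical names, and `key_func`, `exclude_routes`, and `fail_open` are left out for brevity. It uses only the standard library and drives the app with a fake request.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Decision:
    """Hypothetical stand-in for the limiter's response object."""
    allowed: bool
    remaining: int
    reset_after: int  # seconds


class CountingLimiter:
    """Hypothetical limiter: allows the first `limit` calls per key, then denies."""

    def __init__(self, limit: int, reset_after: int = 60) -> None:
        self.limit = limit
        self.reset_after = reset_after
        self.used = {}

    async def check(self, key: str) -> Decision:
        used = self.used.get(key, 0)
        if used >= self.limit:
            return Decision(False, 0, self.reset_after)
        self.used[key] = used + 1
        return Decision(True, self.limit - used - 1, self.reset_after)


class SketchRateLimitMiddleware:
    """Pure-ASGI middleware: rejects with 429 or decorates the downstream response."""

    def __init__(self, app, limiter: CountingLimiter) -> None:
        self.app = app
        self.limiter = limiter

    async def __call__(self, scope, receive, send) -> None:
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return
        client_host = (scope.get("client") or ("unknown", 0))[0]
        decision = await self.limiter.check(client_host)
        headers = [
            (b"x-ratelimit-limit", str(self.limiter.limit).encode()),
            (b"x-ratelimit-remaining", str(decision.remaining).encode()),
            (b"x-ratelimit-reset", str(decision.reset_after).encode()),
        ]
        if not decision.allowed:
            headers.append((b"retry-after", str(decision.reset_after).encode()))
            await send({"type": "http.response.start", "status": 429, "headers": headers})
            await send({"type": "http.response.body", "body": b""})
            return  # the downstream app is never called

        async def send_with_headers(message):
            if message["type"] == "http.response.start":
                merged = list(message.get("headers", [])) + headers
                message = {**message, "headers": merged}
            await send(message)

        await self.app(scope, receive, send_with_headers)


async def pong_app(scope, receive, send):
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b"pong"})


async def call_once(app):
    """Drive one fake HTTP request through the ASGI app; return (status, headers)."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": "GET", "path": "/ping", "client": ("10.0.0.1", 1234)}
    await app(scope, receive, send)
    return sent[0]["status"], dict(sent[0]["headers"])


async def demo():
    app = SketchRateLimitMiddleware(pong_app, CountingLimiter(limit=2))
    return [await call_once(app) for _ in range(3)]


results = asyncio.run(demo())
print([status for status, _ in results])  # [200, 200, 429]
```

The third request is answered by the middleware itself, so `pong_app` never runs for it; the first two pass through with the rate-limit headers appended to the downstream response.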

Architecture

┌──────────────────────────────────────────────────────────────────┐
│                     RateLimiterOrchestrator                      │
│  (public API — validates config, delegates to backend)           │
├──────────────┬───────────────────────────────────────────────────┤
│              │                                                   │
│  ┌───────────▼──────────┐       ┌───────────────────────────┐   │
│  │  InMemoryRateLimiter │       │    RedisRateLimiter       │   │
│  │  (asyncio.Lock)      │       │    (BYO redis client)     │   │
│  └───────────┬──────────┘       └───────────┬───────────────┘   │
│              │                              │                    │
│              ▼                              ▼                    │
│     AlgorithmFactory.create()       AlgorithmFactory.create()    │
│              │                              │                    │
│   ┌──────────┼──────────┐       ┌───────────┼──────────┐        │
│   │          │          │       │           │          │         │
│   ▼          ▼          ▼       ▼           ▼          ▼         │
│ FixedW   SlidingW   TokenB   FixedW    SlidingW    TokenB       │
│ InMem    InMem      InMem    InRedis   InRedis     InRedis      │
│ (dict)   (dict)     (dict)   (Lua)     (Lua)       (Lua)        │
└──────────────────────────────────────────────────────────────────┘

┌──────────────────────────────────────────────────────────────────┐
│                    RateLimiterMiddleware                          │
│  (ASGI — key_func, exclude_routes, fail_open/closed)             │
│  Wraps RateLimiterOrchestrator, adds X-RateLimit-* headers       │
└──────────────────────────────────────────────────────────────────┘

Design decisions:

Strategy + Factory pattern — algorithms are interchangeable behind a common Algorithm ABC. The factory dispatches on (backend, algorithm) to the correct implementation.
Bring-your-own Redis — the library never creates or configures a Redis connection. The caller passes in a redis.asyncio.Redis instance and owns its lifecycle (aclose()). This avoids env-var coupling and lets the caller control connection pooling, TLS, sentinels, etc.
Atomic Lua scripts — all Redis algorithms use EVAL/EVALSHA to run their logic server-side in a single atomic operation. No distributed locks needed.
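The Strategy + Factory decision can be sketched in a few lines. This is a toy re-creation for illustration only: the class names echo the project structure, but the `Backend` and `Algo` enums, the `describe()` method, the `_REGISTRY` table, and the `create()` function are assumptions, not the library's actual interfaces.

```python
from abc import ABC, abstractmethod
from enum import Enum


class Backend(Enum):
    IN_MEMORY = "in_memory"
    REDIS = "redis"


class Algo(Enum):
    FIXED_WINDOW = "fixed_window"
    TOKEN_BUCKET = "token_bucket"


class Algorithm(ABC):
    """Common interface every strategy implements."""

    @abstractmethod
    def describe(self) -> str: ...


class FixedWindowInMemory(Algorithm):
    def describe(self) -> str:
        return "fixed window backed by a dict"


class TokenBucketInMemory(Algorithm):
    def describe(self) -> str:
        return "token bucket backed by a dict"


class FixedWindowInRedis(Algorithm):
    def describe(self) -> str:
        return "fixed window backed by a Lua script"


# The registry is keyed on the (backend, algorithm) pair.
_REGISTRY = {
    (Backend.IN_MEMORY, Algo.FIXED_WINDOW): FixedWindowInMemory,
    (Backend.IN_MEMORY, Algo.TOKEN_BUCKET): TokenBucketInMemory,
    (Backend.REDIS, Algo.FIXED_WINDOW): FixedWindowInRedis,
}


def create(backend: Backend, algo: Algo) -> Algorithm:
    """Look up the implementation for the pair and instantiate it."""
    try:
        return _REGISTRY[(backend, algo)]()
    except KeyError:
        raise ValueError(f"no implementation for {backend.value}/{algo.value}") from None


chosen = create(Backend.REDIS, Algo.FIXED_WINDOW)
print(chosen.describe())  # fixed window backed by a Lua script
```

Keying the lookup on the pair keeps callers unaware of which concrete class they get, and adding a new combination is a single registry entry.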

Redis Lua scripts

Each Redis algorithm runs as a single Lua script executed atomically by Redis:

Algorithm	Command sequence	What the script does	Atomicity guarantee
Fixed window	`INCR` → conditional `EXPIRE` → `TTL`	Increment counter, set TTL on first request	Counter + expiry can never diverge
Sliding window	`ZREMRANGEBYSCORE` → `ZCARD` → `ZADD` → `EXPIRE`	Prune expired entries, count, add if under limit	No request can slip between prune and count
Token bucket	`TIME` → `HGET` → refill math → `HSET` → `EXPIRE`	Server-authoritative clock, lazy token refill	Refill + consume is one atomic step
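The "lazy token refill" step in the token bucket row can be illustrated in plain Python. This is a sketch of the general technique, not a transcription of the library's Lua script: `BucketState` and `consume` are hypothetical names, and the clock is passed in explicitly where the Redis version would use the server's `TIME`.

```python
from dataclasses import dataclass


@dataclass
class BucketState:
    tokens: float
    last_refill: float  # seconds, on the same clock as `now`


def consume(state: BucketState, now: float, capacity: float,
            refill_rate: float, cost: float = 1.0):
    """Refill lazily from elapsed time, then try to take `cost` tokens.

    Returns (allowed, new_state). Nothing runs in the background: the
    bucket is only topped up when a request arrives.
    """
    elapsed = max(0.0, now - state.last_refill)
    tokens = min(capacity, state.tokens + elapsed * refill_rate)
    if tokens >= cost:
        return True, BucketState(tokens - cost, now)
    return False, BucketState(tokens, now)


# 10 requests per 20 s: capacity 10, refill rate 10 / 20 = 0.5 tokens per second.
capacity, rate = 10.0, 0.5
state = BucketState(tokens=capacity, last_refill=0.0)

# Drain the bucket in one burst at t=0.
burst = []
for _ in range(10):
    ok, state = consume(state, now=0.0, capacity=capacity, refill_rate=rate)
    burst.append(ok)

allowed_when_empty, state = consume(state, now=0.0, capacity=capacity, refill_rate=rate)

# Two seconds later exactly one token has accrued (2.0 * 0.5 = 1.0).
allowed_after_wait, state = consume(state, now=2.0, capacity=capacity, refill_rate=rate)
allowed_again, state = consume(state, now=2.0, capacity=capacity, refill_rate=rate)

print(all(burst), allowed_when_empty, allowed_after_wait, allowed_again)
# True False True False
```

Because refill and consume happen in one function call, running the equivalent logic inside a single Redis script is what makes the pair atomic across processes.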

Benchmarks

Microbenchmark results on Apple Silicon (M-series), Python 3.14.2, Redis 7 on localhost. Full methodology and commands in BENCHMARKS.md.

Throughput (ops/s) — hot key, 1 client

Backend	Algorithm	Conc=1	Conc=10	Conc=50	Conc=100	Conc=250
in_memory	fixed_window	68,404	69,823	67,906	66,705	59,387
in_memory	sliding_window	57,341	59,153	60,474	60,604	58,528
in_memory	token_bucket	70,934	69,237	71,112	69,173	69,721
redis	fixed_window	8,533	25,317	29,456	27,889	27,315
redis	sliding_window	7,631	21,407	24,718	25,011	23,390
redis	token_bucket	7,721	23,101	26,864	27,676	25,393

Latency at peak load (concurrency=250)

Backend	Algorithm	p50	p99	p99.9
in_memory	fixed_window	7.2 µs	24.2 µs	85.5 µs
in_memory	sliding_window	8.8 µs	9.4 µs	14.6 µs
in_memory	token_bucket	6.7 µs	7.1 µs	11.5 µs
redis	fixed_window	6.7 ms	25.0 ms	58.6 ms
redis	sliding_window	7.9 ms	27.7 ms	50.5 ms
redis	token_bucket	7.2 ms	28.1 ms	53.7 ms

Redis memory per client key

Algorithm	Memory	Data structure
Fixed window	64 B	String (counter) — O(1)
Token bucket	141 B	Hash (3 fields) — O(1)
Sliding window	6.4 MB (at max_requests=50,000)	Sorted set (1 member/request) — O(N)

Key observations

In-memory throughput is roughly 7.5x to 9x higher than Redis at concurrency=1 (~57k to 71k vs ~8k ops/s) because every Redis operation pays a network round-trip (~100 µs) even on localhost.
Redis throughput scales about 3x to 3.5x with concurrency (~8k → 25k to 29k ops/s from c=1 to c=50) because asyncio coroutines overlap I/O waits. It plateaus as Redis's single-threaded command processing saturates.
In-memory throughput is flat across concurrency because asyncio is single-threaded — all coroutines serialize through asyncio.Lock.
Sliding window is the most memory-expensive at O(N) per client. With 50k max_requests, a single key uses 6.4 MB. Fixed window and token bucket use O(1) space regardless of request volume.
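The ratios quoted in these observations can be re-derived from the tables. The short script below only does arithmetic on the published figures; it assumes 1 MB = 10**6 bytes for the sliding-window estimate.

```python
# Throughput figures (ops/s) copied from the throughput table.
in_memory_c1 = {"fixed_window": 68_404, "sliding_window": 57_341, "token_bucket": 70_934}
redis_c1 = {"fixed_window": 8_533, "sliding_window": 7_631, "token_bucket": 7_721}
redis_c50 = {"fixed_window": 29_456, "sliding_window": 24_718, "token_bucket": 26_864}

backend_gap = {}
redis_scaling = {}
for algo in redis_c1:
    backend_gap[algo] = in_memory_c1[algo] / redis_c1[algo]
    redis_scaling[algo] = redis_c50[algo] / redis_c1[algo]
    print(f"{algo}: in-memory/Redis at c=1 = {backend_gap[algo]:.1f}x, "
          f"Redis c=1 -> c=50 = {redis_scaling[algo]:.1f}x")

# Sliding-window log: 6.4 MB for a key holding 50,000 logged requests.
bytes_per_entry = 6.4e6 / 50_000
print(f"sliding window: ~{bytes_per_entry:.0f} B per logged request")
```

The per-entry figure of about 128 B sits between the 64 B string and the 141 B hash, which is why the sorted set only becomes expensive once max_requests is large.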

For the full benchmark suite, methodology, multi-client results, and chart generation, see BENCHMARKS.md.

Testing

# Run all tests (Redis required — via local instance, testcontainers, or REDIS_HOST env)
pytest -v

# In-memory tests only (no Redis needed)
pytest -v tests/test_fixed_window.py tests/test_sliding_window.py tests/test_token_bucket.py

# With Docker Compose (spins up Redis automatically)
docker compose up --build

The test suite includes:

Behavioral tests — allow/deny, window expiry, token refill, weight decay, reset
Redis integration tests — same behavioral coverage against a live Redis with Lua scripts
Concurrency tests — 50-way asyncio.gather across all algorithms
Middleware tests — 429 responses, X-RateLimit-* headers, Retry-After, fail-open/closed
Edge-case tests — zero/negative config validation

55 tests total, all passing in CI with a Redis service container.
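As an illustration of what a 50-way concurrency test checks, the sketch below fires 50 coroutines at a lock-protected counter and verifies that exactly `max_requests` are admitted. `LockedFixedWindow` and `run_concurrent` are hypothetical and are not taken from the project's test suite; window expiry is omitted to keep the example short.

```python
import asyncio


class LockedFixedWindow:
    """Hypothetical in-memory limiter; a single asyncio.Lock guards the counter."""

    def __init__(self, max_requests: int) -> None:
        self.max_requests = max_requests
        self.count = 0
        self.lock = asyncio.Lock()

    async def allow(self) -> bool:
        async with self.lock:
            current = self.count
            # Yield to the event loop mid-update; without the lock another
            # task could read the same `current` and over-admit.
            await asyncio.sleep(0)
            if current < self.max_requests:
                self.count = current + 1
                return True
            return False


async def run_concurrent(n_callers: int, max_requests: int) -> int:
    limiter = LockedFixedWindow(max_requests)
    results = await asyncio.gather(*(limiter.allow() for _ in range(n_callers)))
    return sum(results)


allowed = asyncio.run(run_concurrent(n_callers=50, max_requests=10))
print(allowed)  # 10
```

The deliberate `await asyncio.sleep(0)` inside the critical section is what makes the test meaningful: it gives other coroutines a chance to interleave, so the assertion fails if the lock is removed.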

Development

# Clone and install dev dependencies
git clone https://github.com/freq31/rate_limiter.git
cd rate_limiter
pip install -r requirements.txt

# Lint and type-check
ruff check .
black --check rate_limiter tests
mypy rate_limiter

# Run benchmarks
python -m benchmarks.run --ops 50000 --concurrency 1,10,50,100,250

# Generate charts
python -m benchmarks.plot benchmarks/results/<csv_file>.csv

Project structure

rate_limiter/
├── __init__.py                 # Public API re-exports
├── main.py                     # RateLimiterOrchestrator (public API) + Factory
├── app.py                      # FastAPI demo application
├── settings.py                 # Pydantic settings for the demo app
├── algorithms/
│   ├── base.py                 # Algorithm ABC + AlgorithmFactory
│   ├── fixed_window.py         # FixedWindowInMemory, FixedWindowInRedis
│   ├── sliding_window.py       # SlidingWindowInMemory, SlidingWindowInRedis
│   └── token_bucket.py         # TokenBucketInMemory, TokenBucketInRedis
├── backend/
│   ├── base.py                 # RateLimiter ABC
│   ├── memory.py               # InMemoryRateLimiter
│   ├── redis.py                # RedisRateLimiter (BYO client)
│   ├── middleware.py           # RateLimiterMiddleware (ASGI)
│   ├── request.py              # Rules, Client, AlgorithmType, RateLimiterType
│   └── response.py             # Response model
└── scripts/
    ├── fixed_window.py         # Lua: INCR + EXPIRE
    ├── sliding_window.py       # Lua: sorted set log
    └── token_bucket.py         # Lua: hash + TIME + refill math
tests/                          # 55 tests — behavioral, integration, concurrency, middleware
benchmarks/                     # Async microbenchmark harness + chart generation

License

MIT

Release history

This version

0.1.1

Jun 29, 2026

0.1.0

Jun 29, 2026

Download files

Source Distribution

async_ratelimit-0.1.1.tar.gz (459.8 kB)

Uploaded Jun 29, 2026 Source

Built Distribution

async_ratelimit-0.1.1-py3-none-any.whl (21.7 kB)

Uploaded Jun 29, 2026 Python 3

File details

Details for the file async_ratelimit-0.1.1.tar.gz.

File metadata

Download URL: async_ratelimit-0.1.1.tar.gz
Upload date: Jun 29, 2026
Size: 459.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for async_ratelimit-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`bafb20d81b1bac7bb110ad627700278efc7ae65d449246311c0687d4be8450ef`
MD5	`15cf2d65e88db3a667f36e9d32178d6f`
BLAKE2b-256	`7f857e6109e096fff2dc65dd8da0d7c7eea687b811f756586e20194c24cfbe8d`

File details

Details for the file async_ratelimit-0.1.1-py3-none-any.whl.

File metadata

Download URL: async_ratelimit-0.1.1-py3-none-any.whl
Upload date: Jun 29, 2026
Size: 21.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for async_ratelimit-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`77dbf560f7221e5422912e4251c1820dc42d39e9e4488731accc0a259fdbbb06`
MD5	`c4f388596d6d8945036d13ac6d67a470`
BLAKE2b-256	`9254add874cf545c3f08af0ddf59bdf4c0a9651383f7b233b82d09a0e4d770dd`
