Rateon
Production-grade Python rate-limiting library with Redis backend, async-first design, and framework-agnostic core.
Features
- **Async-first** - Built with async/await for minimal latency
- **Multiple Algorithms** - Fixed window, sliding window, token bucket, leaky bucket
- **Framework Support** - FastAPI and Starlette integrations
- **Redis Backend** - Atomic operations with Lua scripts, cluster-safe
- **In-Memory Fallback** - For development and testing
- **Observability** - Prometheus metrics and structured logging
- **Rate Limit Headers** - Automatic X-RateLimit-* headers on all responses
- **Security** - Safe defaults, header spoofing protection, trust proxy support
- **Flexible** - Per-endpoint, per-router, or global rate limiting
Installation
```bash
pip install rateon
```
Quick Start
FastAPI Middleware
```python
from fastapi import FastAPI
from rate_limiter.integrations.fastapi import RateLimiterMiddleware
from rate_limiter.core.rules import Algorithm, RateLimitRule, Scope

app = FastAPI()

app.add_middleware(
    RateLimiterMiddleware,
    rules=[
        RateLimitRule(
            key="ip",
            limit=100,
            window=60,
            algorithm=Algorithm.SLIDING_WINDOW,
            scope=Scope.GLOBAL
        )
    ]
)

@app.get("/")
async def root():
    return {"message": "Hello World"}
```
FastAPI Decorator
```python
from fastapi import FastAPI, Request
from rate_limiter.integrations.fastapi import rate_limit

app = FastAPI()

@app.get("/login")
@rate_limit("5/minute", key="ip")
async def login(request: Request):
    return {"message": "Login endpoint"}
```
FastAPI Dependency
```python
from fastapi import FastAPI, APIRouter, Depends
from rate_limiter.integrations.fastapi import rate_limit_dep

app = FastAPI()

router = APIRouter(
    dependencies=[Depends(rate_limit_dep("10/minute", key="ip"))]
)

@router.get("/api/users")
async def get_users():
    return {"users": []}

app.include_router(router)
```
Rate Limit Headers
Both the middleware and decorator automatically add rate limit headers to all responses, allowing clients to understand their current rate limit status.
Response Headers
The following headers are added to every response:
- `X-RateLimit-Limit` - The maximum number of requests allowed in the current window
- `X-RateLimit-Remaining` - The number of requests remaining in the current window
- `X-RateLimit-Reset` - Unix timestamp (seconds) when the rate limit resets
- `Retry-After` - Number of seconds until the rate limit resets (only present on 429 responses)
Example Response Headers
Successful Response (200 OK):

```
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 95
X-RateLimit-Reset: 1704067200
```

Rate Limited Response (429 Too Many Requests):

```
Retry-After: 45
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1704067200
```
Middleware Headers
The middleware automatically adds headers to all responses:
```python
from fastapi import FastAPI
from rate_limiter.integrations.fastapi import RateLimiterMiddleware
from rate_limiter.core.rules import RateLimitRule

app = FastAPI()

app.add_middleware(
    RateLimiterMiddleware,
    rules=[RateLimitRule(key="ip", limit=100, window=60)]
)

@app.get("/")
async def root():
    return {"message": "Hello World"}

# Response will include X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset
```
Decorator Headers
The decorator also adds headers to responses:
```python
from fastapi import FastAPI, Request
from rate_limiter.integrations.fastapi import rate_limit

app = FastAPI()

@app.get("/login")
@rate_limit("5/minute", key="ip")
async def login(request: Request):
    return {"message": "Login endpoint"}

# Response will include X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset
```
Note: Headers are automatically added for both successful responses and 429 rate limit errors, providing consistent information to clients about their rate limit status.
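On the client side, these headers are enough to pace retries. The sketch below is a hypothetical client-side helper (not part of Rateon) showing one way to turn the headers into a wait time, preferring `Retry-After` when the server sends it:

```python
import time

def backoff_seconds(status_code: int, headers: dict) -> float:
    """Return how long a client should wait before retrying, based on
    the rate-limit headers described above. (Hypothetical helper, not
    part of Rateon.)"""
    if status_code != 429:
        return 0.0  # not rate limited, no need to wait
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        return float(retry_after)  # the server's own advice takes precedence
    reset = headers.get("X-RateLimit-Reset")
    if reset is not None:
        # Fall back to the Unix reset timestamp.
        return max(0.0, float(reset) - time.time())
    return 1.0  # conservative default when no headers are present
```

A client loop would call this after each response and `time.sleep()` for the returned number of seconds before retrying.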
Starlette Middleware
```python
from starlette.applications import Starlette
from rate_limiter.integrations.starlette import RateLimiterMiddleware
from rate_limiter.core.rules import Algorithm, RateLimitRule, Scope

app = Starlette()

app.add_middleware(
    RateLimiterMiddleware,
    rules=[
        RateLimitRule(
            key="ip",
            limit=100,
            window=60,
            algorithm=Algorithm.SLIDING_WINDOW,
            scope=Scope.GLOBAL
        )
    ]
)

# All responses will include X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset headers
```
Global Configuration
```python
from rate_limiter.config import RateLimiterConfig
from rate_limiter.core.limiter import RateLimiter

config = RateLimiterConfig(
    backend="redis",
    redis_url="redis://localhost:6379",
    trust_proxy_headers=True
)

limiter = RateLimiter(config=config)
```
Redis Cluster Support
```python
from rate_limiter.config import RateLimiterConfig
from rate_limiter.core.limiter import RateLimiter

# Enable cluster mode
# Only one node URL is needed - the client will auto-discover other nodes
config = RateLimiterConfig(
    backend="redis",
    redis_url="redis://node1:6379",  # Single node is sufficient
    redis_cluster_mode=True
)

limiter = RateLimiter(config=config)
```
For redundancy, you can provide multiple nodes (optional):
```python
config = RateLimiterConfig(
    backend="redis",
    redis_url="redis://node1:6379,redis://node2:6379",  # Optional: multiple nodes for redundancy
    redis_cluster_mode=True
)
```
Or via environment variable:
```bash
RATE_LIMITER_REDIS_CLUSTER_MODE=true
RATE_LIMITER_REDIS_URL=redis://node1:6379
```
Configuration
Environment Variables
```bash
RATE_LIMITER_BACKEND=redis
RATE_LIMITER_REDIS_URL=redis://localhost:6379
RATE_LIMITER_REDIS_CLUSTER_MODE=false
RATE_LIMITER_TRUST_PROXY_HEADERS=true
```
Python Config
```python
from rate_limiter.config import RateLimiterConfig

config = RateLimiterConfig(
    backend="redis",
    redis_url="redis://localhost:6379",
    default_limits=["100/minute"],
    trust_proxy_headers=True
)
```
Algorithms
Rateon supports four different rate limiting algorithms, each with different characteristics and use cases.
Fixed Window
How it works: Fixed window divides time into discrete, non-overlapping windows. Each window has a fixed duration (e.g., 60 seconds), and the counter resets at the start of each new window. Requests are counted within the current window, and when the limit is reached, further requests are blocked until the next window begins.
Characteristics:
- Simple implementation with low overhead
- Predictable behavior - limits reset at fixed intervals
- Can allow bursts at window boundaries (e.g., 100 requests at 00:59 and another 100 at 01:00)
- Memory efficient - only needs to track current window count
Use cases:
- Simple rate limiting where occasional bursts are acceptable
- High-throughput scenarios where simplicity matters
- When you need predictable reset times
Example:
```python
from rate_limiter.core.rules import Algorithm, RateLimitRule

RateLimitRule(
    limit=100,
    window=60,
    algorithm=Algorithm.FIXED_WINDOW
)
# Allows up to 100 requests per 60-second window
```
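The mechanics above can be sketched in a few lines. This is an illustrative in-memory counter, not Rateon's actual (Redis-backed, atomic) implementation; the caller supplies the current time, e.g. `time.time()`:

```python
from collections import defaultdict

class FixedWindowLimiter:
    """Minimal in-memory fixed-window counter (illustrative sketch only)."""

    def __init__(self, limit: int, window: int):
        self.limit = limit
        self.window = window
        self.counts = defaultdict(int)  # (key, window index) -> request count

    def allow(self, key: str, now: float) -> bool:
        # The window index advances every `window` seconds; a new index
        # means a fresh counter, which is the fixed-interval reset.
        bucket = int(now // self.window)
        self.counts[(key, bucket)] += 1
        return self.counts[(key, bucket)] <= self.limit
```

Note that `allow("a", 59)` and `allow("a", 60)` land in different windows, which is exactly the boundary-burst behavior described above.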
Sliding Window
How it works: Sliding window tracks requests within a rolling time window. Instead of fixed intervals, it maintains a continuous sliding window that moves forward with each request. Old requests outside the window are removed, and new requests are added. This provides a smooth, continuous rate limit without the boundary burst problem of fixed windows.
Characteristics:
- More accurate than fixed window - no boundary bursts
- Smooth rate limiting that better matches actual request patterns
- Slightly more complex implementation (uses Redis sorted sets)
- Better user experience - more consistent rate limiting
Use cases:
- APIs where smooth rate limiting is important
- Preventing boundary burst attacks
- When you need more accurate rate limiting than fixed window
- User-facing APIs where consistent behavior matters
Example:
```python
from rate_limiter.core.rules import Algorithm, RateLimitRule

RateLimitRule(
    limit=100,
    window=60,
    algorithm=Algorithm.SLIDING_WINDOW
)
# Allows up to 100 requests in any 60-second period
```
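The rolling-window idea can be sketched with a log of timestamps. This is an illustrative in-memory version; Rateon implements the same idea with Redis sorted sets:

```python
from collections import deque

class SlidingWindowLimiter:
    """Minimal in-memory sliding-window log (illustrative sketch only)."""

    def __init__(self, limit: int, window: float):
        self.limit = limit
        self.window = window
        self.events = {}  # key -> deque of request timestamps

    def allow(self, key: str, now: float) -> bool:
        q = self.events.setdefault(key, deque())
        # Evict timestamps that have fallen out of the rolling window.
        while q and q[0] <= now - self.window:
            q.popleft()
        if len(q) < self.limit:
            q.append(now)
            return True
        return False
```

Because old entries are evicted continuously rather than at fixed boundaries, two back-to-back bursts straddling a boundary are counted together, avoiding the fixed-window burst problem.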
Token Bucket
How it works: Token bucket maintains a bucket of tokens. Tokens are added to the bucket at a constant rate (refill rate). Each request consumes one token. If tokens are available, the request is allowed; otherwise, it's blocked. The bucket has a maximum capacity, allowing bursts up to that capacity while maintaining the average rate over time.
Characteristics:
- Allows controlled bursts up to bucket capacity
- Maintains average rate over time
- Good for handling traffic spikes naturally
- Tokens accumulate when traffic is low, allowing bursts when needed
Use cases:
- APIs that need to handle traffic spikes gracefully
- Services with variable traffic patterns
- When you want to allow bursts but control average rate
- Background job processing with bursty workloads
Example:
```python
from rate_limiter.core.rules import Algorithm, RateLimitRule

RateLimitRule(
    limit=100,   # Bucket capacity (burst size)
    window=60,   # Refill rate: 100 tokens per 60 seconds
    algorithm=Algorithm.TOKEN_BUCKET
)
# Allows bursts up to 100 requests, then refills at ~1.67 requests/second
```
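The refill-and-consume cycle above can be sketched as follows. This is an illustrative in-memory version, not Rateon's code; the caller supplies the current time:

```python
class TokenBucket:
    """Minimal token-bucket sketch (illustrative only, not Rateon's code)."""

    def __init__(self, capacity: int, refill_rate: float):
        self.capacity = capacity
        self.refill_rate = refill_rate  # tokens added per second
        self.tokens = float(capacity)   # bucket starts full
        self.last = 0.0

    def allow(self, now: float) -> bool:
        # Refill in proportion to elapsed time, capped at capacity.
        self.tokens = min(float(self.capacity),
                          self.tokens + (now - self.last) * self.refill_rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0  # each request consumes one token
            return True
        return False
```

A full bucket absorbs a burst of `capacity` requests at once; after that, requests are admitted at the refill rate on average.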
Leaky Bucket
How it works: Leaky bucket treats requests as water drops entering a bucket. The bucket has a maximum capacity, and it leaks at a constant rate. If the bucket is full, new requests (drops) overflow and are rejected. The bucket continuously leaks at the configured rate, processing requests smoothly at a constant rate regardless of input pattern.
Characteristics:
- Smooth, constant-rate output
- Prevents bursts entirely - enforces strict rate
- Requests are processed at a steady pace
- More restrictive than token bucket - no burst allowance
Use cases:
- APIs that require strict, constant-rate limiting
- Downstream services that can't handle bursts
- When you need to smooth out traffic patterns
- Rate limiting for external API calls
Example:
```python
from rate_limiter.core.rules import Algorithm, RateLimitRule

RateLimitRule(
    limit=100,   # Bucket capacity
    window=60,   # Leak rate: 100 requests per 60 seconds
    algorithm=Algorithm.LEAKY_BUCKET
)
# Processes requests at a constant rate of ~1.67 requests/second
# Rejects requests if the bucket is full
```
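The drain-and-fill behavior can be sketched like this. Again an illustrative in-memory version, not Rateon's code; the caller supplies the current time:

```python
class LeakyBucket:
    """Minimal leaky-bucket sketch (illustrative only, not Rateon's code)."""

    def __init__(self, capacity: int, leak_rate: float):
        self.capacity = capacity
        self.leak_rate = leak_rate  # requests drained per second
        self.level = 0.0            # current water level
        self.last = 0.0

    def allow(self, now: float) -> bool:
        # Drain the bucket at the constant leak rate since the last request.
        self.level = max(0.0, self.level - (now - self.last) * self.leak_rate)
        self.last = now
        if self.level + 1.0 <= self.capacity:
            self.level += 1.0  # the request's "drop" enters the bucket
            return True
        return False
```

Unlike the token bucket, the bucket starts empty and never grants a credit balance, so sustained input above the leak rate is rejected immediately rather than absorbed as a burst.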
Algorithm Comparison
| Algorithm | Burst Handling | Accuracy | Complexity | Best For |
|---|---|---|---|---|
| Fixed Window | Allows boundary bursts | Low | Low | Simple use cases |
| Sliding Window | No bursts | High | Medium | Accurate rate limiting |
| Token Bucket | Controlled bursts | Medium | Medium | Variable traffic |
| Leaky Bucket | No bursts | High | Medium | Constant-rate output |
Choosing the right algorithm:
- Fixed Window: When simplicity and performance are priorities, and occasional bursts are acceptable
- Sliding Window: When you need accurate, smooth rate limiting without boundary issues
- Token Bucket: When you want to allow bursts but maintain average rate over time
- Leaky Bucket: When you need strict, constant-rate limiting with no burst allowance
Rate Limit Rules
```python
from rate_limiter.core.rules import Algorithm, RateLimitRule, Scope

rule = RateLimitRule(
    key="ip",                            # Identity key: "ip", "user_id", or custom
    limit=100,                           # Maximum requests
    window=60,                           # Time window in seconds
    algorithm=Algorithm.SLIDING_WINDOW,  # Algorithm to use
    scope=Scope.ENDPOINT                 # Scope: ENDPOINT, ROUTER, or GLOBAL
)
```
Available Algorithms
- `Algorithm.FIXED_WINDOW` - Fixed window algorithm
- `Algorithm.SLIDING_WINDOW` - Sliding window algorithm
- `Algorithm.TOKEN_BUCKET` - Token bucket algorithm
- `Algorithm.LEAKY_BUCKET` - Leaky bucket algorithm
Available Scopes
- `Scope.ENDPOINT` - Apply to individual endpoints
- `Scope.ROUTER` - Apply to all endpoints in a router
- `Scope.GLOBAL` - Apply globally to all endpoints
Identity Resolution
Rate limiting can be based on:

- IP Address - `key="ip"`
- User ID - `key="user_id"` (requires identity resolver)
- API Key - `key="api_key"` (requires identity resolver)
- Custom - Provide your own resolver function
Using Built-in Identity Resolvers
The library provides built-in identity resolvers that you can use directly:
IP Address Resolver (default):
```python
from rate_limiter.core.identity import IPIdentityResolver
from rate_limiter.integrations.fastapi import RateLimiterMiddleware

# Default IP resolver
ip_resolver = IPIdentityResolver(trust_proxy=False)

# With proxy support (for running behind load balancers)
ip_resolver = IPIdentityResolver(trust_proxy=True)

app.add_middleware(
    RateLimiterMiddleware,
    rules=[...],
    identity_resolver=ip_resolver
)
```
User ID Resolver:
```python
from rate_limiter.core.identity import UserIdentityResolver
from rate_limiter.integrations.fastapi import rate_limit

# Default user resolver (tries common patterns: request.user, request.state.user, JWT)
user_resolver = UserIdentityResolver()

@app.get("/profile")
@rate_limit("50/hour", key="user_id", identity_resolver=user_resolver)
async def get_profile(request: Request):
    return {"profile": "data"}
```
API Key Resolver:
```python
from rate_limiter.core.identity import APIKeyIdentityResolver
from rate_limiter.integrations.fastapi import rate_limit

# Default: reads from the X-API-Key header
api_key_resolver = APIKeyIdentityResolver()

# Custom header name
api_key_resolver = APIKeyIdentityResolver(header_name="X-Custom-Key")

@app.get("/api/data")
@rate_limit("100/hour", key="api_key", identity_resolver=api_key_resolver)
async def get_data(request: Request):
    return {"data": "protected"}
```
Note: When using `key="ip"`, `key="user_id"`, or `key="api_key"` without providing an `identity_resolver`, the library automatically uses the appropriate built-in resolver. You only need to pass a custom resolver if you want to customize the behavior.
Where to use IdentityResolver:
You can pass an `identity_resolver` parameter to:

- Middleware - `RateLimiterMiddleware(identity_resolver=...)`
- Decorator - `@rate_limit(..., identity_resolver=...)`
- Dependency - `rate_limit_dep(..., identity_resolver=...)`
Using IdentityResolver with Dependency:
```python
from fastapi import FastAPI, APIRouter, Depends
from rate_limiter.core.identity import UserIdentityResolver
from rate_limiter.integrations.fastapi import rate_limit_dep

app = FastAPI()

# Create a custom resolver
custom_resolver = UserIdentityResolver()

# Use with a router dependency
router = APIRouter(
    dependencies=[Depends(rate_limit_dep("10/minute", key="user_id", identity_resolver=custom_resolver))]
)

@router.get("/api/users")
async def get_users():
    return {"users": []}
```
Custom Identity Resolver
You can create a custom identity resolver by implementing the IdentityResolver protocol. This allows you to extract identity from any source (headers, cookies, request body, etc.).
Example: Custom resolver based on a custom header
```python
from typing import Any

from fastapi import FastAPI, Request
from rate_limiter.integrations.fastapi import RateLimiterMiddleware, rate_limit
from rate_limiter.core.rules import Algorithm, RateLimitRule

app = FastAPI()

class CustomHeaderIdentityResolver:
    """Custom resolver that extracts identity from the X-Client-ID header."""

    async def resolve(self, request: Any) -> str:
        """Extract the client ID from the custom header."""
        if hasattr(request, "headers"):
            client_id = request.headers.get("X-Client-ID")
            if client_id:
                return client_id
        return "unknown"

# Use with middleware
custom_resolver = CustomHeaderIdentityResolver()
app.add_middleware(
    RateLimiterMiddleware,
    rules=[RateLimitRule(limit=100, window=60, key="custom", algorithm=Algorithm.FIXED_WINDOW)],
    identity_resolver=custom_resolver
)

# Use with the decorator
@app.get("/api/data")
@rate_limit("10/minute", key="custom", identity_resolver=custom_resolver)
async def get_data(request: Request):
    return {"data": "protected"}
```
Example: Custom resolver using UserIdentityResolver with custom extractor
```python
from fastapi import FastAPI, Request
from rate_limiter.core.identity import UserIdentityResolver
from rate_limiter.integrations.fastapi import rate_limit

app = FastAPI()

def extract_user_id(request):
    """Custom function to extract a user ID from the request."""
    # Example: extract from a JWT token in the Authorization header
    auth_header = request.headers.get("Authorization", "")
    if auth_header.startswith("Bearer "):
        token = auth_header.split(" ")[1]
        # Decode the JWT and extract the user ID (simplified example)
        # In production, use a proper JWT library
        return f"user_{hash(token) % 10000}"
    return "anonymous"

# Create a resolver with the custom extractor
custom_user_resolver = UserIdentityResolver(user_id_extractor=extract_user_id)

@app.get("/profile")
@rate_limit("50/hour", key="user_id", identity_resolver=custom_user_resolver)
async def get_profile(request: Request):
    return {"profile": "data"}
```
Example: Composite resolver (multiple identity sources)
```python
from fastapi import FastAPI
from rate_limiter.core.identity import CompositeIdentityResolver, IPIdentityResolver, UserIdentityResolver
from rate_limiter.integrations.fastapi import RateLimiterMiddleware
from rate_limiter.core.rules import Algorithm, RateLimitRule

app = FastAPI()

# Combine IP and user ID for more granular rate limiting
composite_resolver = CompositeIdentityResolver([
    ("ip", IPIdentityResolver(trust_proxy=True)),
    ("user", UserIdentityResolver())
])

# This will create keys like "ip:192.168.1.1:user:12345"
app.add_middleware(
    RateLimiterMiddleware,
    rules=[RateLimitRule(limit=100, window=60, key="composite", algorithm=Algorithm.SLIDING_WINDOW)],
    identity_resolver=composite_resolver
)
```
Observability
Prometheus Metrics
The library automatically exposes Prometheus metrics:
- `rate_limiter_requests_total` - Total requests (labels: rule_key, status)
- `rate_limiter_requests_allowed` - Allowed requests
- `rate_limiter_requests_blocked` - Blocked requests
Structured Logging
```python
import logging

logger = logging.getLogger("rate_limiter")
# Configure your logging handler
```
Redis Cluster Support
The library supports both standalone Redis and Redis Cluster:
- Standalone Mode (default): Single Redis instance
- Cluster Mode: Redis Cluster with automatic node discovery
  - Only one node URL is required - the client automatically discovers all other nodes
  - Multiple node URLs can be provided for redundancy (optional)
  - Keys use hash tags to ensure Lua scripts work correctly
  - Automatic failover and slot migration handling
Important:

- In cluster mode, only one node URL is needed. The Redis client will automatically discover the entire cluster topology.
- All keys in Lua scripts must be in the same hash slot. The library automatically uses hash tags (`{...}`) to ensure this.
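To illustrate why hash tags matter: Redis Cluster computes a key's hash slot only from the substring inside `{...}`, so wrapping the variable part of the key in braces keeps all keys for one identity/rule pair in the same slot, where a single Lua script can touch them atomically. A sketch (the helper name and key layout are assumptions for illustration, not Rateon's actual key format):

```python
def cluster_key(prefix: str, identity: str, rule_name: str) -> str:
    """Build a hash-tagged Redis key. Redis Cluster hashes only the part
    inside {...}, so every key sharing that tag lands in the same slot.
    (Hypothetical helper; the real key format may differ.)"""
    return f"{prefix}:{{{identity}:{rule_name}}}"
```

For example, `cluster_key("rl", "ip:1.2.3.4", "login")` yields `"rl:{ip:1.2.3.4:login}"`, and any other key carrying the same `{...}` tag is guaranteed to live in the same hash slot.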
Security Considerations
- Safe Redis Keys - Keys are sanitized and prefixed
- Header Spoofing Protection - Only trusted proxies are used for IP resolution
- Fail Closed - On backend failure, requests are denied by default
- Constant-Time Comparison - Prevents timing attacks
Development
```bash
# Install development dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Type checking
mypy rate_limiter

# Format code
black rate_limiter tests

# Lint
ruff check rate_limiter tests
```
License
MIT
Project details
File details
Details for the file rateon-0.1.0.tar.gz.
File metadata
- Download URL: rateon-0.1.0.tar.gz
- Upload date:
- Size: 26.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.3.1 CPython/3.14.2 Darwin/25.2.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `1fdd0b3ab281154491e32fab46da552eb6f7dc64acf46ec3e5d6d1b492b5f83a` |
| MD5 | `427ce47fe20b4568dd453ae82c3c966e` |
| BLAKE2b-256 | `fd84f0cc779d6e4f46d8dcfe4e0ad78ceae406da242da4c2b5c703aad5a42f7c` |
File details
Details for the file rateon-0.1.0-py3-none-any.whl.
File metadata
- Download URL: rateon-0.1.0-py3-none-any.whl
- Upload date:
- Size: 34.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.3.1 CPython/3.14.2 Darwin/25.2.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `dbf7e1a7c6b6ba8bad73ede0a0e61f4c45506d829787e1e6675c3a42fe719810` |
| MD5 | `f608bb4a552740417a4fdc59432acfbd` |
| BLAKE2b-256 | `c19bcfa5f457f4c352f09d0ee58648c743d0c44eedf4595a50247b85b6635b8f` |