Halt Python SDK

Drop-in middleware that enforces consistent rate limits per IP/user/api-key with safe defaults, Redis-backed accuracy, and clean headers.

Features

🚀 Multiple Algorithms

Token Bucket (burst-friendly, recommended)
Fixed Window (simple, fast)
Sliding Window (accurate, memory-intensive)

🔑 Flexible Key Strategies

Per-IP address
Per-authenticated user
Per-API key
Composite keys (e.g., user:ip)
Custom key extraction

💾 Storage Options

In-memory (development)
Redis (production, coming soon)

🎯 Framework Support

FastAPI / Starlette
Flask
Django

📊 Standard Headers

RateLimit-Limit
RateLimit-Remaining
RateLimit-Reset
Retry-After (on 429)

⚡ Smart Features

Automatic health check exemptions
Private IP exemptions
Custom exemption lists
Weighted endpoints (cost-based)
Burst handling

Installation

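# The distribution is published as halt-rate; the import package is halt
pip install halt-rate

Optional Dependencies

# Redis support (coming soon)
pip install "halt-rate[redis]"

# Framework-specific
pip install "halt-rate[fastapi]"
pip install "halt-rate[flask]"
pip install "halt-rate[django]"

# Development
pip install "halt-rate[dev]"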

Quick Start

FastAPI

from fastapi import FastAPI
from halt import RateLimiter, InMemoryStore, presets
from halt.adapters.fastapi import HaltMiddleware

app = FastAPI()

# Create rate limiter
limiter = RateLimiter(
    store=InMemoryStore(),
    policy=presets.PUBLIC_API  # 100 req/min
)

# Add middleware
app.add_middleware(HaltMiddleware, limiter=limiter)

@app.get("/")
async def root():
    return {"message": "Hello World"}

Flask

from flask import Flask
from halt import RateLimiter, InMemoryStore, presets
from halt.adapters.flask import HaltFlask

app = Flask(__name__)

limiter = RateLimiter(
    store=InMemoryStore(),
    policy=presets.PUBLIC_API
)

HaltFlask(app, limiter=limiter)

@app.route("/")
def root():
    return {"message": "Hello World"}

Django

# myapp/middleware.py
from halt import RateLimiter, InMemoryStore, presets
from halt.adapters.django import create_halt_middleware

limiter = RateLimiter(
    store=InMemoryStore(),
    policy=presets.PUBLIC_API
)

HaltMiddleware = create_halt_middleware(limiter)

# settings.py
MIDDLEWARE = [
    # ... other middleware
    'myapp.middleware.HaltMiddleware',
]

Preset Policies

Halt ships with presets for common scenarios:

from halt import presets

# Public API - moderate limits
presets.PUBLIC_API
# 100 requests/minute, burst: 120

# Authentication endpoints - strict
presets.AUTH_ENDPOINTS
# 5 requests/minute, burst: 10, 5min cooldown

# Expensive operations - very strict
presets.EXPENSIVE_OPS
# 10 requests/hour, burst: 15, cost: 10

# Strict API - for sensitive ops
presets.STRICT_API
# 20 requests/minute, burst: 25

# Generous API - for internal services
presets.GENEROUS_API
# 1000 requests/minute, burst: 1200

SaaS Features

Plan-Based Rate Limiting

from halt import RateLimiter, presets

# Use plan-based presets
PLAN_FREE = presets.PLAN_FREE          # 100 req/hour
PLAN_STARTER = presets.PLAN_STARTER    # 500 req/hour
PLAN_PRO = presets.PLAN_PRO            # 2000 req/hour
PLAN_BUSINESS = presets.PLAN_BUSINESS  # 5000 req/hour
PLAN_ENTERPRISE = presets.PLAN_ENTERPRISE  # 20000 req/hour

# Get policy by plan name
policy = presets.get_plan_policy("pro")

# Dynamic policy resolution
def get_user_policy(request):
    user = get_current_user(request)  # your application's own auth lookup
    return presets.get_plan_policy(user.plan)

# Build the limiter inside a request handler, where `request` is available
def get_user_limiter(request, store):
    return RateLimiter(
        store=store,
        policy=get_user_policy(request)
    )

Quota Management

from halt.core.quota import QuotaManager, Quota, QuotaPeriod

# Initialize quota manager
quota_manager = QuotaManager(store)

# Define quotas
monthly_quota = Quota(
    name="api_calls",
    limit=100000,
    period=QuotaPeriod.MONTHLY
)

# Check quota
allowed, current_quota = quota_manager.check_quota(
    identifier="user_123",
    quota=monthly_quota
)

if allowed:
    # Consume quota
    quota_manager.consume_quota("user_123", monthly_quota, cost=1)
else:
    # Quota exceeded
    print(f"Quota exceeded. Resets at: {current_quota.reset_at}")
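
The `reset_at` value above depends on how a quota period is defined. As a rough sketch, assuming a monthly quota resets at the start of the next calendar month in UTC (an assumption; Halt may use a rolling period instead), the reset time could be computed as follows. The helper `next_monthly_reset` is hypothetical and not part of Halt:

```python
from datetime import datetime, timezone


def next_monthly_reset(now):
    """Return the first instant of the next calendar month, in UTC."""
    if now.month == 12:
        year, month = now.year + 1, 1  # December rolls over into January
    else:
        year, month = now.year, now.month + 1
    return datetime(year, month, 1, tzinfo=timezone.utc)
```

A calendar-aligned reset is easy to explain to customers; a rolling 30-day period spreads resets more evenly across users.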

Penalty System

from halt.core.penalty import PenaltyManager, PenaltyConfig

# Initialize penalty manager
penalty_manager = PenaltyManager(
    store=store,
    config=PenaltyConfig(
        threshold=10,      # Abuse score threshold
        duration=3600,     # 1 hour penalty
        multiplier=0.5,    # Reduce limit to 50%
        decay_rate=1.0     # 1 point/hour decay
    )
)

# Record violation
penalty = penalty_manager.record_violation(
    identifier="user_123",
    severity=1.0
)

# Check penalty status
if penalty.is_active():
    print(f"User penalized until: {penalty.penalty_until}")
    print(f"Abuse score: {penalty.abuse_score}")
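
To illustrate how the `PenaltyConfig` values might interact, here is a minimal standalone sketch. It assumes the abuse score decays linearly by `decay_rate` points per hour and that the limit is scaled by `multiplier` once the score reaches `threshold`. Both functions are hypothetical and are not Halt's implementation:

```python
def decayed_score(score, last_updated, now, decay_rate=1.0):
    """Linearly decay an abuse score; timestamps are in seconds."""
    hours = max(0.0, (now - last_updated) / 3600)
    return max(0.0, score - decay_rate * hours)


def effective_limit(base_limit, score, threshold=10, multiplier=0.5):
    """Scale the limit down while the score is at or above the threshold."""
    if score >= threshold:
        return max(1, int(base_limit * multiplier))
    return base_limit
```

Under these assumptions, a user with a score of 12 drops back to 10 after two hours, and a base limit of 100 is reduced to 50 while the score stays at or above 10.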

Telemetry & Observability

from halt.core.telemetry import LoggingTelemetry, MetricsTelemetry
import logging

# Logging telemetry
logger = logging.getLogger(__name__)
telemetry = LoggingTelemetry(logger)

# Metrics telemetry (Prometheus, StatsD, etc.)
from prometheus_client import Counter, Gauge

class PrometheusTelemetry:
    def __init__(self):
        self.checks = Counter('halt_checks_total', 'Total rate limit checks')
        self.blocked = Counter('halt_blocked_total', 'Total blocked requests')
        self.remaining = Gauge('halt_remaining', 'Remaining requests')
    
    def on_check(self, key, decision, metadata=None):
        self.checks.inc()
    
    def on_blocked(self, key, decision, metadata=None):
        self.blocked.inc()
    
    def on_allowed(self, key, decision, metadata=None):
        self.remaining.set(decision.remaining)

# Use with limiter
limiter = RateLimiter(
    store=store,
    policy=policy,
    telemetry=PrometheusTelemetry()
)

Custom Policies

Basic Custom Policy

from halt import Policy, KeyStrategy, Algorithm

custom_policy = Policy(
    name="custom",
    limit=50,
    window=60,  # 1 minute
    burst=60,
    algorithm=Algorithm.TOKEN_BUCKET,
    key_strategy=KeyStrategy.IP,
)

Advanced Examples

Rate Limit by User

user_policy = Policy(
    name="per_user",
    limit=100,
    window=3600,  # 1 hour
    key_strategy=KeyStrategy.USER,
)

Rate Limit by API Key

api_policy = Policy(
    name="per_api_key",
    limit=1000,
    window=60,
    key_strategy=KeyStrategy.API_KEY,
)

Composite Keys (User + IP)

composite_policy = Policy(
    name="user_and_ip",
    limit=50,
    window=60,
    key_strategy=KeyStrategy.COMPOSITE,
)

Weighted Endpoints

expensive_policy = Policy(
    name="llm_endpoint",
    limit=100,
    window=3600,
    cost=10,  # Each request costs 10 tokens
    algorithm=Algorithm.TOKEN_BUCKET,
)
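
With a per-request cost, the number of requests that fit in one window shrinks proportionally. A small illustration of that arithmetic follows; the helper name is hypothetical, and it assumes a request is rejected whenever fewer than `cost` tokens remain:

```python
def requests_per_window(limit, cost):
    """How many requests of a given cost fit into one window's token budget."""
    if cost < 1:
        raise ValueError("cost must be at least 1")
    return limit // cost


print(requests_per_window(limit=100, cost=10))  # 10 requests per hour for llm_endpoint
```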

Algorithms

Token Bucket (Recommended)

Best for most use cases. Handles bursts naturally while maintaining average rate.

from halt import Policy, Algorithm

policy = Policy(
    name="token_bucket",
    limit=100,        # 100 tokens per window
    window=60,        # 1 minute
    burst=120,        # Allow bursts up to 120
    algorithm=Algorithm.TOKEN_BUCKET,
)

Pros:

✅ Handles burst traffic naturally
✅ Smooth rate limiting
✅ Low memory usage

Cons:

❌ Slightly more complex than fixed window
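
To make the burst behaviour concrete, here is a minimal standalone token bucket. It is a sketch of the general technique, not Halt's implementation: it assumes tokens refill continuously at `limit / window` per second, that `burst` is the bucket capacity, and that the bucket starts full. The `clock` parameter is only there to make the class testable:

```python
import time


class TokenBucket:
    """Refills at limit/window tokens per second, capped at the burst capacity."""

    def __init__(self, limit, window, burst=None, clock=time.monotonic):
        self.rate = limit / window  # tokens added per second
        self.capacity = burst if burst is not None else limit
        self.tokens = float(self.capacity)  # start full so an initial burst is allowed
        self.clock = clock
        self.updated = clock()

    def allow(self, cost=1):
        now = self.clock()
        # Refill for the elapsed time, never exceeding capacity.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

Because only a token count and a timestamp are stored per key, memory use stays constant regardless of traffic.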

Fixed Window

Simple and fast. Good for strict limits.

policy = Policy(
    name="fixed_window",
    limit=100,
    window=60,
    algorithm=Algorithm.FIXED_WINDOW,
)

Pros:

✅ Very simple
✅ Low memory usage
✅ Fast

Cons:

❌ Can allow up to 2x the limit across a window boundary
❌ No burst handling
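
The boundary issue is easiest to see with a small standalone counter. This is a sketch of the general fixed-window technique, not Halt's code; the class name and the explicit `now` argument are illustrative assumptions:

```python
class FixedWindowCounter:
    """Counts requests per aligned window; the count resets when the window changes."""

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self.current_window = None
        self.count = 0

    def allow(self, now):
        window_id = int(now // self.window)
        if window_id != self.current_window:
            # A new window has started: forget the previous count entirely.
            self.current_window = window_id
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False


counter = FixedWindowCounter(limit=100, window=60)
late = sum(counter.allow(now=59.9) for _ in range(100))   # end of window 0
early = sum(counter.allow(now=60.1) for _ in range(100))  # start of window 1
print(late + early)  # 200 requests accepted within 0.2 seconds
```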

Sliding Window

Most accurate but uses more memory.

policy = Policy(
    name="sliding_window",
    limit=100,
    window=60,
    algorithm=Algorithm.SLIDING_WINDOW,
)

Pros:

✅ Most accurate
✅ No boundary issues

Cons:

❌ Higher memory usage
❌ Slightly slower
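
For comparison, here is a log-based sliding window, which stores one timestamp per accepted request. This is one common variant and is named as such; Halt's own implementation may use bucketed counters instead, and the class below is only an illustration:

```python
from collections import deque


class SlidingWindowLog:
    """Allows a request only if fewer than `limit` were accepted in the trailing window."""

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self.timestamps = deque()

    def allow(self, now):
        # Drop timestamps that have left the trailing window.
        while self.timestamps and self.timestamps[0] <= now - self.window:
            self.timestamps.popleft()
        if len(self.timestamps) < self.limit:
            self.timestamps.append(now)
            return True
        return False
```

The memory cost is visible here: the deque grows up to `limit` entries per key, which is the trade-off for having no boundary burst.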

Key Strategies

IP-based (Default)

from halt import Policy, KeyStrategy

policy = Policy(
    name="per_ip",
    limit=100,
    window=60,
    key_strategy=KeyStrategy.IP,
)

# With trusted proxies (for X-Forwarded-For)
limiter = RateLimiter(
    store=store,
    policy=policy,
    trusted_proxies=["10.0.0.0/8", "172.16.0.0/12"],
)
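
Trusted proxies matter because X-Forwarded-For can be set by any client. The sketch below shows one common way to resolve the client address: honour the header only when the direct peer is trusted, then walk the hops from right to left and stop at the first untrusted one. The function `resolve_client_ip` is hypothetical and Halt's actual resolution logic may differ:

```python
import ipaddress


def resolve_client_ip(peer_ip, forwarded_for, trusted_proxies):
    """Pick the client IP, trusting X-Forwarded-For only behind known proxies."""
    networks = [ipaddress.ip_network(cidr) for cidr in trusted_proxies]

    def is_trusted(ip):
        try:
            address = ipaddress.ip_address(ip)
        except ValueError:
            return False  # unparseable values are never trusted
        return any(address in network for network in networks)

    # Ignore the header entirely when the direct peer is not a trusted proxy.
    if not forwarded_for or not is_trusted(peer_ip):
        return peer_ip
    hops = [hop.strip() for hop in forwarded_for.split(",") if hop.strip()]
    # Walk from the nearest hop outwards; the first untrusted hop is the client.
    for hop in reversed(hops):
        if not is_trusted(hop):
            return hop
    return hops[0] if hops else peer_ip
```

Without the peer check, a client could send a forged header and get a fresh rate limit bucket on every request.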

User-based

policy = Policy(
    name="per_user",
    limit=1000,
    window=3600,
    key_strategy=KeyStrategy.USER,
)

Extracts user ID from:

request.user.id
request.state.user_id

API Key-based

policy = Policy(
    name="per_api_key",
    limit=5000,
    window=3600,
    key_strategy=KeyStrategy.API_KEY,
)

Extracts API key from headers:

X-API-Key
Authorization (including Bearer tokens)
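
A rough sketch of what such header extraction could look like, using a plain dict of headers. The precedence (X-API-Key first, then Authorization) and the fallback to the raw Authorization value are assumptions made for illustration, and `extract_api_key` is not a Halt function:

```python
def extract_api_key(headers):
    """Return an API key from X-API-Key or Authorization, or None if absent."""
    # Header names are case-insensitive, so normalise before lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    api_key = normalised.get("x-api-key")
    if api_key:
        return api_key.strip()
    authorization = normalised.get("authorization", "").strip()
    if not authorization:
        return None
    scheme, _, credentials = authorization.partition(" ")
    if credentials and scheme.lower() == "bearer":
        return credentials.strip()  # strip the "Bearer " prefix
    return authorization
```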

Custom Key Extraction

def extract_org_id(request):
    """Extract organization ID from request."""
    return request.headers.get("X-Organization-ID")

policy = Policy(
    name="per_org",
    limit=10000,
    window=3600,
    key_strategy=KeyStrategy.CUSTOM,
    key_extractor=extract_org_id,
)

Exemptions

Automatic Exemptions

Halt automatically exempts:

Health Checks:

/health
/ping
/ready
/healthz
/livez

Private IPs:

127.0.0.1 (localhost)
10.0.0.0/8
172.16.0.0/12
192.168.0.0/16
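
The exemption check described above can be approximated with the standard library. This sketch assumes exact path matching and the address ranges listed here; Halt's real matching rules (for example prefix matching or IPv6 handling) are not documented in this README, so treat `is_exempt` as hypothetical:

```python
import ipaddress

PRIVATE_NETWORKS = [
    ipaddress.ip_network("127.0.0.1/32"),
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]
HEALTH_PATHS = {"/health", "/ping", "/ready", "/healthz", "/livez"}


def is_exempt(path, client_ip):
    """True for health check paths and for clients on a private network."""
    if path in HEALTH_PATHS:
        return True
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        return False  # unparseable address: do not exempt
    return any(address in network for network in PRIVATE_NETWORKS)
```

This also explains a common surprise during local development: requests from 127.0.0.1 or from inside a private network are never limited unless the exemption is disabled.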

Custom Exemptions

policy = Policy(
    name="custom",
    limit=100,
    window=60,
    exemptions=[
        "/admin",           # Path exemption
        "/internal",        # Another path
        "192.168.1.100",   # IP exemption
    ]
)

# Disable private IP exemptions
limiter = RateLimiter(
    store=store,
    policy=policy,
    exempt_private_ips=False,
)

Per-Route Rate Limiting

FastAPI - Dependency Injection

from fastapi import Depends
from halt.adapters.fastapi import create_limiter_dependency

# Create different limiters for different routes
public_limiter = RateLimiter(store=store, policy=presets.PUBLIC_API)
auth_limiter = RateLimiter(store=store, policy=presets.AUTH_ENDPOINTS)

public_limit = create_limiter_dependency(public_limiter)
auth_limit = create_limiter_dependency(auth_limiter)

@app.get("/api/data", dependencies=[Depends(public_limit)])
async def get_data():
    return {"data": "..."}

@app.post("/auth/login", dependencies=[Depends(auth_limit)])
async def login():
    return {"token": "..."}

Flask - Decorator

from halt.adapters.flask import limit

public_limiter = RateLimiter(store=store, policy=presets.PUBLIC_API)
auth_limiter = RateLimiter(store=store, policy=presets.AUTH_ENDPOINTS)

@app.route("/api/data")
@limit(public_limiter)
def get_data():
    return {"data": "..."}

@app.route("/auth/login", methods=["POST"])
@limit(auth_limiter)
def login():
    return {"token": "..."}

Response Headers

All responses include standard rate limit headers:

HTTP/1.1 200 OK
RateLimit-Limit: 100
RateLimit-Remaining: 95
RateLimit-Reset: 1708024800

When rate limited (429):

HTTP/1.1 429 Too Many Requests
RateLimit-Limit: 100
RateLimit-Remaining: 0
RateLimit-Reset: 1708024860
Retry-After: 42

{
  "error": "rate_limit_exceeded",
  "message": "Too many requests. Please try again later.",
  "retry_after": 42
}
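
The relationship between these headers can be sketched as follows. The example assumes RateLimit-Reset is a Unix timestamp, as in the responses above, and that Retry-After is the number of seconds until that reset. The function and its parameters are illustrative, not Halt's API:

```python
import math


def rate_limit_headers(limit, remaining, reset_at, now, allowed):
    """Build rate limit headers; Retry-After is added only for blocked requests."""
    headers = {
        "RateLimit-Limit": str(limit),
        "RateLimit-Remaining": str(max(0, remaining)),
        "RateLimit-Reset": str(int(reset_at)),
    }
    if not allowed:
        # Round up so clients never retry slightly too early.
        headers["Retry-After"] = str(max(1, math.ceil(reset_at - now)))
    return headers
```

With `reset_at=1708024860` and `now=1708024818`, a blocked request gets `Retry-After: 42`, matching the 429 response shown above.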

Advanced Usage

Dynamic Cost per Request

from fastapi import HTTPException, Request

@app.post("/api/llm")
async def llm_endpoint(request: Request):
    # Calculate cost based on request
    body = await request.json()  # Request.json() is a coroutine and must be awaited
    prompt_length = len(body.get("prompt", ""))
    cost = max(1, prompt_length // 100)  # 1 token per 100 chars
    
    # Check with custom cost
    decision = limiter.check(request, cost=cost)
    
    if not decision.allowed:
        raise HTTPException(status_code=429, detail="Rate limited")
    
    return {"response": "..."}

Multiple Policies

# Global rate limit
global_limiter = RateLimiter(store=store, policy=presets.GENEROUS_API)
app.add_middleware(HaltMiddleware, limiter=global_limiter)

# Endpoint-specific limits
auth_limiter = RateLimiter(store=store, policy=presets.AUTH_ENDPOINTS)
auth_limit = create_limiter_dependency(auth_limiter)

@app.post("/auth/login", dependencies=[Depends(auth_limit)])
async def login():
    # This endpoint has BOTH global AND auth limits
    return {"token": "..."}

Testing

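from halt import RateLimiter, InMemoryStore, Policy, Algorithm

def test_rate_limiting():
    policy = Policy(
        name="test",
        limit=5,
        window=60,
        algorithm=Algorithm.TOKEN_BUCKET,
    )

    # 127.0.0.1 is auto-exempted as a private IP, so disable that exemption
    # here; otherwise no request in this test would ever be blocked.
    limiter = RateLimiter(
        store=InMemoryStore(),
        policy=policy,
        exempt_private_ips=False,
    )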
    
    # Mock request
    class MockRequest:
        def __init__(self):
            self.client = type('obj', (object,), {'host': '127.0.0.1'})
    
    request = MockRequest()
    
    # First 5 requests should succeed
    for i in range(5):
        decision = limiter.check(request)
        assert decision.allowed
    
    # 6th request should be blocked
    decision = limiter.check(request)
    assert not decision.allowed
    assert decision.retry_after > 0

Troubleshooting

Rate limits not working?

Check if request is exempted:
- Health check paths are auto-exempted
- Private IPs are auto-exempted (disable with exempt_private_ips=False)

Verify key extraction:

# Debug key extraction
key = limiter._extract_key(request)
print(f"Rate limit key: {key}")

Check storage:
- InMemoryStore doesn't persist across restarts
- Each process has its own memory store

Headers not appearing?

Make sure middleware is added correctly and responses are going through the middleware chain.

Different limits for same IP?

You might be using different policy names. Each policy maintains separate counters:

# These are SEPARATE limits
policy1 = Policy(name="api_v1", limit=100, window=60)
policy2 = Policy(name="api_v2", limit=100, window=60)
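
One way to picture this is that the policy name becomes part of the storage key. The sketch below assumes a key layout of `<policy name>:<client key>`, which is an illustration only; Halt's actual key format is not documented here:

```python
from collections import defaultdict

counters = defaultdict(int)


def record_hit(policy_name, client_key):
    """Increment and return the counter for one policy and one client."""
    # Assumed key layout: "<policy name>:<client key>"
    storage_key = f"{policy_name}:{client_key}"
    counters[storage_key] += 1
    return counters[storage_key]
```

Under that layout, the same IP hitting `api_v1` and `api_v2` consumes two independent budgets. To share one budget across routes, reuse the same policy (and therefore the same name).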

Performance

Token Bucket: ~0.1ms per check
Fixed Window: ~0.05ms per check
Sliding Window: ~0.2ms per check

All algorithms use O(1) memory per key, except Sliding Window, which uses O(precision) memory per key.

License

MIT

Contributing

Contributions welcome! Please open an issue or PR on GitHub.

Roadmap

✅ Token Bucket algorithm
✅ Fixed Window algorithm
✅ Sliding Window algorithm
✅ In-memory storage
⏳ Redis storage
⏳ Distributed rate limiting
⏳ Tenant quotas
⏳ Abuse detection
⏳ Observability hooks

Release history

0.2.0 (Jun 20, 2026)
0.1.1 (Feb 21, 2026, this version)
0.1.0 (Feb 21, 2026)

Download files

Source Distribution

halt_rate-0.1.1.tar.gz (29.9 kB view details)

Uploaded Feb 21, 2026 Source

Built Distribution

halt_rate-0.1.1-py3-none-any.whl (35.7 kB view details)

Uploaded Feb 21, 2026 Python 3

File details

Details for the file halt_rate-0.1.1.tar.gz.

File metadata

Download URL: halt_rate-0.1.1.tar.gz
Upload date: Feb 21, 2026
Size: 29.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for halt_rate-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`4564216070a472e8105b378c3f8b0e49a3376a8991ebbbc50a1d73cb9452b3ee`
MD5	`dd78e8dc4977a4ec546bb10712dde027`
BLAKE2b-256	`8e757e92f6713979dd1f91d28eeb63f27b7a6a535195b6cd3e70be169fe28d4a`

File details

Details for the file halt_rate-0.1.1-py3-none-any.whl.

File metadata

Download URL: halt_rate-0.1.1-py3-none-any.whl
Upload date: Feb 21, 2026
Size: 35.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for halt_rate-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`017a9fafdc322fbf636d1d75deaa53e8c480dc1247848a6013c4a05253f6c1d3`
MD5	`1b5e936e87b0460dcd713fba303fa0d1`
BLAKE2b-256	`f5fea4aed1243f42503eaf191239f5d4f79412ee919f2390d9e2ae9111d09e14`
