Embedded persistent rate limiting for Python, powered by Rust

Flint

Persistent rate limiting, embedded first.

No Redis. No broker. No daemon required for embedded mode.


Why Flint

I was tired of adding Redis just to rate limit a local Python service.

Most rate limiters are either in-memory and reset on restart, tied to HTTP proxies, or require a separate infrastructure service. Flint embeds inside the Python process and persists counter state to a local append-only log.

The result is simple:

Python process + .flint/ directory = durable rate limiting

What Flint Does

Flint is an embedded rate limiter with:

  • Rust core;
  • Python bindings via PyO3;
  • append-only AOF persistence;
  • AOF and snapshot checksum validation;
  • JSON snapshot and compaction;
  • single-writer data directory locking;
  • GIL-aware Python API;
  • CLI inspect/admin commands;
  • millisecond precision;
  • metrics counters;
  • Python decorator API;
  • FastAPI middleware;
  • Prometheus metrics export;
  • shared HTTP server mode;
  • Python SharedLimiter client;
  • configurable sync mode: always or batch;
  • asyncio-friendly wrapper methods;
  • atomic multi-limit checks with allow_all() / check_all();
  • token bucket algorithm;
  • sliding window log algorithm;
  • fixed window counter algorithm;
  • crash recovery from local files;
  • storage doctor checks;
  • v0.1 AOF compatibility for per_seconds entries;
  • no Redis, broker, or cloud dependency;
  • no daemon required for embedded mode.

Comparison

Solution            Persistent  No Redis  Embedded  Observable
Flint               Yes         Yes       Yes       Yes
SlowAPI             No          Yes       Yes       No
redis-py + limits   No          No        No        No
nginx rate limit    No          Yes       No        No

Flint's unique advantage is the combination of persistent counters and zero external infrastructure.


Install

pip install flint-limiter

The Python module is:

import flint

Quickstart

import flint

limiter = flint.Limiter(data_dir=".flint")

limiter.limit(
    "api:user-42",
    rate=100,
    per="1m",
    algorithm="token_bucket",
)

if limiter.allow("api:user-42"):
    process_request()

With full result details:

result = limiter.check("api:user-42", cost=1)

print(result.allowed)
print(result.cost)
print(result.remaining)
print(result.reset_at)

Cost-based checks:

# A normal request costs 1 unit.
limiter.check("api:user-42")

# Expensive work can consume more units from the same limit.
result = limiter.check("ai:user-42", cost=250)

if result.allowed:
    run_expensive_model_call()

A single check cost must be greater than zero and cannot exceed the configured rate capacity for that limit.

Atomic multi-limit checks:

result = limiter.check_all([
    "user:42",
    {"key": "org:acme", "cost": 10},
    ("endpoint:/v1/chat", 1),
])

if result["allowed"]:
    process_request()
else:
    print("blocked by", result["denied_key"])

check_all() is all-or-nothing. If any limit denies the request, Flint records the denied limit but does not consume quota from the other limits. allow_all() returns only the boolean decision:

if limiter.allow_all(["user:42", "org:acme", "endpoint:/v1/chat"]):
    process_request()
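
The all-or-nothing rule can be pictured as a two-phase check: validate every limit first, then commit. The sketch below is a minimal in-memory illustration, not Flint's implementation. The Counter class and this standalone check_all function are hypothetical stand-ins with no refill, persistence, or locking.

```python
from dataclasses import dataclass


@dataclass
class Counter:
    """Toy quota: `capacity` units in total, no refill (illustration only)."""
    capacity: int
    used: int = 0

    @property
    def remaining(self) -> int:
        return self.capacity - self.used


def check_all(counters: dict[str, Counter], requests: list[tuple[str, int]]) -> dict:
    """Consume from every limit, or from none of them."""
    # Sum the cost per key so a key listed twice is validated as a whole.
    totals: dict[str, int] = {}
    for key, cost in requests:
        totals[key] = totals.get(key, 0) + cost
    # Phase 1: validate every limit without mutating any counter.
    for key, cost in totals.items():
        if cost > counters[key].remaining:
            return {"allowed": False, "denied_key": key}
    # Phase 2: every limit has room, so commit all costs together.
    for key, cost in totals.items():
        counters[key].used += cost
    return {"allowed": True, "denied_key": None}


counters = {"user:42": Counter(5), "org:acme": Counter(10)}
first = check_all(counters, [("user:42", 1), ("org:acme", 10)])
second = check_all(counters, [("user:42", 1), ("org:acme", 1)])
```

After the second call, user:42 still has 4 units left: the denial on org:acme prevented any consumption from the other limit.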

Sync modes and performance:

# Default: safest mode, fsync after every persisted event.
safe = flint.Limiter(data_dir=".flint", sync="always")

# Higher throughput: write every event, fsync in batches.
fast = flint.Limiter(
    data_dir=".flint-fast",
    sync="batch",
    flush_every_ms=100,
    flush_every_events=100,
)

fast.limit("api:user-42", rate=100, per="1m")
fast.allow("api:user-42")
fast.flush()  # force fsync now

sync="batch" is useful for high-throughput services that can accept losing the last small batch of events after a hard crash or power loss. A normal process shutdown flushes pending events automatically, and flush() can be called manually before critical boundaries.
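
To make the trade-off concrete, here is a rough sketch of a batched append log: every event is written to the file right away, but fsync only runs once a count or time threshold is reached. BatchedLog and its fsync_count attribute are hypothetical names for illustration. Flint's Rust engine may schedule flushes differently (for example with a background timer, whereas this sketch only checks the clock on append).

```python
import os
import tempfile
import time


class BatchedLog:
    """Append every event immediately; fsync only when a threshold is hit."""

    def __init__(self, path, flush_every_ms=100, flush_every_events=100,
                 clock=time.monotonic):
        self._file = open(path, "ab")
        self._flush_every_s = flush_every_ms / 1000.0
        self._flush_every_events = flush_every_events
        self._clock = clock
        self._pending = 0          # events written but not yet fsynced
        self._last_sync = clock()
        self.fsync_count = 0       # exposed only for the demo below

    def append(self, line: bytes) -> None:
        self._file.write(line + b"\n")
        self._file.flush()         # hand the bytes to the OS page cache
        self._pending += 1
        overdue = self._clock() - self._last_sync >= self._flush_every_s
        if self._pending >= self._flush_every_events or overdue:
            self.flush()

    def flush(self) -> None:
        """Ask the OS to push everything written so far to stable storage."""
        if self._pending == 0:
            return
        os.fsync(self._file.fileno())
        self.fsync_count += 1
        self._pending = 0
        self._last_sync = self._clock()

    def close(self) -> None:
        self.flush()
        self._file.close()


fake_now = [0.0]  # injectable clock so the demo is deterministic
path = os.path.join(tempfile.mkdtemp(), "demo.aof")
log = BatchedLog(path, flush_every_ms=100, flush_every_events=3,
                 clock=lambda: fake_now[0])
for i in range(7):
    log.append(f"event-{i}".encode())
syncs_by_count = log.fsync_count      # two full batches of three events
fake_now[0] += 0.2                    # 200 ms pass without a sync
log.append(b"event-7")
syncs_after_timeout = log.fsync_count
log.close()
```

In this sketch, anything counted in _pending at the moment of a power loss is what a batch mode can lose; an always mode corresponds to flush_every_events=1.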

Async Python API:

import asyncio
import flint

async def main():
    limiter = flint.Limiter(data_dir=".flint")

    await limiter.alimit("api:user-42", rate=100, per="1m")
    result = await limiter.acheck("api:user-42")

    if result.allowed:
        await process_request()

    status = await limiter.astatus("api:user-42")
    print(status["remaining"])

asyncio.run(main())

Async wrapper methods run the sync limiter call in a thread executor, so native async applications can use Flint without blocking the event loop directly. For route-level FastAPI limiting, prefer the middleware; for manual async decisions, use acheck() / aallow().
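
The thread-executor pattern described above can be sketched with the standard library alone. BlockingLimiter below is a hypothetical stand-in for a synchronous limiter, and its aallow method shows one plausible way such a wrapper could be written with asyncio.to_thread (Python 3.9+). It is not taken from Flint's source.

```python
import asyncio
import time


class BlockingLimiter:
    """Stand-in for a synchronous limiter whose calls may touch the disk."""

    def __init__(self, capacity: int):
        self._remaining = capacity

    def allow(self, key: str) -> bool:
        time.sleep(0.01)  # simulate a blocking write + fsync
        if self._remaining > 0:
            self._remaining -= 1
            return True
        return False

    async def aallow(self, key: str) -> bool:
        # Run the blocking call on the default thread pool so the
        # event loop stays free to serve other coroutines meanwhile.
        return await asyncio.to_thread(self.allow, key)


async def demo() -> list[bool]:
    limiter = BlockingLimiter(capacity=2)
    # Awaited one after another, so this toy counter needs no lock.
    return [await limiter.aallow("api:user-42") for _ in range(3)]


results = asyncio.run(demo())
```

Here results is [True, True, False]. A wrapper that may be awaited concurrently would also need the underlying limiter to be thread-safe, which this toy class is not.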

Millisecond precision:

limiter.limit("burst:login", rate=1, per="250ms")
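
For intuition, a duration string such as "1m" or "250ms" reduces to an integer number of milliseconds. The parser below is a hypothetical sketch: only the "m" and "ms" suffixes appear in this README, so the "s" and "h" units and the exact grammar are assumptions rather than documented Flint behavior.

```python
import re

# Assumed unit table; only "ms" and "m" are shown in the README.
_UNIT_MS = {"ms": 1, "s": 1_000, "m": 60_000, "h": 3_600_000}
_DURATION = re.compile(r"(\d+)(ms|s|m|h)")


def parse_duration_ms(text: str) -> int:
    """Convert a string like '1m' or '250ms' to whole milliseconds."""
    match = _DURATION.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    value, unit = int(match.group(1)), match.group(2)
    if value == 0:
        raise ValueError("duration must be greater than zero")
    return value * _UNIT_MS[unit]


one_minute = parse_duration_ms("1m")
quarter_second = parse_duration_ms("250ms")
```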

Decorator:

@limiter.rate_limit("email:send", rate=10, per="1m", cost=1)
def send_email():
    ...

If the limit is exceeded, Flint raises:

flint.RateLimitExceeded
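
As a rough mental model of how a limiting decorator and its exception fit together, the sketch below charges a cost against a plain dictionary before each call. Both the rate_limit factory and the local RateLimitExceeded class are hypothetical stand-ins defined here for illustration; they do not reproduce Flint's decorator or its exception fields.

```python
import functools


class RateLimitExceeded(Exception):
    """Local stand-in for an exception such as flint.RateLimitExceeded."""

    def __init__(self, key: str, cost: int):
        super().__init__(f"rate limit exceeded for {key!r} (cost={cost})")
        self.key = key
        self.cost = cost


def rate_limit(budget: dict[str, int], key: str, cost: int = 1):
    """Decorator factory: charge `cost` against budget[key] before each call."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if budget.get(key, 0) < cost:
                raise RateLimitExceeded(key, cost)
            budget[key] -= cost
            return func(*args, **kwargs)
        return wrapper
    return decorator


budget = {"email:send": 2}  # toy quota with no refill


@rate_limit(budget, "email:send")
def send_email(to: str) -> str:
    return f"sent to {to}"


outcomes = []
for address in ["a@example.com", "b@example.com", "c@example.com"]:
    try:
        outcomes.append(send_email(address))
    except RateLimitExceeded as exc:
        outcomes.append(f"blocked: {exc.key}")
```

The third call is rejected before send_email runs, so callers typically catch the exception at the boundary where they can retry later or return an error.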

FastAPI Middleware

Install the optional FastAPI extra:

pip install "flint-limiter[fastapi]"

Static route limit:

from fastapi import FastAPI
import flint
from flint.fastapi import FlintRateLimitMiddleware

limiter = flint.Limiter(data_dir=".flint")
limiter.limit("route:/api", rate=100, per="1m")

app = FastAPI()
app.add_middleware(
    FlintRateLimitMiddleware,
    limiter=limiter,
    key="route:/api",
)

Dynamic per-client limit with lazy configuration:

app.add_middleware(
    FlintRateLimitMiddleware,
    limiter=limiter,
    key_func=lambda request: f"ip:{request.client.host}",
    rate=100,
    per="1m",
    exempt_paths={"/health", "/docs", "/openapi.json"},
)

Blocked requests return:

HTTP 429
{"detail": "rate limit exceeded"}

With headers:

X-RateLimit-Limit
X-RateLimit-Remaining
X-RateLimit-Reset
Retry-After
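
The following sketch shows one way these four headers could be derived from a check result. The header names come from the list above, but the value conventions are assumptions: X-RateLimit-Reset as Unix seconds rounded up, and Retry-After as whole seconds until the reset. The rate_limit_headers helper is hypothetical.

```python
import math


def rate_limit_headers(limit: int, remaining: int,
                       reset_at: float, now: float) -> dict[str, str]:
    """Build response headers for a rate-limited request (assumed semantics)."""
    # Round up so a client that waits Retry-After seconds never retries early.
    retry_after = max(0, math.ceil(reset_at - now))
    return {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(0, remaining)),
        "X-RateLimit-Reset": str(math.ceil(reset_at)),
        "Retry-After": str(retry_after),
    }


headers = rate_limit_headers(
    limit=100,
    remaining=0,
    reset_at=1_700_000_042.5,
    now=1_700_000_000.0,
)
```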

The middleware uses the same embedded persistent engine: no Redis, no daemon, no broker. Counters are stored in .flint/ and survive process restarts.


Shared Mode

Embedded mode is the default: one Python process owns .flint/ directly.

Shared mode runs one Flint server as the single writer, then lets multiple processes, workers, or services share the same persistent limits over HTTP.

Start the server:

flint --data-dir .flint-shared server start \
  --bind 127.0.0.1:7878 \
  --token dev-secret \
  --max-blocking 128 \
  --sync batch \
  --flush-every-ms 100 \
  --flush-every-events 100

Use it from Python:

import flint

limiter = flint.SharedLimiter(
    "http://127.0.0.1:7878",
    token="dev-secret",
    timeout=10.0,
)

limiter.limit("api:user-42", rate=100, per="1m")

if limiter.allow("api:user-42"):
    process_request()

Shared mode is useful when:

  • a FastAPI app has multiple worker processes;
  • several local services need the same quota;
  • a CLI, background worker, and web process must inspect the same counters;
  • you want persistent rate limiting without giving every process write access to the same .flint/ directory.

HTTP API:

Endpoint           Method  Purpose
/v1/health         GET     health check
/v1/limits         GET     list limits
/v1/limits         POST    configure a limit
/v1/limits/{key}   GET     limit status
/v1/check          POST    check/consume one limit
/v1/check-all      POST    atomic multi-limit check
/v1/reset          POST    reset a limit
/v1/log/flush      POST    force pending batch writes to disk
/v1/log/compact    POST    compact AOF into snapshot
/v1/doctor         GET     storage/runtime health

When --token is set, every request must include:

Authorization: Bearer <token>

Flint refuses to bind the shared server to a non-loopback address such as 0.0.0.0 unless a token is configured. Storage operations run on bounded blocking workers controlled by --max-blocking, so persistent writes do not block the async HTTP runtime. Server mode supports the same storage sync modes as embedded mode: --sync always for maximum durability and --sync batch for higher throughput.
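
The bind rule and the bearer check can be illustrated with two small functions. This is a sketch of the policy as described, not Flint's server code: validate_bind and is_authorized are hypothetical helpers, and the sketch assumes the host is an IP literal (a name such as "localhost" would need resolving first).

```python
import hmac
import ipaddress
from typing import Optional


def validate_bind(host: str, token: Optional[str]) -> None:
    """Refuse a non-loopback bind address unless a token is configured."""
    if not ipaddress.ip_address(host).is_loopback and not token:
        raise ValueError(f"refusing to bind {host} without a token")


def is_authorized(authorization_header: Optional[str],
                  token: Optional[str]) -> bool:
    """Check an Authorization header against the configured token."""
    if token is None:
        return True  # no token configured: requests are not authenticated
    expected = f"Bearer {token}".encode()
    supplied = (authorization_header or "").encode()
    # compare_digest avoids leaking how many leading bytes matched.
    return hmac.compare_digest(supplied, expected)


validate_bind("127.0.0.1", None)          # loopback: allowed without a token
validate_bind("0.0.0.0", "dev-secret")    # exposed, but token-protected
try:
    validate_bind("0.0.0.0", None)
    open_bind_refused = False
except ValueError:
    open_bind_refused = True
```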

For public or enterprise network exposure, put Flint behind a reverse proxy, service mesh, or load balancer that handles TLS/mTLS, certificate rotation, and network policy. See the Security Guide.

Shared mode keeps Flint's core model simple: one writer owns the data directory; other processes use the server API instead of opening the same files directly.
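
The single-writer idea can be sketched with an advisory file lock. The example below uses fcntl.flock, which is available on Unix-like systems only. It illustrates the general technique under that assumption and is not a description of how Flint manages flint.lock internally; acquire_writer_lock is a hypothetical helper.

```python
import fcntl
import os
import tempfile


def acquire_writer_lock(data_dir: str):
    """Take an exclusive, non-blocking lock on <data_dir>/flint.lock."""
    os.makedirs(data_dir, exist_ok=True)
    handle = open(os.path.join(data_dir, "flint.lock"), "a+")
    try:
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        raise RuntimeError(f"{data_dir} is already owned by another writer") from None
    return handle  # keep it open: closing the handle releases the lock


data_dir = tempfile.mkdtemp()
owner = acquire_writer_lock(data_dir)
try:
    acquire_writer_lock(data_dir)
    second_writer_blocked = False
except RuntimeError:
    second_writer_blocked = True

owner.close()                              # the first writer exits
reopened = acquire_writer_lock(data_dir)   # now the directory is free again
reopened.close()
```

Because the operating system drops the lock when the owning process exits, a crash does not leave a stale lock behind with this approach.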


Prometheus Metrics

Flint can export limiter state in Prometheus text format:

import flint

limiter = flint.Limiter(data_dir=".flint")

metrics_text = flint.prometheus_metrics(limiter)

FastAPI endpoint:

from fastapi import FastAPI
import flint

limiter = flint.Limiter(data_dir=".flint")
app = FastAPI()

flint.add_prometheus_route(app, limiter, path="/metrics")

For high-cardinality keys such as users, IPs, API keys, or tenant IDs, avoid exporting the raw key as a Prometheus label:

metrics_text = flint.prometheus_metrics(
    limiter,
    include_key_label=False,
)

flint.add_prometheus_route(
    app,
    limiter,
    path="/metrics",
    include_key_label=False,
)

Or bucket/redact keys before export:

metrics_text = flint.prometheus_metrics(
    limiter,
    key_label_func=lambda key: key.split(":")[0],
)

Example output:

# HELP flint_requests_allowed_total Total allowed checks.
# TYPE flint_requests_allowed_total counter
flint_requests_allowed_total{key="route:/api",algorithm="token_bucket"} 42
flint_requests_denied_total{key="route:/api",algorithm="token_bucket"} 3
flint_limit_remaining{key="route:/api",algorithm="token_bucket"} 58
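
For readers unfamiliar with the Prometheus text format, the sketch below renders one counter family by hand, including the label-value escaping the format requires. render_counter and escape_label_value are hypothetical helpers written for this illustration; they are not part of Flint's API.

```python
def escape_label_value(value: str) -> str:
    # The text format escapes backslash, double quote, and newline in label values.
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_counter(name: str, help_text: str,
                   samples: list[tuple[dict[str, str], int]]) -> str:
    """Render one counter family as Prometheus text exposition lines."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in samples:
        label_text = ",".join(
            f'{k}="{escape_label_value(v)}"' for k, v in labels.items()
        )
        lines.append(f"{name}{{{label_text}}} {value}")
    return "\n".join(lines) + "\n"


text = render_counter(
    "flint_requests_allowed_total",
    "Total allowed checks.",
    [({"key": "route:/api", "algorithm": "token_bucket"}, 42)],
)
```

Each distinct label combination becomes its own time series, which is why raw per-user or per-IP keys can inflate cardinality quickly.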

Exported metrics include:

flint_limit_info
flint_limit_rate
flint_limit_per_millis
flint_limit_remaining
flint_limit_reset_at_seconds
flint_requests_allowed_total
flint_requests_denied_total
flint_request_cost_allowed_total
flint_request_cost_denied_total

CLI

flint limit add "api:user-42" --rate 100 --per 1m --algorithm token_bucket
flint limit list
flint limit status "api:user-42"
flint limit check "api:user-42" --cost 5
flint limit check-all "user:42" "org:acme" --cost org:acme=10
flint limit reset "api:user-42"
flint limit history "api:user-42"
flint limit top --by denied --limit 20
flint log compact
flint doctor
flint server start --bind 127.0.0.1:7878 --token dev-secret
flint server start --bind 127.0.0.1:7878 --token dev-secret --sync batch

Use a custom data directory:

flint --data-dir /var/lib/myapp/flint limit status "api:user-42"

Algorithms

Algorithm              Use case
token_bucket           Smooth rate limiting with bursts
sliding_window_log     Precise rolling-window limits
fixed_window_counter   Simple high-throughput window counters
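
To compare how the three families behave, here are compact textbook versions of each. They are in-memory sketches with explicit millisecond timestamps, not Flint's Rust implementation. The class names and the allow(now_ms, cost) signature are hypothetical.

```python
from collections import deque


class TokenBucket:
    """Refills continuously at rate/per_ms; allows bursts up to `rate` units."""

    def __init__(self, rate: int, per_ms: int, now_ms: int):
        self.capacity = rate
        self.refill_per_ms = rate / per_ms
        self.tokens = float(rate)
        self.updated_ms = now_ms

    def allow(self, now_ms: int, cost: int = 1) -> bool:
        elapsed = now_ms - self.updated_ms
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_ms)
        self.updated_ms = now_ms
        if cost <= self.tokens:
            self.tokens -= cost
            return True
        return False


class SlidingWindowLog:
    """Keeps one timestamp per unit of cost; exact over the trailing window."""

    def __init__(self, rate: int, per_ms: int):
        self.rate, self.per_ms = rate, per_ms
        self.events = deque()

    def allow(self, now_ms: int, cost: int = 1) -> bool:
        # Drop timestamps that have left the window (now - per_ms, now].
        while self.events and self.events[0] <= now_ms - self.per_ms:
            self.events.popleft()
        if len(self.events) + cost > self.rate:
            return False
        self.events.extend([now_ms] * cost)
        return True


class FixedWindowCounter:
    """One counter per aligned window; cheap, but allows bursts at window edges."""

    def __init__(self, rate: int, per_ms: int):
        self.rate, self.per_ms = rate, per_ms
        self.window_start = 0
        self.used = 0

    def allow(self, now_ms: int, cost: int = 1) -> bool:
        start = now_ms - now_ms % self.per_ms
        if start != self.window_start:
            self.window_start, self.used = start, 0
        if self.used + cost > self.rate:
            return False
        self.used += cost
        return True


# Same traffic through all three: 2 requests allowed per 1000 ms.
timestamps = [900, 950, 999, 1000, 1001]
bucket = TokenBucket(rate=2, per_ms=1000, now_ms=900)
sliding = SlidingWindowLog(rate=2, per_ms=1000)
fixed = FixedWindowCounter(rate=2, per_ms=1000)
decisions = {
    "token_bucket": [bucket.allow(t) for t in timestamps],
    "sliding_window_log": [sliding.allow(t) for t in timestamps],
    "fixed_window_counter": [fixed.allow(t) for t in timestamps],
}
```

With this traffic the fixed window admits four requests within about 100 ms, because its counter resets at the 1000 ms boundary, while the other two admit only the first two. That edge burst is the usual price of the simpler counter.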

Storage

Flint stores state under data_dir:

.flint/
  flint.aof
  flint.snapshot
  flint.lock

The AOF records durable events:

LIMIT_CONFIGURED
ALLOW
ALLOW_ALL
DENY
RESET

On restart, Flint loads flint.snapshot when present, replays the AOF tail, and restores counters. A crash-truncated final line is ignored deterministically; corruption in the middle of the log fails loudly.

New AOF records include a SHA-256 checksum of the stored event payload. New snapshots are written inside a checksum envelope. Older v0.1/v0.2 files without checksums remain readable, while checksum mismatches fail startup loudly instead of silently accepting tampered state.
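
The recovery rules above (skip a truncated tail, reject corruption elsewhere, verify a SHA-256 checksum per record) can be sketched in a few lines. The on-disk layout used here, a hex digest, a space, and a JSON payload per line, is purely hypothetical, as are the event field names. Flint's actual AOF format is not documented in this README. The sketch also assumes a truncated write shows up as a final line with no trailing newline.

```python
import hashlib
import json


class CorruptLog(Exception):
    """Raised when a complete record fails validation."""


def encode_record(event: dict) -> str:
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(payload.encode()).hexdigest()
    return f"{digest} {payload}\n"


def decode_record(line: str) -> dict:
    digest, _, payload = line.rstrip("\n").partition(" ")
    if hashlib.sha256(payload.encode()).hexdigest() != digest:
        raise CorruptLog("checksum mismatch")
    return json.loads(payload)


def replay(text: str) -> list[dict]:
    """Return all complete events; fail loudly on any damaged complete record."""
    lines = text.splitlines(keepends=True)
    events = []
    for index, line in enumerate(lines):
        is_last = index == len(lines) - 1
        if is_last and not line.endswith("\n"):
            break  # crash-truncated tail: the final write never completed
        try:
            events.append(decode_record(line))
        except (CorruptLog, ValueError) as exc:
            raise CorruptLog(f"record {index + 1} is corrupt") from exc
    return events


log = (
    encode_record({"type": "ALLOW", "key": "api:user-42", "cost": 1})
    + encode_record({"type": "DENY", "key": "api:user-42", "cost": 1})
)

# A crash in the middle of writing a third record leaves a partial line.
truncated = log + encode_record({"type": "RESET", "key": "api:user-42"})[:20]
recovered = replay(truncated)

# Editing a payload without updating its checksum must be detected.
tampered = log.replace('"cost":1', '"cost":9', 1)
try:
    replay(tampered)
    tamper_detected = False
except CorruptLog:
    tamper_detected = True
```

Here replay(truncated) returns the two complete events and drops the partial third, while the tampered log is rejected at its first record.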

flint doctor validates the local storage files and reports the number of limits, history events, AOF bytes, and whether a snapshot is present.

Flint v0.2 uses millisecond precision internally. Older v0.1 AOF entries that stored per_seconds are migrated during replay by converting seconds to milliseconds.
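
The migration amounts to a unit conversion applied while reading old entries. In the sketch below, per_seconds and per_millis are the field names mentioned in this README; the surrounding entry shape and the migrate_limit_entry helper are hypothetical.

```python
def migrate_limit_entry(entry: dict) -> dict:
    """Return a copy of a limit-configuration entry that uses per_millis."""
    if "per_millis" in entry:
        return dict(entry)  # already in the newer shape
    migrated = {k: v for k, v in entry.items() if k != "per_seconds"}
    migrated["per_millis"] = entry["per_seconds"] * 1000
    return migrated


old_entry = {"type": "LIMIT_CONFIGURED", "key": "api:user-42",
             "rate": 100, "per_seconds": 60}
new_entry = migrate_limit_entry(old_entry)
```

The original entry is left untouched, so replay can stay read-only with respect to the old log.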

Sync Modes

Mode     Behavior                                            Use case
always   flush + fsync after every event                     maximum durability
batch    write every event, fsync every N events or N ms     higher throughput

always is the default. batch keeps writes append-only but delays fsync, so a hard crash can lose the last unsynced batch. Normal process shutdown and flush() force pending events to disk.

Metrics exposed by status() and list():

total_allowed
total_denied
total_allowed_cost
total_denied_cost
last_allowed_at
last_denied_at
last_reset_at
remaining
reset_at

cost is included in check results and rate-limit exceptions.


What Flint Replaces

Flint replaces:

  • Redis-backed rate limiting libraries;
  • in-memory Python limiters that reset on restart;
  • nginx-only HTTP rate limiting;
  • custom database counters;
  • hand-written local counters with no history.

The unique property is persistent rate limiting without Redis.


Reliability Checks

The current test suite covers the core failure paths for an embedded persistent limiter:

  • exclusive data directory locking;
  • concurrent checks on the same key;
  • 10,000 configured limits;
  • snapshot and compaction preserving status and metrics;
  • recovery from append-only log;
  • deterministic rejection of corrupted middle log records;
  • checksum rejection for tampered AOF records and snapshots;
  • v0.1 per_seconds log migration to v0.2 per_millis;
  • cost-based checks across token bucket, fixed window, and sliding window;
  • atomic multi-limit checks with no partial quota consumption;
  • FastAPI middleware static keys, dynamic keys, lazy config, weighted cost, and exempt paths;
  • Prometheus text export and FastAPI /metrics route;
  • Python decorator allowed/denied behavior;
  • RateLimitExceeded metadata;
  • CLI compact, doctor, and top;
  • Criterion benchmarks for hot path checks, many keys, AOF replay, and compaction.

Benchmarks

Flint includes Criterion benchmarks for the core Rust engine:

cargo bench -p flint-core --bench limiter

The benchmark suite covers:

  • token bucket, sliding window, and fixed window check hot paths;
  • cost-based checks;
  • atomic check_all() across multiple limits;
  • configuring 1,000 and 10,000 keys;
  • reopening from AOF with 1,000 and 10,000 events;
  • compacting AOF into a snapshot.

For a quick smoke run:

cargo bench -p flint-core --bench limiter -- --quick

Latest local quick run:

Benchmark                          Result
token bucket persistent check      ~556 µs
sliding window persistent check    ~575 µs
fixed window persistent check      ~592 µs
cost-based token bucket check      ~569 µs
check_all() over 3 limits          ~552 µs
configure 1,000 keys               ~584 ms
configure 10,000 keys              ~5.60 s
reopen from 1,000 AOF events       ~4.59 ms
reopen from 10,000 AOF events      ~43.0 ms
compact 1,000 AOF events           ~31.3 ms
compact 10,000 AOF events          ~347 ms

These numbers are from a local quick run and are mainly useful as a regression baseline. Full Criterion runs should be used when comparing releases or storage changes.


Build And Test

cargo fmt --check
cargo clippy --workspace --all-targets -- -D warnings
cargo test --workspace

Python:

python3 -m venv .venv
.venv/bin/pip install -U pip maturin pytest
.venv/bin/maturin develop
.venv/bin/python -m pytest -q tests/python

License

BSD 3-Clause.
