Embedded persistent rate limiting for Python, powered by Rust
Flint
Persistent rate limiting, embedded first.
No Redis. No broker. No daemon required for embedded mode.
Why Flint
I was tired of adding Redis just to rate limit a local Python service.
Most rate limiters are either in-memory and reset on restart, tied to HTTP proxies, or require a separate infrastructure service. Flint embeds inside the Python process and persists counter state to a local append-only log.
The result is simple:
Python process + .flint/ directory = durable rate limiting
What Flint Does
Flint is an embedded rate limiter with:
- Rust core;
- Python bindings via PyO3;
- append-only AOF persistence;
- AOF and snapshot checksum validation;
- JSON snapshot and compaction;
- single-writer data directory locking;
- GIL-aware Python API;
- CLI inspect/admin commands;
- millisecond precision;
- metrics counters;
- Python decorator API;
- FastAPI middleware;
- Prometheus metrics export;
- shared HTTP server mode;
- Python SharedLimiter client;
- configurable sync mode: always or batch;
- asyncio-friendly wrapper methods;
- atomic multi-limit checks with allow_all() / check_all();
- token bucket algorithm;
- sliding window log algorithm;
- fixed window counter algorithm;
- crash recovery from local files;
- storage doctor checks;
- v0.1 AOF compatibility for per_seconds entries;
- no Redis, broker, or cloud dependency;
- no daemon required for embedded mode.
Comparison
| Solution | Persistent | No Redis | Embedded | Observable |
|---|---|---|---|---|
| Flint | Yes | Yes | Yes | Yes |
| SlowAPI | No | Yes | Yes | No |
| redis-py + limits | No | No | No | No |
| nginx rate limit | No | Yes | No | No |
Flint's unique advantage is the combination of persistent counters and zero external infrastructure.
Install
pip install flint-limiter
The Python module is:
import flint
Quickstart
import flint
limiter = flint.Limiter(data_dir=".flint")
limiter.limit(
    "api:user-42",
    rate=100,
    per="1m",
    algorithm="token_bucket",
)
if limiter.allow("api:user-42"):
    process_request()
With context:
result = limiter.check("api:user-42", cost=1)
print(result.allowed)
print(result.cost)
print(result.remaining)
print(result.reset_at)
Cost-based checks:
# A normal request costs 1 unit.
limiter.check("api:user-42")
# Expensive work can consume more units from the same limit.
result = limiter.check("ai:user-42", cost=250)
if result.allowed:
    run_expensive_model_call()
A single check cost must be greater than zero and cannot exceed the configured rate capacity for that limit.
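The cost rule can be written as a small guard. This is only a sketch of the documented rule, not Flint's internal code: `validate_cost` is a hypothetical helper, and Flint may report the error with a different exception type.

```python
def validate_cost(cost: int, rate: int) -> int:
    """Return cost if 0 < cost <= rate, otherwise raise ValueError.

    Hypothetical helper illustrating the documented rule only.
    """
    if cost <= 0:
        raise ValueError("cost must be greater than zero")
    if cost > rate:
        raise ValueError(f"cost {cost} exceeds configured rate capacity {rate}")
    return cost


# With rate=100, a cost of 250 could never be satisfied, so it is rejected up front.
for candidate in (1, 100, 250):
    try:
        validate_cost(candidate, rate=100)
        print(candidate, "ok")
    except ValueError as exc:
        print(candidate, "rejected:", exc)
```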
Atomic multi-limit checks:
result = limiter.check_all([
    "user:42",
    {"key": "org:acme", "cost": 10},
    ("endpoint:/v1/chat", 1),
])
if result["allowed"]:
    process_request()
else:
    print("blocked by", result["denied_key"])
check_all() is all-or-nothing. If any limit denies the request, Flint records
the denied limit but does not consume quota from the other limits. allow_all()
returns only the boolean decision:
if limiter.allow_all(["user:42", "org:acme", "endpoint:/v1/chat"]):
    process_request()
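The all-or-nothing behavior can be pictured as a two-phase check: verify every limit first, then consume only if all of them pass. The sketch below uses a toy in-memory counter and is not Flint's Rust implementation. The `Counter` class and this standalone `check_all` function are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Counter:
    """Toy quota counter standing in for any real algorithm."""
    remaining: int


def check_all(counters: dict, requests: list) -> dict:
    # Sum costs per key so a key listed twice is judged on its total cost.
    totals = {}
    for key, cost in requests:
        totals[key] = totals.get(key, 0) + cost
    # Phase 1: verify every limit can afford its cost, mutating nothing.
    for key, cost in totals.items():
        if counters[key].remaining < cost:
            return {"allowed": False, "denied_key": key}
    # Phase 2: reached only when every limit passed, so consume from each.
    for key, cost in totals.items():
        counters[key].remaining -= cost
    return {"allowed": True, "denied_key": None}


counters = {"user:42": Counter(5), "org:acme": Counter(3)}
denied = check_all(counters, [("user:42", 1), ("org:acme", 10)])
print(denied, counters["user:42"].remaining)  # user:42 quota is untouched
```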
Sync mode for throughput tuning:
# Default: safest mode, fsync after every persisted event.
safe = flint.Limiter(data_dir=".flint", sync="always")
# Higher throughput: write every event, fsync in batches.
fast = flint.Limiter(
    data_dir=".flint-fast",
    sync="batch",
    flush_every_ms=100,
    flush_every_events=100,
)
fast.limit("api:user-42", rate=100, per="1m")
fast.allow("api:user-42")
fast.flush() # force fsync now
sync="batch" is useful for high-throughput services that can accept losing the
last small batch of events after a hard crash or power loss. Process shutdown
flushes pending events automatically; flush() can be called manually before
critical boundaries.
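To make the trade-off concrete, here is a minimal sketch of a batched-fsync policy: every event is written immediately, but fsync runs only after N events or N milliseconds. It is an illustration, not Flint's storage code. The `BatchedLog` class is hypothetical, and for simplicity it checks the time threshold only when an event is appended.

```python
import os
import tempfile
import time


class BatchedLog:
    """Append-only log that writes every event but fsyncs in batches."""

    def __init__(self, path, flush_every_events=100, flush_every_ms=100):
        # buffering=0 makes each append a real write to the OS.
        self._file = open(path, "ab", buffering=0)
        self._max_events = flush_every_events
        self._max_seconds = flush_every_ms / 1000.0
        self._pending = 0
        self._last_sync = time.monotonic()
        self.fsync_count = 0  # exposed only so the demo can observe batching

    def append(self, line: bytes) -> None:
        self._file.write(line + b"\n")
        self._pending += 1
        overdue = time.monotonic() - self._last_sync >= self._max_seconds
        if self._pending >= self._max_events or overdue:
            self.flush()

    def flush(self) -> None:
        if self._pending == 0:
            return
        os.fsync(self._file.fileno())  # push OS-cached writes to disk
        self._pending = 0
        self._last_sync = time.monotonic()
        self.fsync_count += 1

    def close(self) -> None:
        self.flush()
        self._file.close()


with tempfile.TemporaryDirectory() as tmp:
    # A large time threshold keeps the demo driven by the event count only.
    log = BatchedLog(os.path.join(tmp, "demo.aof"),
                     flush_every_events=100, flush_every_ms=60_000)
    for _ in range(250):
        log.append(b"ALLOW api:user-42")
    synced_by_policy = log.fsync_count  # batches completed at 100 and 200
    log.flush()                         # forces the remaining 50 events
    synced_total = log.fsync_count
    log.close()

print(synced_by_policy, synced_total)
```

The 50 events written after the second batch are exactly the kind of tail that a hard crash could lose before the explicit flush.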
Async Python API:
import asyncio
import flint
async def main():
    limiter = flint.Limiter(data_dir=".flint")
    await limiter.alimit("api:user-42", rate=100, per="1m")
    result = await limiter.acheck("api:user-42")
    if result.allowed:
        await process_request()
    status = await limiter.astatus("api:user-42")
    print(status["remaining"])

asyncio.run(main())
Async wrapper methods run the sync limiter call in a thread executor, so native
async applications can use Flint without blocking the event loop directly. For
route-level FastAPI limiting, prefer the middleware; for manual async decisions,
use acheck() / aallow().
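The executor pattern described above looks roughly like this in plain asyncio. The sketch uses a stand-in `BlockingLimiter` class rather than Flint itself, and `asyncio.to_thread` is one way to run a sync call in a worker thread; Flint's wrappers may use a different executor.

```python
import asyncio
import threading
import time


class BlockingLimiter:
    """Stand-in for a synchronous limiter whose check() waits on disk I/O."""

    def __init__(self, capacity: int):
        self._remaining = capacity
        self._lock = threading.Lock()

    def check(self, key: str, cost: int = 1) -> bool:
        time.sleep(0.01)  # pretend to wait for a persisted write
        with self._lock:
            if self._remaining >= cost:
                self._remaining -= cost
                return True
            return False


async def acheck(limiter: BlockingLimiter, key: str, cost: int = 1) -> bool:
    # The blocking call runs in a worker thread; the event loop stays free.
    return await asyncio.to_thread(limiter.check, key, cost)


async def main() -> list:
    limiter = BlockingLimiter(capacity=3)
    return await asyncio.gather(
        *(acheck(limiter, "api:user-42") for _ in range(5))
    )


results = asyncio.run(main())
print(sum(results), "allowed of", len(results))
```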
Millisecond precision:
limiter.limit("burst:login", rate=1, per="250ms")
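A duration string such as "250ms" or "1m" has to be turned into milliseconds somewhere. The parser below is a sketch under assumptions: only "1m" and "250ms" appear in this README, so the `s` and `h` suffixes and the `parse_per` name are guesses for illustration, not Flint's documented grammar.

```python
import re

# Assumed unit table; only "ms" and "m" are confirmed by the examples above.
_UNITS_MS = {"ms": 1, "s": 1_000, "m": 60_000, "h": 3_600_000}
_PATTERN = re.compile(r"^(\d+)(ms|s|m|h)$")


def parse_per(value: str) -> int:
    """Convert a duration like '250ms' or '1m' to whole milliseconds."""
    match = _PATTERN.match(value.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {value!r}")
    amount, unit = match.groups()
    millis = int(amount) * _UNITS_MS[unit]
    if millis == 0:
        raise ValueError("duration must be greater than zero")
    return millis


print(parse_per("250ms"), parse_per("1m"))
```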
Decorator:
@limiter.rate_limit("email:send", rate=10, per="1m", cost=1)
def send_email():
    ...
If the limit is exceeded, Flint raises:
flint.RateLimitExceeded
FastAPI Middleware
Install the optional FastAPI extra:
pip install "flint-limiter[fastapi]"
Static route limit:
from fastapi import FastAPI
import flint
from flint.fastapi import FlintRateLimitMiddleware
limiter = flint.Limiter(data_dir=".flint")
limiter.limit("route:/api", rate=100, per="1m")
app = FastAPI()
app.add_middleware(
    FlintRateLimitMiddleware,
    limiter=limiter,
    key="route:/api",
)
Dynamic per-client limit with lazy configuration:
app.add_middleware(
    FlintRateLimitMiddleware,
    limiter=limiter,
    key_func=lambda request: f"ip:{request.client.host}",
    rate=100,
    per="1m",
    exempt_paths={"/health", "/docs", "/openapi.json"},
)
Blocked requests return:
HTTP 429
{"detail": "rate limit exceeded"}
With headers:
X-RateLimit-Limit
X-RateLimit-Remaining
X-RateLimit-Reset
Retry-After
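For readers who want to see how such headers relate to a check result, here is a rough sketch. It rests on assumptions this README does not confirm: that the reset time is a Unix timestamp in seconds, and that the header values are plain integers. `rate_limit_headers` is a hypothetical helper, not part of Flint.

```python
import math


def rate_limit_headers(limit: int, remaining: int,
                       reset_at: float, now: float) -> dict:
    """Build the four response headers from a check result (assumed formats)."""
    # Round up so a client never retries before the limit has reset.
    retry_after = max(0, math.ceil(reset_at - now))
    return {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(0, remaining)),
        "X-RateLimit-Reset": str(math.ceil(reset_at)),
        "Retry-After": str(retry_after),
    }


headers = rate_limit_headers(
    limit=100, remaining=0, reset_at=1_700_000_030.2, now=1_700_000_000.0
)
print(headers)
```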
The middleware uses the same embedded persistent engine: no Redis, no daemon, no
broker. Counters are stored in .flint/ and survive process restarts.
Shared Mode
Embedded mode is the default: one Python process owns .flint/ directly.
Shared mode runs one Flint server as the single writer, then lets multiple processes, workers, or services share the same persistent limits over HTTP.
Start the server:
flint --data-dir .flint-shared server start \
    --bind 127.0.0.1:7878 \
    --token dev-secret \
    --max-blocking 128 \
    --sync batch \
    --flush-every-ms 100 \
    --flush-every-events 100
Use it from Python:
import flint
limiter = flint.SharedLimiter(
    "http://127.0.0.1:7878",
    token="dev-secret",
    timeout=10.0,
)
limiter.limit("api:user-42", rate=100, per="1m")
if limiter.allow("api:user-42"):
    process_request()
Shared mode is useful when:
- a FastAPI app has multiple worker processes;
- several local services need the same quota;
- a CLI, background worker, and web process must inspect the same counters;
- you want persistent rate limiting without giving every process write access to the same .flint/ directory.
HTTP API:
| Endpoint | Method | Purpose |
|---|---|---|
| /v1/health | GET | health check |
| /v1/limits | GET | list limits |
| /v1/limits | POST | configure a limit |
| /v1/limits/{key} | GET | limit status |
| /v1/check | POST | check/consume one limit |
| /v1/check-all | POST | atomic multi-limit check |
| /v1/reset | POST | reset a limit |
| /v1/log/flush | POST | force pending batch writes to disk |
| /v1/log/compact | POST | compact AOF into snapshot |
| /v1/doctor | GET | storage/runtime health |
When --token is set, every request must include:
Authorization: Bearer <token>
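A client that does not use SharedLimiter could attach the header with the standard library. The sketch below only builds the request and does not send it. The JSON body fields (`key`, `cost`) are an assumption, since this README does not document the request schema, and `build_check_request` is a hypothetical helper.

```python
import json
import urllib.request


def build_check_request(base_url: str, token: str, key: str,
                        cost: int = 1) -> urllib.request.Request:
    # Assumed body shape; check the server documentation for the real schema.
    body = json.dumps({"key": key, "cost": cost}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/check",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


request = build_check_request("http://127.0.0.1:7878", "dev-secret", "api:user-42")
print(request.get_method(), request.full_url)
print(request.get_header("Authorization"))
# urllib.request.urlopen(request, timeout=10.0) would send it to a running server.
```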
Flint refuses to bind the shared server to a non-loopback address such as
0.0.0.0 unless a token is configured. Storage operations run on bounded
blocking workers controlled by --max-blocking, so persistent writes do not
block the async HTTP runtime. Server mode supports the same storage sync modes
as embedded mode: --sync always for maximum durability and --sync batch for
higher throughput.
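The bind rule amounts to a loopback test on the host part of the address. The following sketch shows one way to express it with the standard `ipaddress` module. It is not the server's actual check, and `bind_allowed` is a hypothetical name.

```python
import ipaddress
from typing import Optional


def bind_allowed(bind: str, token: Optional[str]) -> bool:
    """Allow loopback binds always; require a token for anything else."""
    host, _, _port = bind.rpartition(":")
    host = host.strip("[]")  # IPv6 literals are written as [::1]:7878
    if host == "localhost":
        return True
    try:
        is_loopback = ipaddress.ip_address(host).is_loopback
    except ValueError:
        is_loopback = False  # unknown hostnames are treated as non-loopback
    return is_loopback or bool(token)


for bind, token in [("127.0.0.1:7878", None), ("0.0.0.0:7878", None),
                    ("0.0.0.0:7878", "dev-secret"), ("[::1]:7878", None)]:
    print(bind, "token" if token else "no token", "->", bind_allowed(bind, token))
```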
For public or enterprise network exposure, put Flint behind a reverse proxy, service mesh, or load balancer that handles TLS/mTLS, certificate rotation, and network policy. See the Security Guide.
Shared mode keeps Flint's core model simple: one writer owns the data directory; other processes use the server API instead of opening the same files directly.
Prometheus Metrics
Flint can export limiter state in Prometheus text format:
import flint
limiter = flint.Limiter(data_dir=".flint")
metrics_text = flint.prometheus_metrics(limiter)
FastAPI endpoint:
from fastapi import FastAPI
import flint
limiter = flint.Limiter(data_dir=".flint")
app = FastAPI()
flint.add_prometheus_route(app, limiter, path="/metrics")
For high-cardinality keys such as users, IPs, API keys, or tenant IDs, avoid exporting the raw key as a Prometheus label:
metrics_text = flint.prometheus_metrics(
    limiter,
    include_key_label=False,
)
flint.add_prometheus_route(
    app,
    limiter,
    path="/metrics",
    include_key_label=False,
)
Or bucket/redact keys before export:
metrics_text = flint.prometheus_metrics(
    limiter,
    key_label_func=lambda key: key.split(":")[0],
)
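The lambda above collapses every key to its prefix. If some spread within a prefix is still useful, a label function can hash the rest of the key into a fixed number of buckets, which keeps label cardinality bounded. The function below is a hypothetical example of something that could be passed as key_label_func; the bucket count and label format are arbitrary choices.

```python
import hashlib


def bucket_label(key: str, buckets: int = 16) -> str:
    """Map 'prefix:rest' to 'prefix:bucket-N' with N in [0, buckets)."""
    prefix, _, rest = key.partition(":")
    if not rest:
        return prefix  # keys without a colon are exported unchanged
    digest = hashlib.sha256(rest.encode("utf-8")).digest()
    return f"{prefix}:bucket-{digest[0] % buckets}"


# At most 16 distinct labels per prefix, however many IPs are seen.
labels = {bucket_label(f"ip:203.0.113.{n}") for n in range(200)}
print(len(labels), "distinct labels for 200 keys")
```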
Example output:
# HELP flint_requests_allowed_total Total allowed checks.
# TYPE flint_requests_allowed_total counter
flint_requests_allowed_total{key="route:/api",algorithm="token_bucket"} 42
flint_requests_denied_total{key="route:/api",algorithm="token_bucket"} 3
flint_limit_remaining{key="route:/api",algorithm="token_bucket"} 58
Exported metrics include:
flint_limit_info
flint_limit_rate
flint_limit_per_millis
flint_limit_remaining
flint_limit_reset_at_seconds
flint_requests_allowed_total
flint_requests_denied_total
flint_request_cost_allowed_total
flint_request_cost_denied_total
CLI
flint limit add "api:user-42" --rate 100 --per 1m --algorithm token_bucket
flint limit list
flint limit status "api:user-42"
flint limit check "api:user-42" --cost 5
flint limit check-all "user:42" "org:acme" --cost org:acme=10
flint limit reset "api:user-42"
flint limit history "api:user-42"
flint limit top --by denied --limit 20
flint log compact
flint doctor
flint server start --bind 127.0.0.1:7878 --token dev-secret
flint server start --bind 127.0.0.1:7878 --token dev-secret --sync batch
Use a custom data directory:
flint --data-dir /var/lib/myapp/flint limit status "api:user-42"
Algorithms
| Algorithm | Use case |
|---|---|
| token_bucket | Smooth rate limiting with bursts |
| sliding_window_log | Precise rolling-window limits |
| fixed_window_counter | Simple high-throughput window counters |
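The differences between the three are easiest to see side by side. The classes below are textbook in-memory versions written for illustration. They are not Flint's Rust core, and details such as refill rounding or window alignment are assumptions. Time is passed in explicitly so the behavior is deterministic.

```python
from collections import deque


class TokenBucket:
    """Refills continuously at rate/per_ms tokens per ms, capped at rate."""

    def __init__(self, rate: int, per_ms: int, now_ms: int = 0):
        self.rate, self.per_ms = rate, per_ms
        self.tokens = float(rate)
        self.updated_ms = now_ms

    def allow(self, now_ms: int, cost: int = 1) -> bool:
        elapsed = now_ms - self.updated_ms
        self.tokens = min(self.rate, self.tokens + elapsed * self.rate / self.per_ms)
        self.updated_ms = now_ms
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class SlidingWindowLog:
    """Keeps one timestamp per consumed unit; evicts those older than per_ms."""

    def __init__(self, rate: int, per_ms: int):
        self.rate, self.per_ms = rate, per_ms
        self.events = deque()

    def allow(self, now_ms: int, cost: int = 1) -> bool:
        while self.events and self.events[0] <= now_ms - self.per_ms:
            self.events.popleft()
        if len(self.events) + cost > self.rate:
            return False
        self.events.extend([now_ms] * cost)
        return True


class FixedWindowCounter:
    """Counts units per aligned window; the count resets at each boundary."""

    def __init__(self, rate: int, per_ms: int):
        self.rate, self.per_ms = rate, per_ms
        self.window = None
        self.used = 0

    def allow(self, now_ms: int, cost: int = 1) -> bool:
        window = now_ms // self.per_ms
        if window != self.window:
            self.window, self.used = window, 0
        if self.used + cost > self.rate:
            return False
        self.used += cost
        return True


# Two requests per second, probed around the 1000 ms window boundary.
timestamps = [900, 950, 1000, 1100, 1900, 1950]
outcomes = {}
for limiter in (TokenBucket(2, 1000), SlidingWindowLog(2, 1000), FixedWindowCounter(2, 1000)):
    outcomes[type(limiter).__name__] = [limiter.allow(t) for t in timestamps]
    print(type(limiter).__name__, outcomes[type(limiter).__name__])
```

In this run the fixed window counter admits four requests within 200 ms because the boundary at 1000 ms resets its count, while the sliding window log holds the rolling one-second total at two.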
Storage
Flint stores state under data_dir:
.flint/
    flint.aof
    flint.snapshot
    flint.lock
The AOF records durable events:
LIMIT_CONFIGURED
ALLOW
ALLOW_ALL
DENY
RESET
On restart, Flint loads flint.snapshot when present, replays the AOF tail, and
restores counters. A crash-truncated final line is ignored deterministically;
corruption in the middle of the log fails loudly.
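The recovery rule can be sketched as a replay loop that tolerates a torn final line but rejects damage anywhere else. The example assumes one JSON object per line purely for illustration. The real AOF record format is not described in this README, and `replay` and `CorruptLogError` are hypothetical names.

```python
import json


class CorruptLogError(Exception):
    """Raised when a record other than a torn final line fails to parse."""


def replay(raw: bytes) -> list:
    lines = raw.split(b"\n")
    # A log ending in a newline has a complete final record.
    complete_tail = raw.endswith(b"\n")
    if lines and lines[-1] == b"":
        lines.pop()
    events = []
    for index, line in enumerate(lines):
        is_last = index == len(lines) - 1
        try:
            events.append(json.loads(line))
        except ValueError:
            if is_last and not complete_tail:
                break  # crash-truncated final write: ignore it
            raise CorruptLogError(f"bad record at line {index + 1}")
    return events


good = b'{"type":"ALLOW","key":"a"}\n{"type":"DENY","key":"a"}\n'
torn = good + b'{"type":"ALL'
print(len(replay(good)), len(replay(torn)))
```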
New AOF records include a SHA-256 checksum of the stored event payload. New snapshots are written inside a checksum envelope. Older v0.1/v0.2 files without checksums remain readable, while checksum mismatches fail startup loudly instead of silently accepting tampered state.
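As an illustration of the idea, a record can carry the SHA-256 of its payload and be rejected when the two disagree. The envelope layout below (`payload` and `sha256` fields in a JSON object) is an assumption made for the sketch, not Flint's on-disk format.

```python
import hashlib
import hmac
import json


def encode_record(event: dict) -> str:
    """Serialize an event and wrap it with the SHA-256 of its payload."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return json.dumps({"payload": payload, "sha256": digest})


def decode_record(line: str) -> dict:
    """Return the event, or raise ValueError if the checksum does not match."""
    envelope = json.loads(line)
    payload = envelope["payload"]
    expected = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    if not hmac.compare_digest(expected, envelope["sha256"]):
        raise ValueError("checksum mismatch")
    return json.loads(payload)


line = encode_record({"type": "ALLOW", "key": "api:user-42", "cost": 1})
print(decode_record(line))

tampered = line.replace("ALLOW", "RESET")
try:
    decode_record(tampered)
except ValueError as exc:
    print("rejected:", exc)
```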
flint doctor validates the local storage files and reports the number of
limits, history events, AOF bytes, and whether a snapshot is present.
Flint v0.2 uses millisecond precision internally. Older v0.1 AOF entries that
stored per_seconds are migrated during replay by converting seconds to
milliseconds.
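The conversion itself is a multiplication by 1,000. A sketch, assuming a dict-shaped record whose other fields are carried over unchanged (the record shape and the `migrate_limit_record` name are illustrative):

```python
def migrate_limit_record(record: dict) -> dict:
    """Return a copy of the record that uses per_millis instead of per_seconds."""
    if "per_millis" in record:
        return dict(record)  # already in the v0.2 shape
    migrated = {k: v for k, v in record.items() if k != "per_seconds"}
    migrated["per_millis"] = int(record["per_seconds"]) * 1000
    return migrated


old = {"key": "api:user-42", "rate": 100, "per_seconds": 60}
print(migrate_limit_record(old))
```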
Sync Modes
| Mode | Behavior | Use case |
|---|---|---|
| always | flush + fsync after every event | maximum durability |
| batch | write every event, fsync every N events or N ms | higher throughput |
always is the default. batch keeps writes append-only but delays fsync, so a
hard crash can lose the last unsynced batch. Normal process shutdown and
flush() force pending events to disk.
Metrics exposed by status() and list():
total_allowed
total_denied
total_allowed_cost
total_denied_cost
last_allowed_at
last_denied_at
last_reset_at
remaining
reset_at
cost is included in check results and rate-limit exceptions.
What Flint Replaces
Flint replaces:
- Redis-backed rate limiting libraries;
- in-memory Python limiters that reset on restart;
- nginx-only HTTP rate limiting;
- custom database counters;
- hand-written local counters with no history.
The unique property is persistent rate limiting without Redis.
Reliability Checks
The current test suite covers the core failure paths for an embedded persistent limiter:
- exclusive data directory locking;
- concurrent checks on the same key;
- 10,000 configured limits;
- snapshot and compaction preserving status and metrics;
- recovery from append-only log;
- deterministic rejection of corrupted middle log records;
- checksum rejection for tampered AOF records and snapshots;
- v0.1 per_seconds log migration to v0.2 per_millis;
- cost-based checks across token bucket, fixed window, and sliding window;
- atomic multi-limit checks with no partial quota consumption;
- FastAPI middleware static keys, dynamic keys, lazy config, weighted cost, and exempt paths;
- Prometheus text export and FastAPI /metrics route;
- Python decorator allowed/denied behavior;
- RateLimitExceeded metadata;
- CLI compact, doctor, and top;
- Criterion benchmarks for hot path checks, many keys, AOF replay, and compaction.
Benchmarks
Flint includes Criterion benchmarks for the core Rust engine:
cargo bench -p flint-core --bench limiter
The benchmark suite covers:
- token bucket, sliding window, and fixed window check hot paths;
- cost-based checks;
- atomic check_all() across multiple limits;
- configuring 1,000 and 10,000 keys;
- reopening from AOF with 1,000 and 10,000 events;
- compacting AOF into a snapshot.
For a quick smoke run:
cargo bench -p flint-core --bench limiter -- --quick
Latest local quick run:
| Benchmark | Result |
|---|---|
| token bucket persistent check | ~556 us |
| sliding window persistent check | ~575 us |
| fixed window persistent check | ~592 us |
| cost-based token bucket check | ~569 us |
| check_all() over 3 limits | ~552 us |
| configure 1,000 keys | ~584 ms |
| configure 10,000 keys | ~5.60 s |
| reopen from 1,000 AOF events | ~4.59 ms |
| reopen from 10,000 AOF events | ~43.0 ms |
| compact 1,000 AOF events | ~31.3 ms |
| compact 10,000 AOF events | ~347 ms |
These numbers are from a local quick run and are mainly useful as a regression baseline. Full Criterion runs should be used when comparing releases or storage changes.
Documentation
Build And Test
cargo fmt --check
cargo clippy --workspace --all-targets -- -D warnings
cargo test --workspace
Python:
python3 -m venv .venv
.venv/bin/pip install -U pip maturin pytest
.venv/bin/maturin develop
.venv/bin/python -m pytest -q tests/python
License
BSD 3-Clause.