Semaphore-based concurrency limiter for FastAPI
Project description
fastapi-concurrency-limiter
Semaphore-based concurrency limiter for FastAPI. Define named resources with a maximum concurrency, apply them to routes with a decorator, and get automatic 503 responses when the server is too busy.
Why
When a shared resource — such as a database or filesystem — receives more concurrent requests than it can handle, its performance degrades. This overload can cascade across the entire application, starving unrelated endpoints of capacity.
The classic solution is rate limiting (e.g. fastapi-limiter), which caps how many requests an endpoint accepts per unit of time. This library takes a different approach: concurrency limiting. Each resource is backed by a semaphore that allows only a fixed number of requests to run simultaneously. Excess requests queue in Python's async runtime rather than being rejected immediately, and a configurable timeout ensures that requests which wait too long are terminated with an HTTP 503 instead of going stale.
Example with benchmark
The API used for this test has two endpoints:
messages— performs a database queryfile— reads from the filesystem
The benchmark is implemented with Grafana k6.
File endpoint:
- 2 virtual users (VUs) continuously hit the
fileendpoint throughout the entire test as a baseline
Messages endpoint:
- 10 VUs loop on the
messagesendpoint to show normal operation - VU count on
messagesspikes to 150 to simulate overload - VU count drops back to 10 to observe recovery
Without limiting
The overloaded messages endpoint heavily affects the file endpoint. At peak load there are virtually no successful requests from either endpoint, because responses arrive too late for the client. When load drops, the system takes around 10 seconds to recover — stale requests queued during the spike must drain before throughput returns to normal.
With limiting
Even at steady state, performance is better: capping the database to 5 concurrent requests lets it operate more efficiently and use fewer resources, which benefits the file endpoint as well. When load spikes, the file endpoint is unaffected — system load stays bounded. Excess requests are rejected with HTTP 503 before they can go stale, so the error count stays well below the request count and errors are predictable 503s rather than opaque timeouts. When load returns to normal, throughput recovers immediately because there is no backlog of stale requests.
Installation
pip install fastapi-concurrency-limiter
Usage
from fastapi import FastAPI
from fastapi_concurrency_limiter import Limiter, Resource
app = FastAPI()
database_resource = Resource(5) # max 5 concurrent DB operations
file_resource = Resource(10) # max 10 concurrent file operations
# Register resources with the limiter in the order they should be acquired.
# Always use this same order in @limiter.resources() to prevent deadlocks.
limiter = Limiter(timeout=5, resources=[database_resource, file_resource])
@app.get("/messages")
@limiter.resources([database_resource])
async def read_messages():
...
@app.get("/file")
@limiter.resources([file_resource])
async def read_file():
...
@app.get("/messages-and-file")
@limiter.resources([database_resource, file_resource]) # must match registration order
async def read_messages_and_file():
...
When a request cannot acquire a semaphore within timeout seconds it receives:
{"detail": "Server busy, please retry later"}
with HTTP status 503.
Resource ordering and deadlocks
When an endpoint acquires multiple resources, always pass them to @limiter.resources() in the same order they were registered in Limiter(resources=[...]). Passing them in a different order raises a ValueError at startup — this is intentional: consistent acquisition order is the simplest way to prevent deadlocks.
limiter = Limiter(timeout=5, resources=[db, fs])
@app.get("/ok")
@limiter.resources([db, fs]) # correct
async def ok(): ...
@app.get("/bad")
@limiter.resources([fs, db]) # raises ValueError at startup
async def bad(): ...
API
Resource(capacity: int)
A semaphore with the given concurrency limit.
Limiter(timeout: float, resources: list[Resource])
timeout— seconds to wait for semaphore acquisition before returning 503.resources— all resources managed by this limiter, in canonical acquisition order.
@limiter.resources(resources: list[Resource])
Route decorator. Acquires the listed resources before the handler runs and releases them after, even on exception.
License
MIT
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file fastapi_concurrency_limiter-0.1.1.tar.gz.
File metadata
- Download URL: fastapi_concurrency_limiter-0.1.1.tar.gz
- Upload date:
- Size: 4.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.3.2 CPython/3.14.4 Linux/7.0.0-27-generic
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
eb4692492fab526ad5a2e05e53a7eba3ba2f6947c1b3c015efde93e7143b08f3
|
|
| MD5 |
df9921658cd31c2cbe187fd276dc6cd0
|
|
| BLAKE2b-256 |
a38037a40d5d86aa68f90445ec2c84bbc23ab25e3488daf1cacea18386447d9e
|
File details
Details for the file fastapi_concurrency_limiter-0.1.1-py3-none-any.whl.
File metadata
- Download URL: fastapi_concurrency_limiter-0.1.1-py3-none-any.whl
- Upload date:
- Size: 6.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.3.2 CPython/3.14.4 Linux/7.0.0-27-generic
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
6891e6dca7f819f9ab1b5348b8d956f3af28d45044dcfbbb4e7b4bb1eba2cf7b
|
|
| MD5 |
b09cb3ee312a0ae4dea0aaebbeacb3ac
|
|
| BLAKE2b-256 |
3e672c173fac8237c6fc51ba44c8597cb792d53ef1f64a0399216fdda0c2b275
|