Skip to main content

Distributed async rate limiters, using Redis

Project description




distributed async client rate limiters


PyPI test status coverage python version

This package implements an async distributed semaphore, and the token bucket algorithm. Both implementations are FIFO, and require redis.

Between them, the two implementations make it possible to regulate traffic with respect to:

  • Concurrency-based limits (5 active requests at the same time), or

  • Time-based limits (5 requests every 10 seconds)

This was written for rate-limiting, but the semaphore and token bucket implementations can be used for anything.

Installation

pip install self-limiters

Usage

Both implementations are written as async context managers.

Semaphore

The Semaphore context manager is used as follows:

from self_limiters import Semaphore


# allows 10 concurrent requests
concurrency_limited_queue = Semaphore(  # <-- init
    name="foo",
    capacity=10,
    max_sleep=15,
    redis_url="redis://localhost:6379"
)

async def main():
    async with concurrency_limited_queue:  # <-- usage
        client.get(...)

A MaxSleepExceededError is raised after max_sleep seconds, if max_sleep is specified.

Token bucket

The TokenBucket context manager is used like this:

from self_limiters import TokenBucket

# allows 10 requests per minute
time_limited_queue = TokenBucket(  # <-- init
    name="unique-resource-name",
    capacity=10,
    refill_frequency=60,
    refill_amount=10,  #
    max_sleep=600,
    redis_url="redis://localhost:6379"
)

async def main():
    async with time_limited_queue:  # <-- usage
        client.get(...)

If max_sleep is set and the sleep time exceeds max-sleep a MaxSleepExceededError is raised.

Decorators

The package doesn't ship any decorators, but if you want to apply these to a function, you can roll your own, like this:

from self_limiters import Semaphore


def limit(name, capacity):
    # Initialize semaphore or token bucket
    limiter = Semaphore(name=name, capacity=capacity, redis_url="redis://127.0.0.1:6389")

    def middle(f):
        async def inner(*args, **kwargs):
            # Use context manager within the decorator
            async with limiter:
                return await f(*args, **kwargs)
        return inner
    return middle


# The decorator will then ensure we sleep until ready
@limit(name="my-user-api", capacity=50)
def retrieve_user(id: UUID) -> User:
    ...

Implementation breakdown

The package was written for performance, and it seems that the most performant way to implement these algorithms is by leveraging Lua scripts. I initially wrote this in pure Rust, but Lua scripts present a couple of really great benefits:

  • Lua scripts are executed on a redis instance, and lets us implement close to the entire implementation logic in a single script. This means our client can make 1 request to redis to run the script instead of 1 request per redis call needed. The time saved by reducing the number of requests is huge.

  • The initial rust implementation (pre-lua scripts) had to use the redlock algorithm to ensure fairness. With Lua scripts (since redis instance are single threaded), our implementations are FIFO out of the box.

    This makes our implementation a lot faster, since we no longer need locks, and it simultaneously makes the code much, much simpler.

With Lua scripts, this is how our flows ended up being:

The semaphore implementation

  1. Run create_semaphore.lua to create a list, which will be the foundation of our semaphore. This is skipped if it has already been created.
  2. Run BLPOP to non-blockingly wait until the semaphore has capacity. When it does, we pop from the list.
  3. Then run another release_semaphore.lua to "release" the semaphore by adding back the capacity we popped.

So in total we make 3 calls to redis (we would have made 6 without the scripts), which are all non-blocking.

The token bucket implementation

Here, things are even simpler. The steps are:

  1. Run schedule.lua to retrieve a wake-up time.
  2. Sleep until then.

We make 1 call instead of 3, and both of these are also non-blocking.

In other words, the limiters' impact on an application event-loop should be completely negligible.

Benchmarks

When testing on Github actions we get sub-millisecond runtimes:

  • processing 100 nodes with the semaphore implementation takes ~0.6ms per instance
  • processing 100 nodes with the token bucket implementation takes ~0.03ms per instance

The majority of this should also still be waiting for i/o.

Implementation details

This section breaks down how both implementations are written, in detail. It assumes you know how a Python (async) context manager works. If this is new to you, take a look here first!

The semaphore implementation

The semaphore implementation is useful when you need to limit a process to n concurrent actions. For example if you have 10 web servers, and you're interacting with an API that will only tolerate 5 concurrent requests before locking you out.

The flow can be broken down as follows:

  • The create_semaphore Lua script calls SETNX on the key of the queue plus a postfix (if the name specified in the class instantiation is "my-queue", then the queue name will be __self-limiters:my-queue and setnx will be called for __self-limiters:my-queue-exists). If the returned value is 1 it means the queue we will use for our semaphore does not exist yet and needs to be created.

    (It might strike you as weird to maintain a separate value, just to indicate whether a list exists, when we could just check the list itself. It would be nice if we could use EXISTS on the list directly, but unfortunately a list is considered not to exist when all elements are popped (i.e., when a semaphore is fully acquired), so I don't see another way of doing this. Contributions are very welcome if you do!

  • Then if the queue needs to be created we call RPUSH with the number of arguments equal to the capacity value used when initializing the semaphore instance. For a semaphore with a capacity of 5, we call RPUSH 1 1 1 1 1, where the values are completely arbitrary.

  • Once the list/queue has been created, we call BLPOP to block until it's our turn. BLPOP is FIFO by default. We also make sure to specify the max_sleep based on the initialized semaphore instance setting. If nothing was passed we allow sleeping forever.

  • On __aexit__ we call another script to RPUSH a 1 value back into the queue (i.e., release the semaphore) and set an expiry on the queue and the string value we called SETNX on.

    The expires are a half measure for dealing with dropped capacity. If a node holding the semaphore dies, the capacity might never be returned. If, however, there is no one using the semaphore for the duration of the expiry value, all values will be cleared, and the semaphore will be recreated at full capacity next time it's used. The expiry is 30 seconds at the time of writing, but could be made configurable.

The token bucket implementation

The token bucket implementation is useful when you need to limit a process by time. For example, to 1 request per minute, or 500 every 10 seconds.

The implementation is forward-looking. It works out the time there would have been capacity in the bucket for a given client and returns that time. From there we can asynchronously sleep until it's time to perform our rate limited action.

The flow can be broken down as follows:

  • Call the schedule Lua script which first GETs the state of the bucket.

    The bucket state contains the last slot scheduled and the number of tokens left for that slot. With a capacity of 1, having a tokens_left_for_slot variable makes no sense, but if there's capacity of 2 or more, it is possible that we will need to schedule multiple clients to the same slot.

    The script then works out whether to decrement the tokens_left_for_slot value, or to increment the slot value wrt. the frequency variable.

    Finally, we store the bucket state again using SETEX. This allows us to store the state and set expiry at the same time. The default expiry is 30 at the time of writing, but could be made configurable.

    One thing to note, is that this would not work if it wasn't for the fact that redis is single threaded, so Lua scripts on Redis are FIFO. Without this we would need locks and a lot more logic.

  • Then we just sleep. Simple!

Contributing

Please do!

Feedback on the implementation, issues, and PRs are welcome. See CONTRIBUTING.md for more details.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distributions

self_limiters-0.1.2-cp38-abi3-win_amd64.whl (588.0 kB view details)

Uploaded CPython 3.8+ Windows x86-64

self_limiters-0.1.2-cp38-abi3-win32.whl (562.2 kB view details)

Uploaded CPython 3.8+ Windows x86

self_limiters-0.1.2-cp38-abi3-musllinux_1_1_x86_64.whl (889.4 kB view details)

Uploaded CPython 3.8+ musllinux: musl 1.1+ x86-64

self_limiters-0.1.2-cp38-abi3-musllinux_1_1_aarch64.whl (843.3 kB view details)

Uploaded CPython 3.8+ musllinux: musl 1.1+ ARM64

self_limiters-0.1.2-cp38-abi3-manylinux_2_24_armv7l.whl (644.8 kB view details)

Uploaded CPython 3.8+ manylinux: glibc 2.24+ ARMv7l

self_limiters-0.1.2-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (719.1 kB view details)

Uploaded CPython 3.8+ manylinux: glibc 2.17+ x86-64

self_limiters-0.1.2-cp38-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (667.2 kB view details)

Uploaded CPython 3.8+ manylinux: glibc 2.17+ ARM64

self_limiters-0.1.2-cp38-abi3-manylinux_2_12_i686.manylinux2010_i686.whl (784.3 kB view details)

Uploaded CPython 3.8+ manylinux: glibc 2.12+ i686

self_limiters-0.1.2-cp38-abi3-macosx_11_0_arm64.whl (602.6 kB view details)

Uploaded CPython 3.8+ macOS 11.0+ ARM64

self_limiters-0.1.2-cp38-abi3-macosx_10_7_x86_64.whl (634.8 kB view details)

Uploaded CPython 3.8+ macOS 10.7+ x86-64

File details

Details for the file self_limiters-0.1.2-cp38-abi3-win_amd64.whl.

File metadata

File hashes

Hashes for self_limiters-0.1.2-cp38-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 80fd0ab3df300284205811c5eb478843378dc77130f85506b8456c1c0c5c118e
MD5 c16f366b46bb71dfd709b7c2e84bd8a0
BLAKE2b-256 b9eb6db5abe549abc2ffc1d10d79c3026de89ec9856a74e10ee5246534a86e4e

See more details on using hashes here.

File details

Details for the file self_limiters-0.1.2-cp38-abi3-win32.whl.

File metadata

  • Download URL: self_limiters-0.1.2-cp38-abi3-win32.whl
  • Upload date:
  • Size: 562.2 kB
  • Tags: CPython 3.8+, Windows x86
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.10.7

File hashes

Hashes for self_limiters-0.1.2-cp38-abi3-win32.whl
Algorithm Hash digest
SHA256 b72afa5d334b41954f1411e03d98751629c4bef7e70f058d647c90a18c2c5a62
MD5 84ea77fbcd37a4a33aa4737a45063525
BLAKE2b-256 d567c9b1d96e79e00d802b9fcde9236b229b559cf44afc3db14575f09d860948

See more details on using hashes here.

File details

Details for the file self_limiters-0.1.2-cp38-abi3-musllinux_1_1_x86_64.whl.

File metadata

File hashes

Hashes for self_limiters-0.1.2-cp38-abi3-musllinux_1_1_x86_64.whl
Algorithm Hash digest
SHA256 1bdb09d8342f453359e591837c710d4fbbbee01a155110f118772a064612e7ec
MD5 13c7e5836c55f399c71d3c26a590a649
BLAKE2b-256 5eac3400050af030ef3bf931a2f6f6180b1f0440365677e6fbcb2be9d79c4da8

See more details on using hashes here.

File details

Details for the file self_limiters-0.1.2-cp38-abi3-musllinux_1_1_aarch64.whl.

File metadata

File hashes

Hashes for self_limiters-0.1.2-cp38-abi3-musllinux_1_1_aarch64.whl
Algorithm Hash digest
SHA256 7ab1e9d44f5427e3b6cb3202b0d2b0699cee423b70ee920264752eb82f29fc1c
MD5 8463f43fe15333956d6dcb88614c84b0
BLAKE2b-256 207573e706326178e348a27e91c12ba9ed20005c1ada1b3d009afb80944a60f0

See more details on using hashes here.

File details

Details for the file self_limiters-0.1.2-cp38-abi3-manylinux_2_24_armv7l.whl.

File metadata

File hashes

Hashes for self_limiters-0.1.2-cp38-abi3-manylinux_2_24_armv7l.whl
Algorithm Hash digest
SHA256 bb0ec37f36a9a33f0ffa08d7f05367ad9ef3857b99657af08ae0da1a979a1585
MD5 e4e0509b4e820f079f3a964b180b4067
BLAKE2b-256 3c59ba4e64c257fc69bd84c27d3b8fbbe78b2e4b07d8dfb7356f10792f4bc625

See more details on using hashes here.

File details

Details for the file self_limiters-0.1.2-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for self_limiters-0.1.2-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 b28af904e747a36f99d95942dc3525b81a1a4b3e62c0a89c5fd36ae180367495
MD5 53be3751ba243477e81ebc3e44575abf
BLAKE2b-256 186d000a328483815e2ea45e4714207b68ed03367e476bfd9f239b506c3f73b6

See more details on using hashes here.

File details

Details for the file self_limiters-0.1.2-cp38-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File metadata

File hashes

Hashes for self_limiters-0.1.2-cp38-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 e516b564d3133ba32ef0516b948d0be7b3c6eece03e366159c8280e14ec8ef0e
MD5 e8799e87e49a2f2d9c2c66193ae714be
BLAKE2b-256 bfb1836a396b9bf415e6ef4b15141e841424afb13af9a047d5e5147d1c397c13

See more details on using hashes here.

File details

Details for the file self_limiters-0.1.2-cp38-abi3-manylinux_2_12_i686.manylinux2010_i686.whl.

File metadata

File hashes

Hashes for self_limiters-0.1.2-cp38-abi3-manylinux_2_12_i686.manylinux2010_i686.whl
Algorithm Hash digest
SHA256 89fa995be00c34aee157abbf49db2e83e836500f101dd1ab013b6586463852bd
MD5 d55c987965687b916f7adb01d9e1c0da
BLAKE2b-256 4e089f5c3b9536cded970816ab2cd7512dd2aa7220ed7e4771d73adad1588718

See more details on using hashes here.

File details

Details for the file self_limiters-0.1.2-cp38-abi3-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for self_limiters-0.1.2-cp38-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 5af448c4e70c099e8ab21fb0b77534970e9123be4ba8269124ec6daded197df8
MD5 690fc286d4296acd30b5bce6efc29e12
BLAKE2b-256 20f75e2494f8bbef076970e8da3aed9f864cd7a4a7e70dd3490ce63e79396e8f

See more details on using hashes here.

File details

Details for the file self_limiters-0.1.2-cp38-abi3-macosx_10_7_x86_64.whl.

File metadata

File hashes

Hashes for self_limiters-0.1.2-cp38-abi3-macosx_10_7_x86_64.whl
Algorithm Hash digest
SHA256 6cf3d3812f431a3521c570905ec81c523826968bbc7621749d697903c89ef28d
MD5 c7aa5e42d732f7fb59eb802ca99c03dc
BLAKE2b-256 4a42e7d2ae54d2d9a605cb7be5405b9d75fab4c3e0709a242f9ab15bd4b2f4b2

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page