LLM ratelimiter for Python.

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

rateLLMiter

rateLLMiter smooths out requests

rateLLMiter is a Python package that smooths out requests to LLM APIs to get faster, more consistent performance. If an LLM client generates too many rate limit exceptions, the LLM server is likely to throttle the client. rateLLMiter prevents throttling by:

Limiting the number of requests per second to the per-minute rate limit divided by 60.

Ramping up requests over several seconds whenever there is a sudden increase in requests. This prevents rate limit exceptions.

Periodically testing the LLM server after a rate limit exception to see whether it is accepting requests again. Once it is, rateLLMiter releases the requests that had rate limit exceptions first.
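
The first two points can be illustrated with a little arithmetic. The following is only a minimal sketch, not rateLLMiter's actual implementation: the function names, the `ramp_seconds` parameter, and the choice of a linear ramp are assumptions made for illustration.

```python
def per_second_limit(requests_per_minute: float) -> float:
    """Spread a per-minute rate limit evenly across 60 seconds."""
    return requests_per_minute / 60.0


def ramp_up_schedule(requests_per_minute: float, ramp_seconds: int = 5) -> list[float]:
    """Requests allowed in each of the first `ramp_seconds` seconds of a burst.

    Assumption: a linear ramp that reaches the per-second cap in the last
    second. The real library may use a different curve.
    """
    cap = per_second_limit(requests_per_minute)
    return [cap * (second + 1) / ramp_seconds for second in range(ramp_seconds)]


# With the default limit of 300 requests per minute, the cap is 5 per second.
schedule = ramp_up_schedule(300, ramp_seconds=5)
print(schedule)  # [1.0, 2.0, 3.0, 4.0, 5.0]
```

Under these assumptions, a sudden burst would be admitted at one request in the first second and reach the full five requests per second only after five seconds, rather than hitting the server all at once.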

Installation

Setup Virtual Environment

I recommend setting up a virtual environment to isolate Python dependencies.

python3 -m venv .venv
source .venv/bin/activate

Install Package

Install the package from PyPI. This takes a while because it also installs the Python clients of multiple LLMs:

pip install ratellmiter-ai

Startup and Shutdown

rateLLMiter has a monitor thread and logging that need to be started and stopped.

        get_rate_limiter_monitor().start()
        get_rate_limiter_monitor().stop()

Calling stop before exiting stops the monitor thread and writes out logs that can be used to create graphs of the rate limiting. By default, logs are written to the "ratellmiter_logs" subdirectory of the current working directory, and the default rate limit is 300 requests per minute. You can change both by passing values to the start method.

        get_rate_limiter_monitor().start(default_rate_limit=300, log_directory="ratellmiter_logs")
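
Because the logs are only written when stop is called, it may be worth making sure stop runs even if your program raises. The sketch below is one possible way to do that with a context manager. The helper name `running_monitor` is hypothetical and not part of rateLLMiter; it relies only on the `start` and `stop` methods shown above, and `FakeMonitor` is a stand-in so the example runs without the package installed.

```python
from contextlib import contextmanager


@contextmanager
def running_monitor(monitor, **start_kwargs):
    """Start `monitor`, then always stop it, even if the body raises.

    Hypothetical helper: `monitor` is any object with start(**kwargs) and
    stop(), such as the object returned by get_rate_limiter_monitor().
    """
    monitor.start(**start_kwargs)
    try:
        yield monitor
    finally:
        monitor.stop()


class FakeMonitor:
    """Stand-in used only to exercise the helper; not part of rateLLMiter."""

    def __init__(self):
        self.events = []

    def start(self, **kwargs):
        self.events.append(("start", kwargs))

    def stop(self):
        self.events.append(("stop", {}))


fake = FakeMonitor()
with running_monitor(fake, default_rate_limit=300):
    pass  # Your LLM calls would go here.
print(fake.events)
```

In real code you would pass `get_rate_limiter_monitor()` instead of the stand-in, along with any of the start parameters described above.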

The easy way to use rateLLMiter

The easiest way to use rateLLMiter is to use the ratellmiter decorator. If you are only using one LLM client, you only need to use the decorator. You can change the default rate limit by setting the default_rate_limit parameter on startup.

        @ratellmiter
        def get_response(prompt):
            # Your code that calls the LLM goes here
            ...

Generating graphs

It can be helpful to see what rateLLMiter is doing. You can generate graphs of the rate limiting by running the following command in your venv:

ratellmiter -model=? -file=? -lines=?

-model: The name of the model you want to graph. If not specified, "default" is used.

-file: The source file for the data to graph. By default, the most recent log file with data is used.

-lines: The lines to draw on the graph. For example, "ri" draws the requests line and the issued tickets line. The other options are e=rate limit exceptions, f=finished requests, and o=overflow requests.
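
If you generate graphs from a script, a small helper can assemble the command and catch mistyped line codes early. This is only a sketch: `build_graph_command` and `LINE_CODES` are hypothetical names, and the flag syntax and line codes are taken directly from the option descriptions above rather than from the tool's source.

```python
# Line codes as described in the -lines option above.
LINE_CODES = {
    "r": "requests",
    "i": "issued tickets",
    "e": "rate limit exceptions",
    "f": "finished requests",
    "o": "overflow requests",
}


def build_graph_command(model=None, file=None, lines=None):
    """Return the ratellmiter command as an argument list.

    Options left as None are omitted so the tool falls back to its defaults.
    """
    unknown = set(lines or "") - set(LINE_CODES)
    if unknown:
        raise ValueError(f"unknown line codes: {sorted(unknown)}")
    args = ["ratellmiter"]
    if model is not None:
        args.append(f"-model={model}")
    if file is not None:
        args.append(f"-file={file}")
    if lines is not None:
        args.append(f"-lines={lines}")
    return args


print(build_graph_command(model="default", lines="ri"))
# ['ratellmiter', '-model=default', '-lines=ri']
```

The resulting list can be passed to `subprocess.run` from inside the virtual environment where the package is installed.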

Release history

- 0.1.23: Jan 30, 2025
- 0.1.22: Oct 13, 2024
- 0.1.21: Oct 13, 2024
- 0.1.20: Oct 13, 2024
- 0.1.19: Oct 9, 2024
- 0.1.18: Oct 9, 2024
- 0.1.17: Oct 7, 2024
- 0.1.16: Sep 16, 2024
- 0.1.15: Sep 16, 2024
- 0.1.14: Sep 16, 2024
- 0.1.8: Sep 15, 2024 (this version)
- 0.1.7: Sep 15, 2024
- 0.1.6: Sep 15, 2024

Download files

Source Distribution

ratellmiter_ai-0.1.8.tar.gz (12.7 kB)

Uploaded Sep 15, 2024 Source

Built Distribution

rateLLMiter_ai-0.1.8-py3-none-any.whl (14.8 kB)

Uploaded Sep 15, 2024 Python 3

File details

Details for the file ratellmiter_ai-0.1.8.tar.gz.

File metadata

Download URL: ratellmiter_ai-0.1.8.tar.gz
Upload date: Sep 15, 2024
Size: 12.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.12.0

File hashes

Hashes for ratellmiter_ai-0.1.8.tar.gz
Algorithm	Hash digest
SHA256	`8419c6c97ba8c168e9c412213a4c713ab6b09622ba5764644c217ea4fa5e5718`
MD5	`cbeb76c57dda12288e03e574962d50e7`
BLAKE2b-256	`9a2ac3e7dcf2788e71d28d22b90b2210685ba24d9e8aed34d1eea0a6953547cf`

File details

Details for the file rateLLMiter_ai-0.1.8-py3-none-any.whl.

File metadata

Download URL: rateLLMiter_ai-0.1.8-py3-none-any.whl
Upload date: Sep 15, 2024
Size: 14.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.12.0

File hashes

Hashes for rateLLMiter_ai-0.1.8-py3-none-any.whl
Algorithm	Hash digest
SHA256	`86b03050d40864b5992c85e1393dddc6001c13ab458c5ca8d114b364663294a9`
MD5	`e0662dbe7b62a67d2b5ea0fb23044c3f`
BLAKE2b-256	`52856b016f37c3f16125a391ff8fe292834562ee1816200db5740f0d4917b7a6`
