Skip to main content

A wrapper engine to enforce request-based, token-based, and concurrency-based limits on kani engines.

Project description

kani-ratelimits

This is a simple, small package to to enforce request-per-minute (RPM), token-per-minute (TPM), and/or max-concurrency ratelimits before making requests to an underlying engine.

Installation

pip install kani-ratelimits

Usage

from kani.ext.ratelimits import RatelimitedEngine

# limit requests to 10 req/min and 30k tokens/min
inner_engine = ...  # your engine here, e.g. `OpenAIEngine(..., model="gpt-4")`
engine = RatelimitedEngine(inner_engine, rpm_limit=10, tpm_limit=30_000)

The RatelimitedEngine takes the following parameters:

  • engine: The engine to wrap.
  • max_concurrency (int): The maximum number of concurrent requests to serve at once (default unlimited).
  • rpm_limit (float): The maximum number of requests to serve per rpm_period (default unlimited).
  • rpm_period (float): The duration, in seconds, of the time period in which to limit the rate. Note that up to rpm_limit requests are allowed within this time period in a burst (default 60s).
  • tpm_limit (float): The maximum number of tokens to send in requests per tpm_period (default unlimited).
  • tpm_period (float): The duration, in seconds, of the time period in which to limit the rate. Note that up to tpm_limit tokens are allowed within this time period in a burst (default 60s).

The ratelimiter will ensure that all conditions are met before forwarding the request to the wrapped engine.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

kani_ratelimits-1.0.1.tar.gz (5.1 kB view hashes)

Uploaded Source

Built Distribution

kani_ratelimits-1.0.1-py3-none-any.whl (4.8 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page