Rotates models to avoid hitting rate limits.

Model Rotator

Model Rotator is a Python library for managing multiple LLM (Large Language Model) instances with rate limits and priorities. It schedules each request to a model based on rate limits, current usage, and priority level, so that higher-priority models are used first and requests fall back to other models as limits are reached.

Features

  • Rate-Limit Management: Automatically tracks and enforces rate limits per model.
  • Priority-Based Scheduling: Prioritizes high-priority models over medium and low-priority ones.
  • Dynamic Updates: Tracks model usage in real-time and prunes stale usage data.
  • Stateful: Maintains state for each model's usage across calls.
  • Customizable: Easily configure models with different rate limits and priorities.
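
To make the rate-limit and priority features concrete, here is a minimal sketch of one way such a scheduler could work. It is not the library's actual implementation: the class name SlidingWindowRotator, the 60-second sliding window, the injectable clock, the least-used choice within a priority tier, and the placeholder model names are all assumptions made for illustration.

```python
import time
from collections import deque

# Lower rank means higher priority.
PRIORITY_RANK = {"high": 0, "medium": 1, "low": 2}


class SlidingWindowRotator:
    """Hypothetical illustration only; not the model_rotator implementation."""

    def __init__(self, models, window=60.0, clock=time.monotonic):
        self.models = list(models)  # list order breaks ties within a tier
        self.window = window        # assumed window length, in seconds
        self.clock = clock          # injectable clock, handy for testing
        self.calls = {m["name"]: deque() for m in self.models}

    def _usage(self, model, now):
        """Drop timestamps older than the window, then count what is left."""
        calls = self.calls[model["name"]]
        while calls and now - calls[0] >= self.window:
            calls.popleft()
        return len(calls)

    def get_next_model(self):
        now = self.clock()
        for rank in sorted(set(PRIORITY_RANK.values())):
            usage = [
                (self._usage(m, now), m)
                for m in self.models
                if PRIORITY_RANK[m["priority"]] == rank
            ]
            available = [(used, m) for used, m in usage if used < m["limit"]]
            if available:
                # min() returns the first minimum, so earlier models win ties.
                _, chosen = min(available, key=lambda pair: pair[0])
                self.calls[chosen["name"]].append(now)
                return chosen["name"]
        return None  # every model is at its limit


rotator = SlidingWindowRotator([
    {"name": "model-a", "priority": "high", "limit": 2},
    {"name": "model-b", "priority": "high", "limit": 1},
    {"name": "model-c", "priority": "low", "limit": 1},
])
print([rotator.get_next_model() for _ in range(5)])
```

In this sketch the two high-priority models are used until both reach their limits, then the low-priority model is used, and the fifth call returns None. The real library may order or rotate models differently.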

Installation

Install the package from PyPI:

pip install model-rotator

Usage

  1. Define Your Models. Provide a list of model configurations:

Note: When several models share the same priority, the ones listed first are preferred.

from model_rotator import ModelRotator

models = [
    {"name": "groq/llama-3.1-70b-versatile", "priority": "high", "limit": 30},
    {"name": "groq/llama-3.1-70b-specdec", "priority": "high", "limit": 30},
    {"name": "groq/llama-3.1-8b-instant", "priority": "medium", "limit": 30},
    {"name": "groq/llama-3.2-1b-preview", "priority": "low", "limit": 30},
    {"name": "gemini/gemini-1.5-flash", "priority": "medium", "limit": 30},
    {"name": "gemini/gemini-1.5-pro", "priority": "high", "limit": 15},
    {"name": "gemini/gemini-exp-1114", "priority": "high", "limit": 2},
]
  2. Initialize the Scheduler
rotator = ModelRotator(models)
  3. Schedule Requests. Use get_next_model() to get the next available model for processing:
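for i in range(1, 51):  # Simulate 50 requests
    model = rotator.get_next_model()
    if model:
        # A model with remaining capacity was returned
        print(f"Request {i}: Using model: {model}")
    else:
        # get_next_model() returns None when every model is at its limit
        print(f"Request {i}: All models exhausted, retry later.")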
  4. Check Model States. Inspect the current state of all models:
print(rotator.get_state())

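Example Output

The output below is illustrative and abbreviated. The limits configured above add up to 167 requests per minute, so a 50-request run should not exhaust every model; the exhaustion message and the usage counts are shown only to demonstrate the format.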

Request 1: Using model: groq/llama-3.1-70b-versatile
Request 2: Using model: groq/llama-3.1-70b-specdec
...
Request 50: All models exhausted, retry later.

Model States:
[
    {"name": "groq/llama-3.1-70b-versatile", "priority": "high", "limit": 30, "current_usage": 30},
    {"name": "groq/llama-3.1-70b-specdec", "priority": "high", "limit": 30, "current_usage": 30},
    ...
]

API

ModelRotator(models:Model) Initializes the scheduler.

  • models: A list of dictionaries. Each dictionary must include:
    • name (str): The model name.
    • priority (str): Priority level (high, medium, low).
    • limit (int): Maximum allowed requests per minute.
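
Since each dictionary must carry these three fields, it can help to check a configuration before handing it to ModelRotator. The helper below is a sketch, not part of the library: the name validate_models is hypothetical, and the rule that limit must be a positive integer is an assumption, since the README does not say how the library treats malformed input.

```python
REQUIRED_KEYS = {"name", "priority", "limit"}
VALID_PRIORITIES = {"high", "medium", "low"}


def validate_models(models):
    """Hypothetical pre-check; not part of model_rotator."""
    for index, model in enumerate(models):
        missing = REQUIRED_KEYS - set(model)
        if missing:
            raise ValueError(f"model #{index} is missing keys: {sorted(missing)}")
        if model["priority"] not in VALID_PRIORITIES:
            raise ValueError(f"model #{index} has unknown priority {model['priority']!r}")
        limit = model["limit"]
        # bool is a subclass of int, so reject it explicitly.
        # Requiring a positive limit is an assumption of this sketch.
        if isinstance(limit, bool) or not isinstance(limit, int) or limit <= 0:
            raise ValueError(f"model #{index} needs a positive integer limit")
    return models


models = validate_models([
    {"name": "groq/llama-3.1-8b-instant", "priority": "medium", "limit": 30},
    {"name": "gemini/gemini-1.5-pro", "priority": "high", "limit": 15},
])
```

The validated list can then be passed to ModelRotator(models) as in the Usage section.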

get_next_model() Returns the name of the next available model based on priority and rate limits.

  • Returns:
    • str: The model name, or
    • None if no models are available.
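
Because get_next_model() returns None when nothing is available, callers usually need some form of waiting. The sketch below polls until a model frees up or a timeout passes. The helper name wait_for_model, its parameters, and the stand-in class used to demonstrate it are hypothetical; the only library call it relies on is get_next_model().

```python
import time


def wait_for_model(rotator, timeout=60.0, poll_interval=1.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Hypothetical helper: poll get_next_model() until it returns a name."""
    deadline = clock() + timeout
    while True:
        model = rotator.get_next_model()
        if model is not None:
            return model
        if clock() >= deadline:
            return None  # still exhausted after the timeout
        sleep(poll_interval)


class FakeRotator:
    """Stand-in for demonstration: returns canned answers, then None."""

    def __init__(self, answers):
        self.answers = list(answers)

    def get_next_model(self):
        return self.answers.pop(0) if self.answers else None


# Two exhausted polls, then a model becomes available.
demo = FakeRotator([None, None, "groq/llama-3.1-8b-instant"])
print(wait_for_model(demo, timeout=5.0, poll_interval=0.01))
```

With a real ModelRotator, a poll interval of a second or more is probably more sensible, since the limits are expressed per minute.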

get_state() Returns the current state of all models, including their usage.

  • Returns:
    • list: A list of dictionaries with the following fields:
      • name (str): Model name.
      • priority (str): Priority level.
      • limit (int): Rate limit.
      • current_usage (int): Current number of requests within the last minute.
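
The fields returned by get_state() are enough to work out how much headroom each model has left. The snippet below is a sketch that operates on a list shaped like the documented return value. The function name remaining_capacity is hypothetical, and the sample state is made-up data rather than real output.

```python
def remaining_capacity(state):
    """Hypothetical helper: requests left in the current window, per model."""
    return {
        entry["name"]: max(entry["limit"] - entry["current_usage"], 0)
        for entry in state
    }


# Made-up sample in the documented shape; in practice use rotator.get_state().
sample_state = [
    {"name": "groq/llama-3.1-70b-versatile", "priority": "high", "limit": 30, "current_usage": 30},
    {"name": "gemini/gemini-1.5-pro", "priority": "high", "limit": 15, "current_usage": 4},
]

for name, left in remaining_capacity(sample_state).items():
    print(f"{name}: {left} requests left")
```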

License

This project is licensed under the MIT License. See the LICENSE file for details.

Contributing

Contributions are welcome! Feel free to open issues or submit pull requests.
