
Project description

OpenAI Load Balancer

Description

The OpenAI Load Balancer is a Python library designed to distribute API requests across multiple endpoints (supports both OpenAI and Azure). It implements a round-robin mechanism for load balancing and includes exponential backoff for retrying failed requests.

Supported OpenAI functions: ChatCompletion, Embedding, and Completion (soon to be deprecated).

Features

  • Round Robin Load Balancing: Distributes requests evenly across a set of API endpoints.
  • Exponential Backoff: Includes retry logic with exponential backoff for each API call.
  • Failure Detection: Temporarily removes failed endpoints based on configurable thresholds.
  • Flexible Configuration: Customizable settings for endpoints, failure thresholds, cooldown periods, and more.
  • Easy Integration: Designed to be easily integrated into projects that use OpenAI's API.
  • Fallback: If OpenAI's endpoint goes down but your Azure endpoint is still up, your service stays up, and vice versa.
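Conceptually, the round-robin rotation, failure counting, and cooldown described above can be sketched in a few lines. This is a simplified illustration, not the library's actual internals; names such as `EndpointState` and `pick_endpoint` are invented for the example:

```python
from datetime import datetime, timedelta

class EndpointState:
    """Tracks the health of a single endpoint (illustrative only)."""
    def __init__(self, name, failure_threshold=5, cooldown=timedelta(minutes=10)):
        self.name = name
        self.failures = 0
        self.failure_threshold = failure_threshold
        self.cooldown = cooldown
        self.inactive_until = None

    def is_active(self, now):
        # An endpoint comes back automatically once its cooldown expires.
        return self.inactive_until is None or now >= self.inactive_until

    def record_failure(self, now):
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.inactive_until = now + self.cooldown

    def record_success(self):
        self.failures = 0
        self.inactive_until = None

def pick_endpoint(endpoints, start_index, now):
    """Round-robin: try each endpoint in turn, skipping inactive ones."""
    n = len(endpoints)
    for offset in range(n):
        candidate = endpoints[(start_index + offset) % n]
        if candidate.is_active(now):
            return candidate
    raise RuntimeError("All endpoints are in cooldown")

now = datetime.now()
eps = [EndpointState("azure-1"), EndpointState("openai-1")]
for _ in range(5):          # five consecutive failures trip the threshold
    eps[0].record_failure(now)
chosen = pick_endpoint(eps, start_index=0, now=now)
print(chosen.name)          # azure-1 is cooling down, so openai-1 is chosen
```

The key design point is that an endpoint is never removed permanently: once its cooldown elapses, it silently rejoins the rotation.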

Installation

To install the OpenAI Load Balancer, run the following command:

pip install openai-load-balancer

Usage

First, set up your API endpoints. The API keys and base URLs are read from the environment variables named in the configuration.

# Example configuration

ENDPOINTS = [
    {
        "api_type": "azure",
        "base_url": "AZURE_API_BASE_URL_1",
        "api_key_env": "AZURE_API_KEY_1",
        "version": "2023-05-15"
    },
    {
        "api_type": "open_ai",
        "base_url": "https://api.openai.com/v1",
        "api_key_env": "OPENAI_API_KEY_1",
        "version": None
    }
    # Add more configurations as needed
]
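Note that `api_key_env` names an environment variable rather than containing the key itself, so secrets stay out of the configuration. A quick sketch (in-process assignment is only for local testing; in real deployments the variables come from your shell, a .env file, or a secret manager):

```python
import os

# For local testing you can set the keys in-process:
os.environ["AZURE_API_KEY_1"] = "<your-azure-key>"
os.environ["OPENAI_API_KEY_1"] = "<your-openai-key>"

# The load balancer resolves the key via the name in "api_key_env":
endpoint = {"api_type": "open_ai", "api_key_env": "OPENAI_API_KEY_1"}
api_key = os.environ[endpoint["api_key_env"]]
```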

If you are using both Azure and OpenAI, specify a mapping from OpenAI model names to your Azure engine (deployment) names:

MODEL_ENGINE_MAPPING = {
    "gpt-4": "gpt4",
    "gpt-3.5-turbo": "gpt-35-turbo",
    "text-embedding-ada-002": "text-embedding-ada-002"
    # Add more mappings as needed
}
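The mapping only matters when a request is routed to an Azure endpoint: OpenAI accepts the model name directly, while Azure addresses deployments by engine name. A hedged sketch of that lookup (`resolve_engine` is an invented helper, not part of the library's API):

```python
MODEL_ENGINE_MAPPING = {
    "gpt-4": "gpt4",
    "gpt-3.5-turbo": "gpt-35-turbo",
}

def resolve_engine(api_type, model, mapping):
    """Translate an OpenAI model name for the target endpoint type."""
    if api_type == "azure":
        # Azure addresses deployments by engine name; fall back to the
        # model name itself if no mapping is given.
        return mapping.get(model, model)
    return model

print(resolve_engine("azure", "gpt-3.5-turbo", MODEL_ENGINE_MAPPING))   # gpt-35-turbo
print(resolve_engine("open_ai", "gpt-3.5-turbo", MODEL_ENGINE_MAPPING)) # gpt-3.5-turbo
```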

Import and initialize the load balancer with the endpoints and mapping:

from openai_load_balancer import initialize_load_balancer

openai_load_balancer = initialize_load_balancer(
    endpoints=ENDPOINTS, model_engine_mapping=MODEL_ENGINE_MAPPING)

Making API Calls

Simply replace openai with openai_load_balancer in your function calls:

response = openai_load_balancer.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello! This is a request."}
    ],
    # Additional parameters...
)
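The reply text lives under choices[0].message.content, as in the classic openai-python (0.x) ChatCompletion payload. A quick sketch using a stubbed dict so it runs without any API call (the exact object type returned by the library may differ):

```python
# Stubbed response shaped like a 0.x-style ChatCompletion payload:
response = {
    "choices": [
        {"message": {"role": "assistant", "content": "Hello! How can I help?"}}
    ],
    "usage": {"total_tokens": 25},
}

# Extract the assistant's reply text:
reply = response["choices"][0]["message"]["content"]
print(reply)  # Hello! How can I help?
```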

Additional configurations

You can also configure the load balancer with the following variables:

from datetime import timedelta

# The number of consecutive failures of a request to an endpoint before the endpoint is temporarily marked as inactive
FAILURE_THRESHOLD = 5
# The minimum amount of time an endpoint is marked as inactive before it is reset to active.
COOLDOWN_PERIOD = timedelta(minutes=10)
# Whether or not to enable load balancing. If disabled, the first active endpoint is always used, and the other endpoints serve only as fallbacks when it fails.
LOAD_BALANCING_ENABLED = True
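The exponential backoff applied to each API call can be illustrated with a small retry helper (a sketch only; the library's actual retry parameters and delays may differ, and `call_with_backoff` is an invented name):

```python
import time

def call_with_backoff(fn, retries=5, base=1.0, factor=2.0, sleep=time.sleep):
    """Retry fn() with exponentially growing delays between attempts."""
    for attempt in range(retries):
        try:
            return fn()
        except Exception:
            if attempt == retries - 1:
                raise                       # out of retries: re-raise
            sleep(base * factor ** attempt)  # 1s, 2s, 4s, ...

# A flaky function that succeeds on its third call:
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"

recorded = []  # capture delays instead of actually sleeping
result = call_with_backoff(flaky, sleep=recorded.append)
print(result, recorded)  # ok [1.0, 2.0]
```

Passing a fake `sleep` keeps the example instant while still showing the doubling delay schedule.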

Contributing

Contributions to the OpenAI Load Balancer are welcome!

License

This project is licensed under the MIT License.
