
llm-fallback

Automatic failover between LLM providers. When OpenAI is down, seamlessly switch to Anthropic, Google, or any backup. Circuit breaker, retry logic, and latency-based routing built in.

The Pain

OpenAI goes down at 2 AM and your production chatbot returns 500s for 3 hours. Your users are furious. You could have fallen back to Claude or Gemini, but your code is hardwired to one provider.

Install

pip install llm-fallback

Quick Start

from llm_fallback import FallbackChain, Provider

chain = FallbackChain([
    Provider("openai", model="gpt-4o", api_key="sk-..."),
    Provider("anthropic", model="claude-3-5-sonnet-20241022", api_key="sk-ant-..."),
    Provider("openai", model="gpt-3.5-turbo", api_key="sk-..."),  # cheaper backup
])

# Tries providers in order until one succeeds
response = chain.chat("What is the capital of France?")
print(response.content)     # "The capital of France is Paris."
print(response.provider)    # "openai" (or whichever succeeded)
print(response.model)       # "gpt-4o"
print(response.latency_ms)  # 450
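The ordered strategy is conceptually a loop: call each provider in turn and return the first success. A minimal self-contained sketch of that idea (toy callables standing in for real providers; this is not the library's actual internals):

```python
def call_with_fallback(providers, prompt):
    """Try each provider in order; return the first successful response."""
    errors = []
    for provider in providers:
        try:
            return provider(prompt)
        except Exception as exc:  # any failure moves on to the next provider
            errors.append(exc)
    raise RuntimeError(f"All {len(providers)} providers failed: {errors}")

def flaky(prompt):
    raise TimeoutError("upstream down")  # simulates a provider outage

def backup(prompt):
    return f"answer to {prompt!r}"       # simulates a healthy provider

print(call_with_fallback([flaky, backup], "ping"))  # answer to 'ping'
```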

With Messages

response = chain.chat(
    messages=[
        {"role": "system", "content": "You are helpful."},
        {"role": "user", "content": "Explain quantum computing"},
    ],
    temperature=0.7,
    max_tokens=500,
)

Circuit Breaker

Automatically stops trying a provider that's failing:

chain = FallbackChain(
    providers=[...],
    circuit_breaker=True,       # Enable circuit breaker
    failure_threshold=3,        # Open after 3 consecutive failures
    recovery_timeout=60,        # Try again after 60 seconds
    timeout=30,                 # Per-request timeout
)
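Under the hood, a circuit breaker is a small state machine: closed (requests flow), open (provider skipped), and half-open (one probe allowed after the recovery timeout). A minimal sketch of the pattern, not the library's implementation:

```python
import time

class CircuitBreaker:
    """Opens after N consecutive failures; allows a probe after recovery_timeout."""

    def __init__(self, failure_threshold=3, recovery_timeout=60):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def allow(self):
        if self.opened_at is None:
            return True  # closed: requests flow normally
        if time.monotonic() - self.opened_at >= self.recovery_timeout:
            return True  # half-open: let one probe request through
        return False     # open: skip this provider entirely

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = time.monotonic()

cb = CircuitBreaker(failure_threshold=3, recovery_timeout=60)
for _ in range(3):
    cb.record_failure()
print(cb.allow())  # False: circuit is open, provider is skipped
```

The payoff is that a dead provider costs you nothing per request: the chain skips straight to the next backup instead of burning a 30-second timeout every time.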

Latency-Based Routing

Route to the fastest provider instead of fixed order:

chain = FallbackChain(
    providers=[...],
    strategy="latency",  # "ordered" (default) or "latency"
)
# Tracks response times and prefers the fastest healthy provider
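One common way to implement this kind of routing is an exponential moving average of observed latencies per provider, preferring the lowest. A hedged sketch of that approach (the library's exact smoothing is not documented here):

```python
class LatencyRouter:
    """Order providers by exponential moving average (EMA) of latency."""

    def __init__(self, providers, alpha=0.3):
        self.ema = {p: None for p in providers}  # None = not yet measured
        self.alpha = alpha                       # weight of the newest sample

    def record(self, provider, latency_ms):
        prev = self.ema[provider]
        self.ema[provider] = latency_ms if prev is None else (
            self.alpha * latency_ms + (1 - self.alpha) * prev)

    def order(self):
        # unmeasured providers sort first so every provider gets sampled
        return sorted(self.ema, key=lambda p: (self.ema[p] is not None, self.ema[p] or 0))

router = LatencyRouter(["openai", "anthropic"])
router.record("openai", 450)
router.record("anthropic", 280)
print(router.order())  # ['anthropic', 'openai']
```

The EMA keeps routing responsive to recent slowdowns without overreacting to a single slow response.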

Provider Configuration

Provider(
    name="openai",          # Provider identifier
    model="gpt-4o",         # Model name
    api_key="sk-...",       # API key
    base_url=None,          # Custom endpoint (for proxies/self-hosted)
    timeout=30,             # Request timeout in seconds
    weight=1.0,             # Priority weight for routing
)

Supported Providers

| Provider          | Name string              | Notes                                 |
|-------------------|--------------------------|---------------------------------------|
| OpenAI            | "openai"                 | GPT-4o, GPT-3.5, etc.                 |
| Anthropic         | "anthropic"              | Claude 3 / 3.5 / 4                    |
| Google            | "google"                 | Gemini (requires google-generativeai) |
| OpenAI-compatible | "openai" with base_url   | Ollama, vLLM, Together, Groq, etc.    |
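The base_url hook makes any OpenAI-compatible server a drop-in backup. For example, a local Ollama instance as a last-resort fallback (the model name here is illustrative; Ollama's OpenAI-compatible endpoint conventionally lives at /v1 and ignores the API key, though clients must pass one):

```python
from llm_fallback import Provider

Provider(
    "openai",                              # use the OpenAI-compatible client
    model="llama3.1",                      # illustrative local model name
    api_key="ollama",                      # dummy key; Ollama ignores it
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
)
```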

Callbacks

def on_failover(from_provider, to_provider, error):
    print(f"Failing over from {from_provider} to {to_provider}: {error}")
    slack_alert(f"LLM failover: {from_provider} -> {to_provider}")

chain = FallbackChain(
    providers=[...],
    on_failover=on_failover,
    on_success=lambda r: metrics.record(r.provider, r.latency_ms),
)

Features

  • Automatic failover — tries next provider on any error
  • Circuit breaker — stops hammering a dead provider
  • Latency routing — prefer the fastest healthy provider
  • Retry with backoff — configurable retry per provider
  • Timeout enforcement — per-request timeouts
  • Unified API — same interface regardless of provider
  • Callbacks — hook into failover events for alerting
  • Zero required deps — only needs the provider SDK you're already using
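Retry with backoff, mentioned above, is the standard pattern of waiting exponentially longer between attempts, with jitter so many clients don't retry in lockstep. A self-contained sketch of the pattern (not the library's internals):

```python
import random
import time

def retry_with_backoff(call, retries=3, base_delay=0.5, max_delay=8.0):
    """Retry a callable with exponential backoff plus jitter."""
    for attempt in range(retries + 1):
        try:
            return call()
        except Exception:
            if attempt == retries:
                raise  # out of retries; a chain would now fail over to the next provider
            delay = min(max_delay, base_delay * 2 ** attempt)
            time.sleep(delay * random.uniform(0.5, 1.0))  # jitter avoids synchronized retries

attempts = {"n": 0}

def flaky_call():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise TimeoutError("rate limited")  # fails twice, then recovers
    return "ok"

print(retry_with_backoff(flaky_call, base_delay=0.01))  # ok
```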

License

MIT
