Automatic failover between LLM providers. When OpenAI is down, seamlessly switch to Anthropic, Google, or any backup.
Project description
llm-fallback
Automatic failover between LLM providers. When OpenAI is down, seamlessly switch to Anthropic, Google, or any backup. Circuit breaker, retry logic, and latency-based routing built in.
The Pain
OpenAI goes down at 2 AM and your production chatbot returns 500s for 3 hours. Your users are furious. You could have fallen back to Claude or Gemini, but your code is hardwired to one provider.
Install
```
pip install llm-fallback
```
Quick Start
```python
from llm_fallback import FallbackChain, Provider

chain = FallbackChain([
    Provider("openai", model="gpt-4o", api_key="sk-..."),
    Provider("anthropic", model="claude-3-5-sonnet-20241022", api_key="sk-ant-..."),
    Provider("openai", model="gpt-3.5-turbo", api_key="sk-..."),  # cheaper backup
])

# Tries providers in order until one succeeds
response = chain.chat("What is the capital of France?")
print(response.content)     # "The capital of France is Paris."
print(response.provider)    # "openai" (or whichever succeeded)
print(response.model)       # "gpt-4o"
print(response.latency_ms)  # 450
```
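At its core, a fallback chain is ordered trial-and-error: call each provider until one answers. A minimal sketch of the idea in plain Python — illustrative only, with a hypothetical `call_provider` stand-in, not the package's internals:

```python
def first_success(providers, prompt, call_provider):
    """Try each provider in order; return the first successful response."""
    errors = []
    for provider in providers:
        try:
            return provider, call_provider(provider, prompt)
        except Exception as exc:  # in practice: timeouts, rate limits, 5xx
            errors.append((provider, exc))
    raise RuntimeError(f"All providers failed: {errors}")

# Usage: a fake caller where only the second provider is up
def fake_call(provider, prompt):
    if provider == "openai":
        raise TimeoutError("provider down")
    return f"{provider} says: Paris"

winner, answer = first_success(["openai", "anthropic"], "Capital of France?", fake_call)
print(winner, "->", answer)  # anthropic -> anthropic says: Paris
```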
With Messages
```python
response = chain.chat(
    messages=[
        {"role": "system", "content": "You are helpful."},
        {"role": "user", "content": "Explain quantum computing"},
    ],
    temperature=0.7,
    max_tokens=500,
)
```
Circuit Breaker
Automatically stops trying a provider that's failing:
```python
chain = FallbackChain(
    providers=[...],
    circuit_breaker=True,  # Enable circuit breaker
    failure_threshold=3,   # Open after 3 consecutive failures
    recovery_timeout=60,   # Try again after 60 seconds
    timeout=30,            # Per-request timeout
)
```
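The circuit-breaker behavior configured above can be sketched as a small state machine in plain Python. This is an illustrative sketch of the general pattern, not the package's actual implementation (names like `allow` and `record_failure` are made up here):

```python
import time

class CircuitBreaker:
    """Open after N consecutive failures; allow a retry probe after a cooldown."""

    def __init__(self, failure_threshold=3, recovery_timeout=60):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def allow(self, now=None):
        if self.opened_at is None:
            return True  # closed: requests flow normally
        now = time.monotonic() if now is None else now
        # Half-open: after the cooldown, let one probe request through
        return now - self.opened_at >= self.recovery_timeout

    def record_success(self):
        self.failures = 0
        self.opened_at = None  # close the circuit again

    def record_failure(self, now=None):
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = time.monotonic() if now is None else now

breaker = CircuitBreaker(failure_threshold=3, recovery_timeout=60)
for _ in range(3):
    breaker.record_failure(now=0.0)
print(breaker.allow(now=30.0))  # False: open, still cooling down
print(breaker.allow(now=61.0))  # True: half-open, one probe allowed
```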
Latency-Based Routing
Route to the fastest provider instead of fixed order:
```python
chain = FallbackChain(
    providers=[...],
    strategy="latency",  # "ordered" (default) or "latency"
)
# Tracks response times and prefers the fastest healthy provider
```
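Latency routing of this kind typically keeps a smoothed average of each provider's recent response times and sorts by it. A hypothetical sketch using an exponentially weighted moving average (EWMA) — not necessarily the library's actual algorithm:

```python
class LatencyRouter:
    """Prefer providers with the lowest smoothed observed latency."""

    def __init__(self, providers, alpha=0.3):
        self.alpha = alpha  # weight given to the newest sample
        self.ewma = {p: None for p in providers}

    def record(self, provider, latency_ms):
        prev = self.ewma[provider]
        self.ewma[provider] = (
            latency_ms if prev is None
            else self.alpha * latency_ms + (1 - self.alpha) * prev
        )

    def order(self):
        # Providers with no measurements yet sort last (+inf)
        return sorted(
            self.ewma,
            key=lambda p: self.ewma[p] if self.ewma[p] is not None else float("inf"),
        )

router = LatencyRouter(["openai", "anthropic", "google"])
router.record("openai", 450)
router.record("anthropic", 280)
print(router.order())  # ['anthropic', 'openai', 'google']
```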
Provider Configuration
```python
Provider(
    name="openai",     # Provider identifier
    model="gpt-4o",    # Model name
    api_key="sk-...",  # API key
    base_url=None,     # Custom endpoint (for proxies/self-hosted)
    timeout=30,        # Request timeout in seconds
    weight=1.0,        # Priority weight for routing
)
```
Supported Providers
| Provider | Name string | Notes |
|---|---|---|
| OpenAI | `"openai"` | GPT-4o, GPT-3.5, etc. |
| Anthropic | `"anthropic"` | Claude 3/3.5/4 |
| Google | `"google"` | Gemini (requires `google-generativeai`) |
| OpenAI-compatible | `"openai"` with `base_url` | Ollama, vLLM, Together, Groq, etc. |
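Any OpenAI-compatible endpoint can be targeted by setting `base_url`, as the table above notes. A configuration sketch for falling back to a local Ollama server — the URL and model name here are examples, not defaults shipped by the package:

```python
from llm_fallback import FallbackChain, Provider

chain = FallbackChain([
    Provider("openai", model="gpt-4o", api_key="sk-..."),
    # Hypothetical local fallback via Ollama's OpenAI-compatible endpoint
    Provider(
        "openai",
        model="llama3.1",
        api_key="ollama",  # Ollama accepts any non-empty key
        base_url="http://localhost:11434/v1",
    ),
])
```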
Callbacks
```python
def on_failover(from_provider, to_provider, error):
    print(f"Failing over from {from_provider} to {to_provider}: {error}")
    slack_alert(f"LLM failover: {from_provider} -> {to_provider}")

chain = FallbackChain(
    providers=[...],
    on_failover=on_failover,
    on_success=lambda r: metrics.record(r.provider, r.latency_ms),
)
```
Features
- Automatic failover — tries next provider on any error
- Circuit breaker — stops hammering a dead provider
- Latency routing — prefer the fastest healthy provider
- Retry with backoff — configurable retry per provider
- Timeout enforcement — per-request timeouts
- Unified API — same interface regardless of provider
- Callbacks — hook into failover events for alerting
- Zero required deps — only needs the provider SDK you're already using
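The retry-with-backoff feature listed above follows a standard pattern: retry a failing call a few times with exponentially growing delays before giving up and failing over. An illustrative sketch of that pattern, not the package's code (`attempts` and `base_delay` are made-up names):

```python
import time

def with_backoff(fn, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call fn(), retrying with exponential backoff: 0.5s, 1s, 2s, ..."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: let the chain fail over
            sleep(base_delay * (2 ** attempt))

# Usage: a flaky call that succeeds on the third try
calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("transient")
    return "ok"

delays = []
result = with_backoff(flaky, sleep=delays.append)
print(result)  # ok
print(delays)  # [0.5, 1.0]
```

Injecting `sleep` makes the pattern testable without real waiting, which is also why retry delays are usually configurable per provider.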
License
MIT
Project details
Download files
Source Distribution
Built Distribution
File details
Details for the file llm_fallback-0.1.0.tar.gz.
File metadata
- Download URL: llm_fallback-0.1.0.tar.gz
- Upload date:
- Size: 6.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `d36cbea55bfe1529ba8986a47b1dc232ad9466f165574831de8f6c6a4c241e8f` |
| MD5 | `c044a6eb1060ab54ad9cd458a8e7734d` |
| BLAKE2b-256 | `317be72293d3f6e62dd327fe2b36592d5d003ff1a0153323c78a1b76b0a18326` |
File details
Details for the file llm_fallback-0.1.0-py3-none-any.whl.
File metadata
- Download URL: llm_fallback-0.1.0-py3-none-any.whl
- Upload date:
- Size: 7.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `b414e6e301b6b384fe4766d8e7ea6ae677de9178de3e7425a175401de9477824` |
| MD5 | `cad59ce857e8f0cc96a526fb7dbd160e` |
| BLAKE2b-256 | `e7885a2fa84d2930d5282a662ebedae8077cd78fd3235fc96a64cfa300279d3e` |