
llm-fallback

Automatic failover between LLM providers. When OpenAI is down, seamlessly switch to Anthropic, Google, or any backup. Circuit breaker, retry logic, and latency-based routing built in.

The Pain

OpenAI goes down at 2 AM and your production chatbot returns 500s for 3 hours. Your users are furious. You could have fallen back to Claude or Gemini, but your code is hardwired to one provider.

Install

pip install llm-fallback

Quick Start

from llm_fallback import FallbackChain, Provider

chain = FallbackChain([
    Provider("openai", model="gpt-4o", api_key="sk-..."),
    Provider("anthropic", model="claude-3-5-sonnet-20241022", api_key="sk-ant-..."),
    Provider("openai", model="gpt-3.5-turbo", api_key="sk-..."),  # cheaper backup
])

# Tries providers in order until one succeeds
response = chain.chat("What is the capital of France?")
print(response.content)     # "The capital of France is Paris."
print(response.provider)    # "openai" (or whichever succeeded)
print(response.model)       # "gpt-4o"
print(response.latency_ms)  # 450
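The ordered strategy is conceptually a loop: call each provider in turn and return the first success. A minimal self-contained sketch of that idea (toy callables standing in for real providers; this is not the library's actual internals):

```python
def call_with_fallback(providers, prompt):
    """Try each provider in order; return the first successful response."""
    errors = []
    for provider in providers:
        try:
            return provider(prompt)
        except Exception as exc:  # any failure moves on to the next provider
            errors.append(exc)
    raise RuntimeError(f"All {len(providers)} providers failed: {errors}")

def flaky(prompt):
    raise TimeoutError("upstream down")  # simulates a provider outage

def backup(prompt):
    return f"answer to {prompt!r}"       # simulates a healthy provider

print(call_with_fallback([flaky, backup], "ping"))  # answer to 'ping'
```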

With Messages

response = chain.chat(
    messages=[
        {"role": "system", "content": "You are helpful."},
        {"role": "user", "content": "Explain quantum computing"},
    ],
    temperature=0.7,
    max_tokens=500,
)

Circuit Breaker

Automatically stops trying a provider that's failing:

chain = FallbackChain(
    providers=[...],
    circuit_breaker=True,       # Enable circuit breaker
    failure_threshold=3,        # Open after 3 consecutive failures
    recovery_timeout=60,        # Try again after 60 seconds
    timeout=30,                 # Per-request timeout
)
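Under the hood, a circuit breaker is a small state machine: closed (requests flow), open (provider skipped), and half-open (one probe allowed after the recovery timeout). A minimal sketch of the pattern, not the library's implementation:

```python
import time

class CircuitBreaker:
    """Opens after N consecutive failures; allows a probe after recovery_timeout."""

    def __init__(self, failure_threshold=3, recovery_timeout=60):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def allow(self):
        if self.opened_at is None:
            return True  # closed: requests flow normally
        if time.monotonic() - self.opened_at >= self.recovery_timeout:
            return True  # half-open: let one probe request through
        return False     # open: skip this provider entirely

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = time.monotonic()

cb = CircuitBreaker(failure_threshold=3, recovery_timeout=60)
for _ in range(3):
    cb.record_failure()
print(cb.allow())  # False: circuit is open, provider is skipped
```

The payoff is that a dead provider costs you nothing per request: the chain skips straight to the next backup instead of burning a 30-second timeout every time.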

Latency-Based Routing

Route to the fastest provider instead of fixed order:

chain = FallbackChain(
    providers=[...],
    strategy="latency",  # "ordered" (default) or "latency"
)
# Tracks response times and prefers the fastest healthy provider
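One common way to implement this kind of routing is an exponential moving average of observed latencies per provider, preferring the lowest. A hedged sketch of that approach (the library's exact smoothing is not documented here):

```python
class LatencyRouter:
    """Order providers by exponential moving average (EMA) of latency."""

    def __init__(self, providers, alpha=0.3):
        self.ema = {p: None for p in providers}  # None = not yet measured
        self.alpha = alpha                       # weight of the newest sample

    def record(self, provider, latency_ms):
        prev = self.ema[provider]
        self.ema[provider] = latency_ms if prev is None else (
            self.alpha * latency_ms + (1 - self.alpha) * prev)

    def order(self):
        # unmeasured providers sort first so every provider gets sampled
        return sorted(self.ema, key=lambda p: (self.ema[p] is not None, self.ema[p] or 0))

router = LatencyRouter(["openai", "anthropic"])
router.record("openai", 450)
router.record("anthropic", 280)
print(router.order())  # ['anthropic', 'openai']
```

The EMA keeps routing responsive to recent slowdowns without overreacting to a single slow response.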

Provider Configuration

Provider(
    name="openai",          # Provider identifier
    model="gpt-4o",         # Model name
    api_key="sk-...",       # API key
    base_url=None,          # Custom endpoint (for proxies/self-hosted)
    timeout=30,             # Request timeout in seconds
    weight=1.0,             # Priority weight for routing
)

Supported Providers

| Provider          | Name string              | Notes                                 |
|-------------------|--------------------------|---------------------------------------|
| OpenAI            | "openai"                 | GPT-4o, GPT-3.5, etc.                 |
| Anthropic         | "anthropic"              | Claude 3 / 3.5 / 4                    |
| Google            | "google"                 | Gemini (requires google-generativeai) |
| OpenAI-compatible | "openai" with base_url   | Ollama, vLLM, Together, Groq, etc.    |
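The base_url hook makes any OpenAI-compatible server a drop-in backup. For example, a local Ollama instance as a last-resort fallback (the model name here is illustrative; Ollama's OpenAI-compatible endpoint conventionally lives at /v1 and ignores the API key, though clients must pass one):

```python
from llm_fallback import Provider

Provider(
    "openai",                              # use the OpenAI-compatible client
    model="llama3.1",                      # illustrative local model name
    api_key="ollama",                      # dummy key; Ollama ignores it
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
)
```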

Callbacks

def on_failover(from_provider, to_provider, error):
    print(f"Failing over from {from_provider} to {to_provider}: {error}")
    slack_alert(f"LLM failover: {from_provider} -> {to_provider}")

chain = FallbackChain(
    providers=[...],
    on_failover=on_failover,
    on_success=lambda r: metrics.record(r.provider, r.latency_ms),
)

Features

  • Automatic failover — tries next provider on any error
  • Circuit breaker — stops hammering a dead provider
  • Latency routing — prefer the fastest healthy provider
  • Retry with backoff — configurable retry per provider
  • Timeout enforcement — per-request timeouts
  • Unified API — same interface regardless of provider
  • Callbacks — hook into failover events for alerting
  • Zero required deps — only needs the provider SDK you're already using
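Retry with backoff, mentioned above, is the standard pattern of waiting exponentially longer between attempts, with jitter so many clients don't retry in lockstep. A self-contained sketch of the pattern (not the library's internals):

```python
import random
import time

def retry_with_backoff(call, retries=3, base_delay=0.5, max_delay=8.0):
    """Retry a callable with exponential backoff plus jitter."""
    for attempt in range(retries + 1):
        try:
            return call()
        except Exception:
            if attempt == retries:
                raise  # out of retries; a chain would now fail over to the next provider
            delay = min(max_delay, base_delay * 2 ** attempt)
            time.sleep(delay * random.uniform(0.5, 1.0))  # jitter avoids synchronized retries

attempts = {"n": 0}

def flaky_call():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise TimeoutError("rate limited")  # fails twice, then recovers
    return "ok"

print(retry_with_backoff(flaky_call, base_delay=0.01))  # ok
```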

License

MIT
