Primary/secondary failover wrapper for LangChain chat models, with tool-calling preserved across failover.
langchain-failover
A tiny, dependency-light primary/secondary failover wrapper for LangChain chat models. Point it at two chat models; it serves from the primary, transparently falls back to the secondary on connection errors, and switches back the moment the primary recovers — and tool-calling keeps working across the failover.
from langchain_openai import ChatOpenAI
from langchain_failover import FailoverChatModel
primary = ChatOpenAI(base_url="http://gpu-box:8001/v1", api_key="x", model="local")
backup = ChatOpenAI(base_url="http://cpu-box:8002/v1", api_key="x", model="local")
llm = FailoverChatModel(primary=primary, secondary=backup)
llm.invoke("Summarise this incident…") # served by primary
# …primary host dies…
llm.invoke("And the next one?") # transparently served by backup
# …primary comes back…
llm.invoke("One more") # back on primary, logged as recovered
Install
pip install langchain-failover # core
pip install "langchain-failover[openai]" # + langchain-openai for create_failover_llm
Why not RunnableWithFallbacks / .with_fallbacks()?
LangChain ships per-invocation fallbacks, and they're great for what they do. This package exists for the cases they don't cover well:
- Stateful recovery. FailoverChatModel remembers which leg it's on and logs the transition both ways (the active property tells you). .with_fallbacks() is stateless: every call re-tries the (possibly still-dead) primary first.
- Tool-calling survives failover. bind_tools is overridden to bind on both legs and return another FailoverChatModel. With strict langchain-core (>=1.4, where BaseChatModel.bind_tools raises by default) naïve wrappers break at bind time; agents using this one keep working.
- Connection-aware, not blanket. It only fails over on connection/network errors (walking the exception's __cause__/__context__ chain, so a socket error wrapped three layers deep still counts). A ValueError from a bad prompt propagates instead of being silently retried on a second endpoint.
- Mid-stream safety. During stream(), it only fails over if the primary dies before the first token, so you never get duplicated, half-streamed output.
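The connection-aware and mid-stream rules above can be sketched in plain Python. This is an illustration of the idea, not the package's code: is_connection_error, stream_with_failover and the CONNECTION_ERRORS tuple are hypothetical names, and the real set of exception types treated as "connection errors" may well be broader.

```python
from typing import Callable, Iterable, Iterator

# Assumption: these built-in types stand in for "connection/network errors";
# the package's real list may include HTTP-client exception types as well.
CONNECTION_ERRORS = (ConnectionError, TimeoutError)


def is_connection_error(exc: BaseException) -> bool:
    """Return True if exc, or anything in its cause/context chain, is a connection error."""
    seen = set()
    current = exc
    while current is not None and id(current) not in seen:
        if isinstance(current, CONNECTION_ERRORS):
            return True
        seen.add(id(current))  # guard against cyclic chains
        # Prefer the explicit cause (raise ... from ...), else the implicit context.
        current = current.__cause__ or current.__context__
    return False


def stream_with_failover(
    primary: Callable[[str], Iterable[str]],
    secondary: Callable[[str], Iterable[str]],
    prompt: str,
) -> Iterator[str]:
    """Yield chunks from primary; switch to secondary only if nothing was yielded yet."""
    emitted = False
    try:
        for chunk in primary(prompt):
            emitted = True
            yield chunk
        return
    except Exception as exc:
        # Non-connection errors, and any error after the first chunk, propagate.
        if emitted or not is_connection_error(exc):
            raise
    yield from secondary(prompt)
```

With this shape, a primary that fails on connect is replaced by the secondary without the caller noticing, while a primary that drops after its first chunk raises to the caller instead of restarting the answer from the top.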
Local-model convenience
If you run local OpenAI-compatible servers (vLLM, mlx-lm, Ollama, LM Studio) and
don't want to hardcode model names, create_failover_llm auto-discovers the served
model id from each endpoint's /models:
from langchain_failover import create_failover_llm

llm = create_failover_llm(
    primary_url="http://localhost:8001/v1",
    secondary_url="http://localhost:8002/v1",
)
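For a feel of what discovery against /models involves, here is a minimal standard-library sketch. It assumes the endpoint returns the usual OpenAI-style list body ({"object": "list", "data": [{"id": ...}, ...]}) and needs no auth header; first_model_id and discover_model_id are hypothetical names, and create_failover_llm may pick the model differently (for example when a server lists several).

```python
import json
from urllib.request import urlopen


def first_model_id(models_payload: dict) -> str:
    """Pick the first model id from an OpenAI-style /models response body."""
    entries = models_payload.get("data") or []
    if not entries:
        raise ValueError("endpoint reported no served models")
    return entries[0]["id"]


def discover_model_id(base_url: str, timeout: float = 5.0) -> str:
    """GET {base_url}/models and return the first served model id."""
    # Assumption: no Authorization header is required by the local server.
    with urlopen(f"{base_url.rstrip('/')}/models", timeout=timeout) as resp:
        return first_model_id(json.load(resp))
```

Calling discover_model_id("http://localhost:8001/v1") would hit http://localhost:8001/v1/models and return whatever id that server reports first.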
Bonus helper
extract_token_metrics(response.response_metadata) normalises token counts and
timings across OpenAI-compatible and Ollama metadata shapes into a single
{input_tokens, output_tokens, prompt_time, generation_time} dict.
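The two metadata shapes differ roughly as follows: langchain-openai puts counts under token_usage (prompt_tokens, completion_tokens), while Ollama reports prompt_eval_count, eval_count and nanosecond durations at the top level. The sketch below shows one way such a normaliser could look. The function name normalise_token_metrics is hypothetical, and both the conversion to seconds and the use of None for missing timings are assumptions; extract_token_metrics itself may use different units or defaults.

```python
from typing import Any, Optional

NANOSECONDS_PER_SECOND = 1_000_000_000


def _ns_to_seconds(value: Optional[int]) -> Optional[float]:
    """Convert a nanosecond duration to seconds, passing None through."""
    return None if value is None else value / NANOSECONDS_PER_SECOND


def normalise_token_metrics(metadata: dict[str, Any]) -> dict[str, Any]:
    """Map OpenAI-compatible or Ollama response metadata onto one dict shape."""
    usage = metadata.get("token_usage")
    if usage:
        # OpenAI-compatible shape: counts only, no timing information.
        return {
            "input_tokens": usage.get("prompt_tokens"),
            "output_tokens": usage.get("completion_tokens"),
            "prompt_time": None,
            "generation_time": None,
        }
    # Ollama shape: counts and nanosecond durations at the top level.
    return {
        "input_tokens": metadata.get("prompt_eval_count"),
        "output_tokens": metadata.get("eval_count"),
        "prompt_time": _ns_to_seconds(metadata.get("prompt_eval_duration")),
        "generation_time": _ns_to_seconds(metadata.get("eval_duration")),
    }
```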
License
MIT
File details
Details for the file langchain_failover-0.1.0.tar.gz.
File metadata
- Download URL: langchain_failover-0.1.0.tar.gz
- Upload date:
- Size: 7.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 08bdee324d334eda3f301f06e123d4459bdbb1d3d5b8d4b845b8863172e474e1 |
| MD5 | a6a5185f3af437bc55b38582a3e24582 |
| BLAKE2b-256 | e35e497e4b0bc98c89cfbbe760601742702ef69e2a2d8a9e8c9066dac334c936 |
File details
Details for the file langchain_failover-0.1.0-py3-none-any.whl.
File metadata
- Download URL: langchain_failover-0.1.0-py3-none-any.whl
- Upload date:
- Size: 8.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a62c0ee4d3e5ee9cdc07c3fa045c028faee1ed892efe9fc817c1dc110ece4271 |
| MD5 | 97638229449a7935487d80afeae407be |
| BLAKE2b-256 | 4e3860b535dec23f87545ce5a58c2e9802e130362a5c376720bf1221b5c35513 |