
Primary/secondary failover wrapper for LangChain chat models, with tool-calling preserved across failover.


langchain-failover

A tiny, dependency-light primary/secondary failover wrapper for LangChain chat models. Point it at two chat models: it serves from the primary, transparently falls back to the secondary on connection errors, and switches back the moment the primary recovers. Tool-calling keeps working across the failover.

from langchain_openai import ChatOpenAI
from langchain_failover import FailoverChatModel

primary = ChatOpenAI(base_url="http://gpu-box:8001/v1", api_key="x", model="local")
backup  = ChatOpenAI(base_url="http://cpu-box:8002/v1", api_key="x", model="local")

llm = FailoverChatModel(primary=primary, secondary=backup)

llm.invoke("Summarise this incident…")   # served by primary
# …primary host dies…
llm.invoke("And the next one?")           # transparently served by backup
# …primary comes back…
llm.invoke("One more")                     # back on primary, logged as recovered

Install

pip install langchain-failover            # core
pip install "langchain-failover[openai]"  # + langchain-openai for create_failover_llm

Why not RunnableWithFallbacks / .with_fallbacks()?

LangChain ships per-invocation fallbacks, and they're great for what they do. This package exists for the cases they don't cover well:

  • Stateful recovery. FailoverChatModel remembers which leg it's on and logs the transition in both directions (the active property reports the current leg). .with_fallbacks() is stateless: every call retries the primary first, even if it is still dead.
  • Tool-calling survives failover. bind_tools is overridden to bind on both legs and return another FailoverChatModel. With strict langchain-core (>=1.4, where BaseChatModel.bind_tools raises by default) naïve wrappers break at bind time; agents using this one keep working.
  • Connection-aware, not blanket. It only fails over on connection/network errors (walking the exception's __cause__/__context__ chain, so a socket error wrapped three layers deep still counts). A ValueError from a bad prompt propagates instead of being silently retried on a second endpoint.
  • Mid-stream safety. During stream(), it only fails over if the primary dies before the first token, so you never get duplicated, half-streamed output.
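
The connection-aware and mid-stream rules above can be illustrated with a small standalone sketch. This is not the package's implementation: is_connection_error, stream_with_failover and the CONNECTION_ERRORS tuple are hypothetical names, and which exception types count as connection errors is an assumption made for the example.

```python
from typing import Callable, Iterator

# Assumption: the exception types treated as "connection errors" here are a
# guess for illustration; the package's real list may differ.
CONNECTION_ERRORS = (ConnectionError, TimeoutError)


def is_connection_error(exc: BaseException) -> bool:
    """Return True if exc, or anything in its cause/context chain, is a connection error."""
    seen = set()
    current = exc
    while current is not None and id(current) not in seen:
        if isinstance(current, CONNECTION_ERRORS):
            return True
        seen.add(id(current))  # guard against cyclic chains
        # `raise X from Y` sets __cause__; implicit chaining sets __context__.
        current = current.__cause__ or current.__context__
    return False


def stream_with_failover(
    primary: Callable[[], Iterator[str]],
    secondary: Callable[[], Iterator[str]],
) -> Iterator[str]:
    """Yield chunks from primary; switch to secondary only if nothing was yielded yet."""
    emitted = False
    try:
        for chunk in primary():
            emitted = True
            yield chunk
        return
    except Exception as exc:
        # Once a chunk has been emitted, restarting on the secondary would
        # duplicate output, so the error is re-raised instead.
        if emitted or not is_connection_error(exc):
            raise
    yield from secondary()
```

In this sketch a ValueError with no connection error in its chain propagates unchanged, while a wrapped ConnectionRefusedError raised before the first chunk triggers the switch to the secondary.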

Local-model convenience

If you run local OpenAI-compatible servers (vLLM, mlx-lm, Ollama, LM Studio) and don't want to hardcode model names, create_failover_llm auto-discovers the served model id from each endpoint's /models:

from langchain_failover import create_failover_llm

llm = create_failover_llm(
    primary_url="http://localhost:8001/v1",
    secondary_url="http://localhost:8002/v1",
)
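
For readers curious what discovery against /models might involve, here is a minimal sketch using only the standard library. It assumes the endpoint returns the usual OpenAI-style list payload ({"data": [{"id": ...}, ...]}) and simply takes the first entry; first_model_id and discover_model_id are hypothetical helpers, and how create_failover_llm actually selects a model is not shown in this README.

```python
import json
from urllib.request import urlopen


def first_model_id(payload: dict) -> str:
    """Pick the first model id from an OpenAI-style model list payload."""
    models = payload.get("data") or []
    if not models:
        raise ValueError("endpoint reported no models")
    return models[0]["id"]


def discover_model_id(base_url: str, timeout: float = 5.0) -> str:
    """GET {base_url}/models and return the first served model id.

    Assumption: the server needs no auth header for this route.
    """
    with urlopen(f"{base_url.rstrip('/')}/models", timeout=timeout) as resp:
        return first_model_id(json.load(resp))
```

A call such as discover_model_id("http://localhost:8001/v1") would then return whatever id the server at that address reports first.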

Bonus helper

extract_token_metrics(response.response_metadata) normalises token counts and timings across OpenAI-compatible and Ollama metadata shapes into a single {input_tokens, output_tokens, prompt_time, generation_time} dict.
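
As a rough picture of what such normalisation involves, the sketch below maps two common metadata shapes onto the four keys named above. It is an illustration, not the package's code: normalise_token_metrics is a hypothetical name, the OpenAI-style shape is assumed to carry a token_usage dict with prompt_tokens and completion_tokens, the Ollama-style shape is assumed to carry prompt_eval_count, eval_count and nanosecond durations, and reporting timings in seconds is also an assumption.

```python
from typing import Any, Optional

NS_PER_SECOND = 1_000_000_000  # Ollama reports durations in nanoseconds


def normalise_token_metrics(metadata: dict[str, Any]) -> dict[str, Optional[float]]:
    """Map OpenAI-style or Ollama-style response metadata onto one dict."""
    usage = metadata.get("token_usage")
    if usage:
        # OpenAI-compatible shape: token counts only, no timings assumed.
        return {
            "input_tokens": usage.get("prompt_tokens"),
            "output_tokens": usage.get("completion_tokens"),
            "prompt_time": None,
            "generation_time": None,
        }

    def seconds(key: str) -> Optional[float]:
        value = metadata.get(key)
        return value / NS_PER_SECOND if value is not None else None

    # Ollama-style shape: counts and durations sit at the top level.
    return {
        "input_tokens": metadata.get("prompt_eval_count"),
        "output_tokens": metadata.get("eval_count"),
        "prompt_time": seconds("prompt_eval_duration"),
        "generation_time": seconds("eval_duration"),
    }
```

Missing fields come back as None in this sketch, so callers can tell "not reported" apart from zero.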

License

MIT
