Primary/secondary failover wrapper for LangChain chat models, with tool-calling preserved across failover.

langchain-failover

A tiny, dependency-light primary/secondary failover wrapper for LangChain chat models. Point it at two chat models: it serves from the primary, transparently falls back to the secondary on connection errors, and switches back once the primary recovers. Tool calling keeps working across the failover.

from langchain_openai import ChatOpenAI
from langchain_failover import FailoverChatModel

primary = ChatOpenAI(base_url="http://gpu-box:8001/v1", api_key="x", model="local")
backup  = ChatOpenAI(base_url="http://cpu-box:8002/v1", api_key="x", model="local")

llm = FailoverChatModel(primary=primary, secondary=backup)

llm.invoke("Summarise this incident…")   # served by primary
# …primary host dies…
llm.invoke("And the next one?")           # transparently served by backup
# …primary comes back…
llm.invoke("One more")                     # back on primary, logged as recovered

Install

pip install langchain-failover            # core
pip install "langchain-failover[openai]"  # + langchain-openai for create_failover_llm

Why not RunnableWithFallbacks / .with_fallbacks()?

LangChain ships per-invocation fallbacks, and they're great for what they do. This package exists for the cases they don't cover well:

  • Stateful recovery. FailoverChatModel remembers which leg it is serving from and logs the transition in both directions; the active property reports the current leg. .with_fallbacks() is stateless: every call retries the (possibly still-dead) primary first.
  • Tool-calling survives failover. bind_tools is overridden to bind on both legs and return another FailoverChatModel. With strict langchain-core (>=1.4, where BaseChatModel.bind_tools raises by default), naïve wrappers break at bind time; agents using this one keep working.
  • Connection-aware, not blanket. It fails over only on connection/network errors. It walks the exception's __cause__/__context__ chain, so a socket error wrapped three layers deep still counts. A ValueError from a bad prompt propagates instead of being silently retried on a second endpoint.
  • Mid-stream safety. During stream(), it fails over only if the primary dies before the first token, so you never get duplicated, half-streamed output.
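
The last two bullets can be illustrated with a short standard-library sketch. This is not the package's implementation: is_connection_error, stream_with_failover and the CONNECTION_ERRORS tuple are hypothetical names, and the set of exception types that counts as a "connection error" is an assumption made for illustration.

```python
from typing import Callable, Iterable, Iterator

# Assumption: these built-in types stand in for "connection/network errors".
# The list the package actually uses is not shown in this README.
CONNECTION_ERRORS = (ConnectionError, TimeoutError)


def is_connection_error(exc: BaseException) -> bool:
    """Return True if exc, or anything in its cause/context chain, is a connection error."""
    seen = set()  # guards against cycles in the chain
    current = exc
    while current is not None and id(current) not in seen:
        if isinstance(current, CONNECTION_ERRORS):
            return True
        seen.add(id(current))
        # Prefer the explicit cause ("raise ... from err"), else the implicit context.
        current = current.__cause__ or current.__context__
    return False


def stream_with_failover(
    primary: Callable[[], Iterable[str]],
    secondary: Callable[[], Iterable[str]],
) -> Iterator[str]:
    """Yield chunks from primary; switch to secondary only if nothing was yielded yet."""
    emitted = False
    try:
        for chunk in primary():
            emitted = True
            yield chunk
        return
    except Exception as exc:
        # Once a chunk has been emitted, or for non-connection errors, re-raise
        # rather than restart the answer on the other leg.
        if emitted or not is_connection_error(exc):
            raise
    yield from secondary()
```

In this sketch a primary that raises before producing anything is replaced by the secondary, while a primary that fails after its first chunk surfaces the error to the caller, which is the behaviour the "Mid-stream safety" bullet describes.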

Local-model convenience

If you run local OpenAI-compatible servers (vLLM, mlx-lm, Ollama, LM Studio) and don't want to hardcode model names, create_failover_llm auto-discovers the served model id from each endpoint's /models:

from langchain_failover import create_failover_llm

llm = create_failover_llm(
    primary_url="http://localhost:8001/v1",
    secondary_url="http://localhost:8002/v1",
)
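
For readers curious what such discovery might involve, here is a minimal sketch that assumes the endpoint returns the standard OpenAI-style list body ({"object": "list", "data": [{"id": ...}, ...]}) and that the first listed model is the one to use. first_model_id and discover_model_id are hypothetical helpers, not functions exported by this package, and the selection rule create_failover_llm really applies may differ.

```python
import json
from urllib.request import urlopen


def first_model_id(models_payload: dict) -> str:
    """Pick the first model id from an OpenAI-style /models response body."""
    entries = models_payload.get("data") or []
    if not entries:
        raise ValueError("endpoint reported no models")
    return entries[0]["id"]


def discover_model_id(base_url: str, timeout: float = 5.0) -> str:
    """GET {base_url}/models and return the first served model id.

    Assumption: the server needs no auth header, which is common for
    local servers but not guaranteed.
    """
    with urlopen(f"{base_url.rstrip('/')}/models", timeout=timeout) as resp:
        return first_model_id(json.load(resp))
```

The parsing step is split from the HTTP call so it can be exercised without a running server, for example with first_model_id({"object": "list", "data": [{"id": "local-model"}]}).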

Bonus helper

extract_token_metrics(response.response_metadata) normalises token counts and timings across OpenAI-compatible and Ollama metadata shapes into a single {input_tokens, output_tokens, prompt_time, generation_time} dict.
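
To make the idea concrete, the sketch below normalises two metadata shapes into that dict. It is a hypothetical stand-in named normalise_metrics, not the package's code. It assumes that OpenAI-compatible metadata carries a token_usage dict with prompt_tokens and completion_tokens, that Ollama metadata carries prompt_eval_count, eval_count, prompt_eval_duration and eval_duration (durations in nanoseconds), and that the output times are in seconds. The units extract_token_metrics really reports are not stated here.

```python
from typing import Any, Optional

NANOSECONDS_PER_SECOND = 1_000_000_000


def _ns_to_seconds(value: Optional[int]) -> Optional[float]:
    """Convert a nanosecond duration to seconds, passing None through."""
    return None if value is None else value / NANOSECONDS_PER_SECOND


def normalise_metrics(metadata: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical stand-in for extract_token_metrics (assumed shapes, see lead-in)."""
    usage = metadata.get("token_usage")
    if usage:
        # OpenAI-compatible shape: token counts only, no timings assumed.
        return {
            "input_tokens": usage.get("prompt_tokens"),
            "output_tokens": usage.get("completion_tokens"),
            "prompt_time": None,
            "generation_time": None,
        }
    # Ollama shape: counts plus nanosecond durations.
    return {
        "input_tokens": metadata.get("prompt_eval_count"),
        "output_tokens": metadata.get("eval_count"),
        "prompt_time": _ns_to_seconds(metadata.get("prompt_eval_duration")),
        "generation_time": _ns_to_seconds(metadata.get("eval_duration")),
    }
```

Missing fields come back as None in this sketch rather than raising, so callers can log partial metrics from servers that omit timings.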

License

MIT
