robust-llm-chain
🇰🇷 Korean documentation: README_KO.md · ARCHITECTURE_KO.md · CONTRIBUTING_KO.md · SECURITY_KO.md · CODE_OF_CONDUCT_KO.md. The English documents are canonical.
Production-grade cross-vendor failover for LLM APIs. When your provider hits 529, hangs in a pending state, or throttles, the same request is automatically retried on the next vendor: sub-second detection, worker-coordinated round-robin.
robust-llm-chain is a small, focused Python library that adds cross-vendor failover to LLM API calls. It implements LangChain's Runnable interface, so it drops into existing chains, while exposing a richer acall() API for operational metadata (attempts, cost, usage).
It does one thing well: when Anthropic Direct returns 529 or stalls before the first token, the library transparently re-issues the same request to OpenRouter (or any other configured provider) within seconds, not minutes.
Why this exists
Two pain points that off-the-shelf libraries address only partially:
1. Anthropic 529 / Overloaded
Anthropic Direct periodically returns 529 Overloaded during demand spikes. A single retry against the same endpoint usually fails the same way. The right fix is cross-vendor failover (Claude is also reachable through Bedrock and OpenRouter), but most LLM client libraries only retry against the same provider.
2. Streaming "pending" provider
A provider can accept your request, hold the connection open, and never send the first token. With a 60-second total timeout, you wait the full minute before failing. With a 30-second timeout, you misclassify slow-but-real responses as failures.
robust-llm-chain separates the two:
- first_token_timeout (default 15s): if no token arrives in this window, give up on this provider and try the next one. Fallback happens before the user notices a delay.
- per_provider_timeout (default 60s): total response budget, applied after the first token has streamed.
- total_timeout: wall-clock cap across all attempts.
These two timeouts are the core differentiator: most libraries only have a single overall timeout, so a pending provider burns 30-60 seconds before fallback even starts.
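Concretely, both budgets live on one TimeoutConfig object (the same one used in Advanced usage below). A minimal sketch spelling out the defaults:

from robust_llm_chain import TimeoutConfig

# Defaults spelled out: 15s to see the first token, 60s total per provider.
timeouts = TimeoutConfig(per_provider=60.0, first_token=15.0)
# Passed to the chain as RobustChain(providers=[...], timeouts=timeouts); see Advanced usage below.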
Quickstart
Install:
pip install "robust-llm-chain[anthropic,openrouter]"
Set two environment variables (ANTHROPIC_API_KEY, OPENROUTER_API_KEY), then:
import asyncio
import os
from robust_llm_chain import RobustChain
chain = (
RobustChain.builder()
.add_provider(
type="anthropic",
model="claude-haiku-4-5-20251001",
api_key=os.environ["ANTHROPIC_API_KEY"],
priority=0, # preferred fallback target
)
.add_provider(
type="openrouter",
model="anthropic/claude-haiku-4.5",
api_key=os.environ["OPENROUTER_API_KEY"],
priority=1, # lower fallback preference; still RR-selected on alternate calls
)
.build()
)
# acall: convenience method that returns a ChainResult with operational metadata
result = asyncio.run(chain.acall("Introduce yourself in two lines."))
print(result.output.content) # BaseMessage.content
print(f"used: {result.provider_used.id} | tokens: {result.usage}") # metadata
The standard Runnable ainvoke() returns just a BaseMessage (for LangChain composition). To get attempts, cost, and usage in one call, use acall() or read chain.last_result.
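For example, a quick sketch against the chain built above:

# Standard LangChain entry point: only the message comes back.
message = await chain.ainvoke("Introduce yourself in two lines.")
print(message.content)

# Operational metadata for the same call is still recorded, scoped to this asyncio task.
print(chain.last_result.attempts)
print(chain.last_result.usage)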
What happens:
- Two providers are configured via the fluent builder, Anthropic Direct and OpenRouter, both active failover paths. Round-robin distributes the first attempt of each call across them (call 1 starts on Anthropic, call 2 on OpenRouter, ...). Priority decides the fallback order after the first provider fails: priority=0 (Anthropic) is tried before priority=1 (OpenRouter). See Provider configuration for the two-role table.
- Credentials are passed as values (api_key=...). Where the value comes from (env var, secrets manager, Vault) is your call. The builder never reads os.environ on your behalf, so the source is explicit at the call site; see the sketch after this list.
- If the first-attempt provider returns 529 / overloaded / pending, the request transparently fails over to the next provider in the priority-ordered fallback sequence (lowest priority first, regardless of which provider was attempted first). No additional configuration.
- Missing env var: os.environ["..."] raises KeyError with the exact var name (Python's standard fail-fast).
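As a sketch of the credentials-as-values point: get_secret below is a hypothetical stand-in for whatever secrets manager or Vault client you use; the builder only ever sees the resulting string.

from robust_llm_chain import RobustChain

def get_secret(name: str) -> str:
    # Hypothetical: fetch from Vault / AWS Secrets Manager / your config service.
    raise NotImplementedError

chain = (
    RobustChain.builder()
    .add_provider(
        type="anthropic",
        model="claude-haiku-4-5-20251001",
        api_key=get_secret("anthropic-api-key"),  # a value, not an env-var name
        priority=0,
    )
    .build()
)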
Defaults: single-worker / pricing=None / backend=LocalBackend(). For multi-worker round-robin, cost computation, or multi-key / multi-region patterns, see Provider configuration and Advanced usage below.
Three configuration paths are available: from_env (env-driven dict, single provider per type), builder (fluent, multi-key OK, fail-fast; used here), and an explicit providers=[ProviderSpec(...)] list. See the comparison matrix in Provider configuration.
Anatomy of a result
acall() returns ChainResult, eight fields with everything you need to log, audit, and observe a call:
| Field | Type | What it carries |
|---|---|---|
| output | BaseMessage | The model's response (output.content is the text) |
| input | list[BaseMessage] | The normalized prompt actually sent (after ChatPromptTemplate rendering) |
| usage | TokenUsage | input_tokens / output_tokens / cache_read_tokens / cache_write_tokens / total_tokens |
| cost | CostEstimate \| None | USD per category; None when no PricingSpec is attached (cost tracking is opt-in) |
| provider_used | ProviderSpec | The provider that actually returned the response (the last attempt). Credentials are masked in repr |
| model_used | ModelSpec | The model spec of the successful provider |
| attempts | list[AttemptRecord] | Every provider attempt, successful and failed, in order. See below |
| elapsed_ms | float | End-to-end wall clock time |
Happy path: single provider succeeds
result = await chain.acall("Introduce yourself in two lines.")
result.output.content       # → "Hello, I'm Claude. Let me give you a two-line introduction."
result.usage                # → TokenUsage(input_tokens=18, output_tokens=27, total_tokens=45, ...)
result.cost                 # → None (no PricingSpec attached)
result.provider_used.id     # → "anthropic-direct"
result.provider_used.type   # → "anthropic"
result.model_used.model_id  # → "claude-haiku-4-5-20251001"
result.elapsed_ms           # → 845.2
result.attempts             # → [
#     AttemptRecord(provider_id="anthropic-direct",
#                   phase="model_creation", elapsed_ms=12,
#                   error_type=None, fallback_eligible=False, ...),
#     AttemptRecord(provider_id="anthropic-direct",
#                   phase="first_token", elapsed_ms=320,
#                   error_type=None, fallback_eligible=False, ...),
# ]
Failover path: primary throttles, fallback succeeds
result = await chain.acall("...")
result.output.content     # → response from OpenRouter
result.provider_used.id   # → "openrouter-claude" (the one that succeeded)
result.attempts           # → [
#     AttemptRecord(provider_id="anthropic-direct",
#                   phase="first_token", elapsed_ms=412,
#                   error_type="OverloadedError",
#                   error_message="529: Overloaded",
#                   fallback_eligible=True, ...),
#     AttemptRecord(provider_id="openrouter-claude",
#                   phase="model_creation", elapsed_ms=8,
#                   error_type=None, fallback_eligible=False, ...),
#     AttemptRecord(provider_id="openrouter-claude",
#                   phase="first_token", elapsed_ms=290,
#                   error_type=None, fallback_eligible=False, ...),
# ]
AttemptRecord.error_message is already sanitized via _security.sanitize_message: provider key prefixes are masked and the string is truncated to 200 chars. Safe to log directly.
chain.last_result (contextvars-scoped) and aggregates
| Property | What it carries |
|---|---|
| chain.last_result | The most recent ChainResult for this asyncio task only (contextvars-isolated, so concurrent asyncio.gather(chain.acall(...), chain.acall(...)) calls don't see each other's results) |
| chain.total_token_usage | Cumulative TokenUsage across every successful call on this RobustChain instance (lock-protected) |
| chain.total_cost | Cumulative CostEstimate across every successful call (None until the first call with pricing) |
The standard Runnable ainvoke() returns just a BaseMessage. To inspect attempts / cost / usage after ainvoke or astream, read chain.last_result.
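A short sketch of those semantics (using the Quickstart chain):

import asyncio

async def main():
    # Each acall runs in its own task under gather, so last_result stays per-task.
    await asyncio.gather(chain.acall("first question"), chain.acall("second question"))
    # Instance-wide aggregates accumulate across both calls.
    print(chain.total_token_usage)   # cumulative TokenUsage
    print(chain.total_cost)          # None unless a PricingSpec is attached

asyncio.run(main())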
Logging
The library emits structured WARN/ERROR-only logs through Python's standard logging module. There is no DEBUG/INFO chatter, and prompt or response text is never logged; that is the application's responsibility (see SECURITY.md hardening #3).
Logger names
| Logger | Source | When it fires |
|---|---|---|
| robust_llm_chain.chain | RobustChain instance + from_env | provider build failures, fallback attempts, unknown provider type warnings |
| robust_llm_chain.observability.langsmith | cleanup_run | LangSmith outage (timeout / generic exception), backpressure drops |
Both honor whatever handler / formatter / level you configure on the root logger or these specific names. To silence one, logging.getLogger("robust_llm_chain.chain").setLevel(logging.ERROR) etc.
Structured fields (the extra payload)
Every WARN/ERROR record carries extra fields you can route in JSON formatters or aggregators (Datadog, Splunk, Loki, ...):
| Event | Fields |
|---|---|
| langsmith_cleanup_timeout | run_id |
| langsmith_cleanup_fail | run_id, error_type |
| langsmith_cleanup_drop | max_inflight |
Custom logger injection: RobustChain(providers=..., logger=my_logger) wires in your own logger if you want a per-chain stream.
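A sketch of routing those records as JSON; the formatter is illustrative, only the logger name and the extra field names come from the tables above:

import json
import logging

class ExtraJsonFormatter(logging.Formatter):
    # Serialize level + message plus the structured fields documented above.
    def format(self, record: logging.LogRecord) -> str:
        payload = {"level": record.levelname, "logger": record.name, "msg": record.getMessage()}
        for field in ("run_id", "error_type", "max_inflight"):
            if hasattr(record, field):
                payload[field] = getattr(record, field)
        return json.dumps(payload)

handler = logging.StreamHandler()
handler.setFormatter(ExtraJsonFormatter())
logging.getLogger("robust_llm_chain").addHandler(handler)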
What is NOT logged (by design)
- Prompt text (input) and response text (output.content): it is the application's job to persist ChainResult.input / ChainResult.output if needed
- API keys / AWS credentials: ProviderSpec.__repr__ masks them; AttemptRecord.error_message is sanitized via _security.sanitize_message before being stored
- Per-attempt success debug info: only WARN on failure / fallback events. Production-grade, low-cardinality
Installation & Extras
What gets pulled in by default:
langchain-core>=0.3 (transitive; provides Runnable / BaseChatModel / BaseMessage / PromptValue / ChatPromptTemplate). The umbrella langchain package is intentionally NOT a dependency: this library uses only the core abstractions, keeping the dependency footprint minimal. Provider SDKs (langchain-anthropic / langchain-openai / langchain-aws) and backends (aiomcache) are opt-in extras below.
| Command | What's included |
|---|---|
| pip install robust-llm-chain | Core only; langchain-core auto-pulled. No provider adapters, so from_env() raises NoProvidersConfigured until you add at least one extra |
| pip install "robust-llm-chain[anthropic]" | + langchain-anthropic (Anthropic Direct) |
| pip install "robust-llm-chain[openrouter]" | + langchain-openai (OpenRouter, OpenAI-compatible API) |
| pip install "robust-llm-chain[openai]" | + langchain-openai (OpenAI Direct) |
| pip install "robust-llm-chain[bedrock]" | + langchain-aws (AWS Bedrock: Claude / Llama / Nova / etc.) |
| pip install "robust-llm-chain[memcached]" | + aiomcache (async client for worker-coordinated round-robin) |
| pip install "robust-llm-chain[anthropic,openrouter,bedrock,memcached]" | Recommended production combo (3-way Claude failover) |
| pip install "robust-llm-chain[all]" | Every adapter and backend currently shipped |
A redis backend extra is planned for a future release; it is not yet shippable, so it is intentionally absent from the list above.
The library does not depend on python-dotenv. Loading .env files is up to your application.
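If your application does use a .env file, a minimal sketch of loading it yourself before from_env reads the environment:

from dotenv import load_dotenv  # python-dotenv, installed by your application
from robust_llm_chain import RobustChain

load_dotenv()  # populates os.environ from .env

chain = RobustChain.from_env(model_ids={"anthropic": "claude-haiku-4-5-20251001"})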
Provider configuration: three paths
There are three ways to tell RobustChain which providers to use. They differ in what they can express and how concise the call site is:
| Capability | RobustChain.from_env(model_ids={...}) | RobustChain.builder().add_provider(...).add_bedrock(...).build() | RobustChain(providers=[ProviderSpec(...)]) |
|---|---|---|---|
| Source of credentials | env vars (auto-read, dict key = type) | values passed via api_key= (read from anywhere: env, vault, secrets manager) | values passed via api_key= |
| Source of model_id | dict value | model="..." keyword arg | ModelSpec(model_id=...) field |
| One provider per type | ✅ | ✅ | ✅ |
| Multiple keys for the same type (e.g. anthropic-1 + anthropic-2 for rate-limit headroom) | ❌ (dict key is unique) | ✅ (call add_provider(type="anthropic", ...) twice with distinct api_key= / id=) | ✅ (same type, distinct id) |
| Multi-region (Bedrock east + west) | ❌ (single AWS_REGION env) | ✅ (explicit region= per add_bedrock(...)) | ✅ (explicit per-spec region) |
| Different model_ids on the same type | ❌ (dict key is unique) | ✅ (different model= per call) | ✅ (different model.model_id per spec) |
| Per-spec priority ordering | ❌ (uniform default 0) | ✅ (priority= keyword) | ✅ (explicit ordering primary → fallback) |
| Missing API_KEY behavior | silent skip: that provider is dropped, others still build | depends on caller: os.environ["..."] raises KeyError, vault libs raise their own errors | n/a (you supplied the key explicitly) |
| Mental model | 12-factor / env-driven | fluent, credentials-as-values | code-as-config |
| Use when | Dev, single-vendor-per-type production, env-driven deploys | Most production use cases: multi-key / multi-region / cross-vendor with credentials sourced from anywhere | When you already have ProviderSpec instances from elsewhere (config loader, orchestrator, etc.) |
Quick decision tree
- "Just want one Claude + one OpenAI from env vars, simplest possible" โ
from_env. Done. - "Need multi-key / multi-region / cross-vendor / explicit priority" โ
RobustChain.builder()(recommended for most production). Seeexamples/builder.py. - "Already constructing
ProviderSpecinstances elsewhere in code (config loader, orchestrator)" โ explicitproviders=[ProviderSpec(...)]list. See the inline code in Advanced usage below.
Two-role traffic model (v0.4.0+):
| Role | What it controls | When it kicks in |
|---|---|---|
| Round-robin | Which provider this call attempts first (over user-listed order) | Call start, every call |
| Priority | Order of fallback attempts after the first provider fails (lower wins) | Only when the first attempt fails |
priority=: lower value wins (DNS MX / cron / Linux nice convention); ties preserve user-listed order. Example with [A(p=0), B(p=1), C(p=2)]: call 1 = A→B→C, call 2 = B→A→C, call 3 = C→A→B. RR distributes initial-attempt load; priority decides who picks up after a failure.
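A sketch of that two-role split with the fluent builder (the third provider type and the model ids are illustrative):

import os
from robust_llm_chain import RobustChain

chain = (
    RobustChain.builder()
    .add_provider(type="anthropic", model="claude-haiku-4-5-20251001",
                  api_key=os.environ["ANTHROPIC_API_KEY"], priority=0)   # A
    .add_provider(type="openrouter", model="anthropic/claude-haiku-4.5",
                  api_key=os.environ["OPENROUTER_API_KEY"], priority=1)  # B
    .add_provider(type="openai", model="gpt-4o-mini",
                  api_key=os.environ["OPENAI_API_KEY"], priority=2)      # C
    .build()
)
# Round-robin rotates the first attempt: call 1 starts on A, call 2 on B, call 3 on C.
# Fallback after a failed first attempt always follows priority (A before B before C),
# skipping the provider that already failed, e.g. call 2 = B → A → C.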
Recognized environment variables (for from_env)
| Variable | Provider | Active | Notes |
|---|---|---|---|
| ANTHROPIC_API_KEY | anthropic | ✅ | Anthropic Direct |
| OPENROUTER_API_KEY | openrouter | ✅ | OpenRouter (any vendor's model) |
| OPENAI_API_KEY | openai | ✅ | OpenAI Direct (gpt-*, o1-*, etc.) |
| AWS_ACCESS_KEY_ID + AWS_SECRET_ACCESS_KEY + AWS_REGION | bedrock | ✅ | All three required; missing any one → provider skipped |
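For example, a from_env sketch that asks for three provider types; whichever env vars are missing simply drop that provider (model ids are illustrative):

from robust_llm_chain import RobustChain

chain = RobustChain.from_env(model_ids={
    "anthropic": "claude-haiku-4-5-20251001",
    "openrouter": "anthropic/claude-haiku-4.5",
    "bedrock": "anthropic.claude-haiku-4-5-20251001-v1:0",
})
# With only ANTHROPIC_API_KEY and OPENROUTER_API_KEY set, the bedrock entry is
# silently skipped (all three AWS variables are required); the other two still build.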
Default Behavior
| Setting | Default | Meaning |
|---|---|---|
| backend | LocalBackend() (asyncio.Lock) | Single-worker-safe round-robin |
| per_provider_timeout | 60s | Total response budget per provider |
| first_token_timeout | 15s | Fallback if the first chunk doesn't arrive in this window |
| total_timeout | per_provider × N + 60s buffer, capped at 360s | Wall-clock cap across all attempts |
| stream_cleanup_timeout | 2s | aclose() budget when falling back during streaming |
| temperature | 0.1 | Per-call override available |
| max_output_tokens | ModelSpec.max_output_tokens or 4096 | Per-call override available |
| pricing | None (result.cost = None) | Cost computation skipped without pricing |
| Logger name | "robust_llm_chain" | Hierarchical (e.g. robust_llm_chain.stream) |
| Logger level | WARNING | Set to INFO/DEBUG for fallback diagnostics |
| Type hints | py.typed marker shipped | mypy/pyright recognize types out of the box |
| chain.invoke() (sync) | not implemented | Wrap with asyncio.run() |
Philosophy: zero environment variables, zero external files required. RobustChain(...) runs immediately.
Three things that make this different
1. Streaming first-token timeout for pending detection. Most libraries only have an overall timeout, so a pending provider burns the full window before fallback. This library measures first-chunk arrival separately (default 15s) and fails over the moment that budget elapses.
2. Worker-coordinated round-robin (Memcached today; pluggable IndexBackend for Redis or your own). In a multi-worker deployment (gunicorn × 8, etc.), most OSS libraries hold the round-robin index per process. With 8 workers that means 8 simultaneous requests can land on the same provider. This library shares the index through a backend (Memcached or your own IndexBackend implementation) so the load actually spreads.
3. Cross-vendor (and cross-model) failover. Same prompt, multiple paths. Active providers: Anthropic Direct + OpenRouter + OpenAI Direct + AWS Bedrock. Common patterns:
   - Same-model 3-way failover for Claude: Anthropic Direct → Bedrock (us-east-1) → OpenRouter
   - Cross-region within Bedrock: id="bedrock-east" (us-east-1) → id="bedrock-west" (us-west-2)
   - Cross-vendor cross-model: Claude on Anthropic → GPT on OpenAI when "we just need some answer"
   - Multi-key per vendor: id="anthropic-primary" → id="anthropic-backup" for tenant isolation or rate-limit headroom
Who is this for
- Long-running multi-worker Python services (FastAPI + gunicorn, Django, Celery)
- Teams running Claude across multiple paths (Anthropic Direct + Bedrock + OpenRouter), or mixing Claude + GPT for survivability
- Anyone who has actually been paged at 3am because of 529 Overloaded or stalled streams
- Existing LangChain Runnable users: drop-in compatible
Not for: serverless / Edge runtimes, single-provider stacks, multimodal-first workloads.
Compared to other libraries
| Library | What it does | What this library adds on top |
|---|---|---|
| litellm | Comprehensive multi-provider router with weighted / cost-based routing | Narrower scope: cross-vendor failover, first-token timeout, worker-coordinated round-robin |
| LangChain Runnable.with_fallbacks | Sequential exception-based fallback inside one Runnable | Adds first-token timeout (sub-second pending detection) + inter-worker round-robin via shared backend |
| Vercel AI SDK | TypeScript/edge-first SDK with streaming UX | This is async Python for long-running multi-worker servers, a different runtime target |
For most users the answer is "use both": this library handles the cross-vendor failover layer, while litellm handles broader routing if you have it. They compose: robust-llm-chain is a single Runnable you can plug anywhere.
Advanced usage
Runnable examples: all four patterns below (multi-key, 3-way Claude failover, cross-vendor Claude → GPT, Bedrock multi-region) are runnable scripts in examples/builder.py (using RobustChain.builder()). Try uv run python examples/builder.py multikey (or 3way / xvendor / multiregion). The inline code blocks below show the same patterns expressed via explicit providers=[ProviderSpec(...)] for use cases where you already have spec instances from a config loader.
Multi-worker production (Memcached-coordinated round-robin)
import aiomcache
from robust_llm_chain import RobustChain
from robust_llm_chain.backends import MemcachedBackend
memcached = aiomcache.Client("memcached.internal", 11211)
chain = RobustChain.from_env(
model_ids={
"anthropic": "claude-haiku-4-5-20251001",
"openrouter": "anthropic/claude-haiku-4.5",
},
backend=MemcachedBackend(client=memcached, key_prefix="myapp:rr"),
)
Memcached failure semantics: fail-closed. If Memcached is unreachable, the library raises BackendUnavailable rather than silently falling back to a local index. The whole point of the worker-coordinated round-robin is consistency across workers; an automatic fallback would silently break that. Catch the error in your app and decide explicitly (the healthcheck-then-rebuild-chain pattern is recommended; see the sketch below).
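A sketch of that explicit decision, assuming LocalBackend is importable from robust_llm_chain.backends alongside MemcachedBackend:

from robust_llm_chain import RobustChain
from robust_llm_chain.backends import LocalBackend
from robust_llm_chain.errors import BackendUnavailable

async def ask(prompt: str):
    try:
        return await chain.acall(prompt)
    except BackendUnavailable:
        # Deliberate, visible degradation: rebuild with a per-process index
        # until Memcached is healthy again, then swap back.
        degraded = RobustChain.from_env(
            model_ids={
                "anthropic": "claude-haiku-4-5-20251001",
                "openrouter": "anthropic/claude-haiku-4.5",
            },
            backend=LocalBackend(),
        )
        return await degraded.acall(prompt)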
Explicit ProviderSpec (when env-based config isn't enough)
import os
from robust_llm_chain import RobustChain, ProviderSpec, ModelSpec, PricingSpec, TimeoutConfig
chain = RobustChain(
providers=[
ProviderSpec(
id="anthropic-direct",
type="anthropic",
api_key=os.environ["ANTHROPIC_API_KEY"],
model=ModelSpec(
model_id="claude-haiku-4-5-20251001",
pricing=PricingSpec(input_per_1m=0.80, output_per_1m=4.00),
max_output_tokens=8192,
),
),
ProviderSpec(
id="openrouter",
type="openrouter",
api_key=os.environ["OPENROUTER_API_KEY"],
model=ModelSpec(
model_id="anthropic/claude-haiku-4.5",
pricing=PricingSpec(input_per_1m=1.00, output_per_1m=5.00),
),
),
],
timeouts=TimeoutConfig(per_provider=60.0, first_token=15.0),
)
Multiple keys per vendor
import os
from robust_llm_chain import RobustChain, ProviderSpec, ModelSpec
# Two Anthropic API keys: round-robin between them, fail over if one rate-limits.
# Same shape works for any single-key provider (OPENAI_API_KEY_1 / _2, etc.).
# Naming is your call (_1/_2, _PRIMARY/_BACKUP, _TEAM_A/_TEAM_B, โฆ).
chain = RobustChain(providers=[
ProviderSpec(
id="anthropic-1",
type="anthropic",
api_key=os.environ["ANTHROPIC_API_KEY_1"],
model=ModelSpec(model_id="claude-haiku-4-5-20251001"),
),
ProviderSpec(
id="anthropic-2",
type="anthropic",
api_key=os.environ["ANTHROPIC_API_KEY_2"],
model=ModelSpec(model_id="claude-haiku-4-5-20251001"),
),
])
Bedrock cross-region failover (us-east-1 → us-west-2)
import os
from robust_llm_chain import RobustChain, ProviderSpec, ModelSpec
chain = RobustChain(providers=[
ProviderSpec(
id="bedrock-east",
type="bedrock",
aws_access_key_id=os.environ["AWS_ACCESS_KEY_ID"],
aws_secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
region="us-east-1",
model=ModelSpec(model_id="anthropic.claude-haiku-4-5-20251001-v1:0"),
),
ProviderSpec(
id="bedrock-west",
type="bedrock",
aws_access_key_id=os.environ["AWS_ACCESS_KEY_ID"],
aws_secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
region="us-west-2",
model=ModelSpec(model_id="anthropic.claude-haiku-4-5-20251001-v1:0"),
),
])
Cross-vendor same-model: 3-way Claude (Anthropic + Bedrock + OpenRouter)
chain = RobustChain.from_env(model_ids={
"anthropic": "claude-haiku-4-5-20251001",
"bedrock": "anthropic.claude-haiku-4-5-20251001-v1:0",
"openrouter": "anthropic/claude-haiku-4.5",
})
# Round-robin between three paths to Claude. If Anthropic 529s, fail over to
# Bedrock or OpenRouter automatically.
Cross-vendor cross-model: Claude โ GPT
chain = RobustChain.from_env(model_ids={
"anthropic": "claude-haiku-4-5-20251001",
"openai": "gpt-4o-mini",
})
# When "we just need some answer" matters more than "exactly the same model".
Streaming
async for chunk in chain.astream("Tell me a joke."):
print(chunk.content, end="", flush=True)
# After completion, metadata is available
print(chain.last_result.attempts, chain.last_result.cost)
Error handling
import logging

from robust_llm_chain.errors import (
    AllProvidersFailed, ProviderTimeout, FallbackNotApplicable, BackendUnavailable,
    ProviderInactive, ProviderModelCreationFailed,
)

log = logging.getLogger(__name__)  # stand-in for your application logger
try:
result = await chain.acall("...")
except BackendUnavailable as e:
    # Memcached down: switch to LocalBackend explicitly or fail the request
log.error("backend unavailable", extra={"error": str(e)})
except ProviderInactive:
# Adapter extras not installed (e.g. `pip install robust-llm-chain[anthropic]`
    # missing): environment problem, not a transient error. Fail fast.
raise
except FallbackNotApplicable:
    # Auth error or parser failure: no point retrying
raise
except AllProvidersFailed as e:
for attempt in e.attempts:
log.error("provider failed", extra={"provider": attempt.provider_id, "error": attempt.error_type})
except ProviderTimeout as e:
log.error(f"total timeout in phase={e.phase}")
Adapter build errors (ProviderModelCreationFailed, v0.4.1+): any raw SDK / config exception raised by adapter.build() (e.g. ValueError("model id wrong"), botocore.errorfactory.ValidationException) is wrapped into ProviderModelCreationFailed so external callers see a single typed contract instead of vendor-specific exceptions. The original raw exception is preserved on __cause__. Wrapped errors are fallback-eligible: multi-provider fault tolerance treats one vendor's config error as "try the next one". A persistently broken provider therefore fails silently as long as another succeeds; monitor ChainResult.attempts for phase == "model_creation" to detect chronic config drift (see the sketch below). All providers failing surfaces as AllProvidersFailed.
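A sketch of that monitoring hook, using only the AttemptRecord fields shown earlier (the reporting call is a placeholder for your metrics/alerting system):

def report_config_drift(result) -> None:
    # Flag every provider whose adapter failed to build on this call.
    for attempt in result.attempts:
        if attempt.phase == "model_creation" and attempt.error_type is not None:
            # Placeholder: route to your metrics/alerting pipeline instead of print.
            print(f"config drift? provider={attempt.provider_id} error={attempt.error_type}")

result = await chain.acall("...")
report_config_drift(result)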
Architecture
Module structure, dependency graph, call lifecycle (acall / ainvoke / astream), error flow, and extension points (custom ProviderAdapter / IndexBackend) are documented in ARCHITECTURE.md. Read that before opening a PR or wiring a custom adapter.
Status
v0.4.x in pre-1.0 active development. CI matrix: Python 3.11 / 3.12 / 3.13. Public API may break before 1.0; all changes are documented in CHANGELOG.md (v0.3 and v0.4 each shipped a BREAKING failover-semantic change โ see migration notes there).
As-is: no support guarantee. Provided under the MIT license; no SLA, no issue-response timeline, no feature-request commitment. Bugs are fixed when convenient. If something doesn't work for your use case, fork it. PRs welcome but not depended on. This is a personal project optimized for the maintainer's own dogfooding.
⚠️ Upgrading from v0.3.x? v0.4.0 splits round-robin and priority into two distinct roles: RR picks the first provider this call attempts (over user-listed order); priority orders the fallback sequence after that first provider fails. v0.3 used a single priority-sorted rotation, so fallback order shifted every call. v0.4 makes fallback always honor priority. Attempt sequences differ from v0.3 whenever your user-listed order does not match priority-sorted order (and even when it does, fallback order changes for any call where the first provider fails). The only no-op case is n=1 (one provider). See CHANGELOG [0.4.0] for the migration table. Verify your traffic and fallback ordering before upgrading, regardless of N.

⚠️ Upgrading from v0.2.x? v0.3.0 flipped the priority= semantic to lower-value-wins (DNS MX / cron convention) AND consolidated 4 typed add_* builder methods into add_provider(type=...) + add_bedrock(...). If you copy-pasted the v0.2 README's priority=0 (labeled primary), your traffic was hitting the fallback first. v0.3 makes it actually go to the primary. Verify your traffic distribution before/after upgrade. Full migration in CHANGELOG.md [0.3.0].
License
MIT. See LICENSE.