Composed reliability for multi-model LLM calls — quorum fan-out + primary/failover, category-dispatched, transparent degradation. Built on keel-llm-protocol + keel-circuit-breaker.

Maintainers

ryakkali

Project description

keel-llm-reliability

Production-grade reliability for multi-model LLM calls — quorum fan-out and primary/failover, category-dispatched, with transparent degradation. The composed solution, not a parts bin.

Part of the Keel toolkit. It composes keel-llm-protocol (the error taxonomy) and keel-circuit-breaker into the consumer-side machinery that acts on typed errors, so you don't hand-write the fan-out/failover loop.

Why it exists

A typed error taxonomy tells you what failed; this tells your app what to do about it. The core lesson (measured in production): a rate-limited model is healthy, not failing — defer it, don't trip its circuit. Acting on that one distinction moved a throttled model from 3/10 to 10/10 availability. This package generalizes it into two strategies and makes every decision visible.

Is this for you?

Adopt when: multi-model apps that need failover or ensemble; production traffic where rate-limit handling matters; operators who'll read the visible-degradation trail.

Skip when: single-provider apps (the SDK + a basic retry is enough); you already have working in-tree reliability; prototypes / scripts; you need a runtime/framework rather than called helpers.

Deciding test: does this deliver a capability your codebase genuinely lacks, or could you get the same outcome with the SDK + a library you already trust? If the latter, skip.

Install

# this package + at least one adapter for the providers you call:
pip install keel-llm-reliability keel-llm-adapter-openai
#   keel-llm-adapter-anthropic / keel-llm-adapter-google also available;
#   reliability itself pulls in keel-llm-protocol + keel-circuit-breaker.

Quickstart (copy-paste runnable)

import asyncio
from keel_llm_reliability import ResilientClient, Request
from keel_llm_adapter_openai import OpenAIAdapter
from keel_llm_protocol import user

# Adapters are plain objects implementing keel-llm-protocol. Any OpenAI-compatible
# endpoint works (OpenAI, Groq, OpenRouter, Mistral, vLLM, Ollama, …); mix providers freely.
primary  = OpenAIAdapter(model="llama-3.3-70b-versatile", api_key="gsk_…",
                         base_url="https://api.groq.com/openai/v1", provider="groq")
fallback = OpenAIAdapter(model="llama-3.1-8b", base_url="http://localhost:11434/v1",
                         provider="local")

client = ResilientClient([primary, fallback])     # ordered: primary, then fallbacks

async def main() -> None:
    result = await client.failover(Request(messages=[user("One-line summary of TCP.")]))
    if result.succeeded:
        print(result.response.text)
    for a in result.attempts:                     # every decision is visible data
        print(a.model_key, a.outcome, f"{a.latency_ms}ms")

asyncio.run(main())

Two strategies

import asyncio
from keel_llm_reliability import ResilientClient, Request
from keel_llm_protocol import user

client = ResilientClient([primary, fallback])      # adapters built as in the Quickstart
req = Request(messages=[user("Summarize this in one line.")])

async def main() -> None:
    # Primary + ordered failover: the single-good-answer case (most apps).
    result = await client.failover(req)
    if result.succeeded:
        print(result.response.text)

    # Quorum / parallel fan-out: the ensemble case.
    result = await client.fan_out(req)
    for r in result.successes:        # every model that answered
        ...

asyncio.run(main())                   # await needs a running event loop

Both are also available as plain functions (fan_out, failover) if you'd rather wire collaborators yourself.
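To make the difference between the two strategies concrete, here is a minimal, library-free sketch using only asyncio. It is not how keel-llm-reliability is implemented; `fake_model`, `fan_out_sketch`, and `failover_sketch` are hypothetical names invented for illustration, and the fake provider simply sleeps and optionally raises.

```python
import asyncio


async def fake_model(name: str, delay: float, fail: bool) -> str:
    """Hypothetical stand-in for a provider call."""
    await asyncio.sleep(delay)
    if fail:
        raise RuntimeError(f"{name} unavailable")
    return f"{name}: ok"


async def fan_out_sketch(calls):
    """Run every call concurrently; keep successes, keep failures visible."""
    results = await asyncio.gather(*calls, return_exceptions=True)
    successes = [r for r in results if not isinstance(r, BaseException)]
    failures = [r for r in results if isinstance(r, BaseException)]
    return successes, failures


async def failover_sketch(factories):
    """Try candidates in order; return the first answer, or None if all fail."""
    for make_call in factories:
        try:
            return await make_call()
        except RuntimeError:
            continue
    return None


async def demo():
    successes, failures = await fan_out_sketch([
        fake_model("a", 0.01, False),
        fake_model("b", 0.01, True),
        fake_model("c", 0.01, False),
    ])
    first = await failover_sketch([
        lambda: fake_model("b", 0.01, True),
        lambda: fake_model("c", 0.01, False),
    ])
    return successes, len(failures), first


successes, failure_count, first = asyncio.run(demo())
print(successes, failure_count, first)
```

Fan-out pays for every call and returns every answer; failover pays only until the first success. The real package layers breakers, limiters, and the attempt trail on top of this basic shape.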

Transparent degradation — every decision is data

No silent retries, no hidden fallbacks. Every provider interaction is a visible Attempt:

result = await client.failover(req)
for a in result.attempts:
    print(a.model_key, a.outcome, a.latency_ms, a.error and a.error.category)
# groq:…     deferred_backpressure  120   backpressure   (throttled — skipped, NOT failed)
# gemini:…   failed                 310   transient      (5xx — counted, failed over)
# openai:…   success                420   None

outcome is one of success / preempted_open / preempted_limited / deferred_backpressure / failed. A failed attempt carries its error.category (transient vs terminal) so you can tell "flaky" from "broken config." Degradation you can see and operate on — not a black box.
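Because the trail is plain data, it can be aggregated like any other list. The sketch below is an illustration under stated assumptions: `Attempt` and `ErrorInfo` are hypothetical stand-ins that carry only the fields printed above (`model_key`, `outcome`, `latency_ms`, `error.category`), not the package's real classes, and `summarize` is an invented helper.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class ErrorInfo:
    """Hypothetical stand-in for the typed error attached to an attempt."""
    category: str  # "backpressure", "transient", or "terminal"


@dataclass
class Attempt:
    """Hypothetical stand-in carrying only the fields shown in the trail."""
    model_key: str
    outcome: str
    latency_ms: int
    error: Optional[ErrorInfo] = None


def summarize(attempts):
    """Count outcomes, then split failed attempts into flaky vs. broken."""
    outcomes = Counter(a.outcome for a in attempts)
    failed = [a for a in attempts if a.outcome == "failed" and a.error]
    flaky = [a.model_key for a in failed if a.error.category == "transient"]
    broken = [a.model_key for a in failed if a.error.category == "terminal"]
    return outcomes, flaky, broken


# The same three-model trail as in the comment block above.
attempts = [
    Attempt("groq:…", "deferred_backpressure", 120, ErrorInfo("backpressure")),
    Attempt("gemini:…", "failed", 310, ErrorInfo("transient")),
    Attempt("openai:…", "success", 420),
]
outcomes, flaky, broken = summarize(attempts)
print(dict(outcomes), flaky, broken)
```

A summary like this is the kind of thing an operator might feed into metrics or alerts: a growing `broken` list points at configuration, while a growing `flaky` list points at provider health.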

How it behaves (category-dispatched)

Error category	fan_out (quorum)	failover
`backpressure` (429)	defer — contributes nothing this round; no breaker failure	route to the next candidate immediately; no breaker failure
`transient` (5xx, timeout)	record a breaker failure; that model contributes nothing	record a breaker failure; fail over (optionally retry the same model up to `transient_retries`)
`terminal` (auth/bad-request/context/content)	visible `failed`; no breaker failure (request-level, not model health)	visible `failed`; fail over

Before any dispatch, both strategies preempt: a model whose breaker is open (preempted_open) or whose limiter predicts it's full (preempted_limited) is skipped — predict, don't block. There are no hidden sleeps; exhaustion returns visibly (empty successes / response=None).
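The failover column of the table, plus breaker preemption, can be sketched as a single decision function. This is a toy model, not the package's code: `ToyBreaker`, `failover_step`, and `threshold` are hypothetical, limiter preemption and `transient_retries` are left out, and treating `terminal` errors as not counting against the breaker in failover is an assumption carried over from the fan_out column.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBreaker:
    """Hypothetical breaker: opens a model's circuit after `threshold` failures."""
    threshold: int = 2
    failures: dict = field(default_factory=dict)

    def is_open(self, key: str) -> bool:
        return self.failures.get(key, 0) >= self.threshold

    def record_failure(self, key: str) -> None:
        self.failures[key] = self.failures.get(key, 0) + 1


def failover_step(key: str, error_category, breaker: ToyBreaker) -> str:
    """Return the outcome label for one candidate in a failover pass.

    `error_category` is None on success, else one of the three categories.
    """
    if breaker.is_open(key):
        return "preempted_open"          # skipped before dispatch
    if error_category is None:
        return "success"
    if error_category == "backpressure":
        return "deferred_backpressure"   # throttled but healthy: no breaker failure
    if error_category == "transient":
        breaker.record_failure(key)      # 5xx/timeout counts against model health
    return "failed"                      # transient or terminal: visible, move on


breaker = ToyBreaker(threshold=2)
trail = [
    failover_step("a", "backpressure", breaker),
    failover_step("a", "backpressure", breaker),
    failover_step("a", "backpressure", breaker),
    failover_step("b", "transient", breaker),
    failover_step("b", "transient", breaker),
    failover_step("b", None, breaker),
    failover_step("c", "terminal", breaker),
]
print(trail)
```

Model "a" is throttled three times and its circuit never opens, while model "b" is preempted after two transient failures. That asymmetry is the rate-limited-is-healthy distinction described under "Why it exists".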

Injected collaborators — born ready for scale

The Breaker and (optional) Limiter are injected async protocols, never owned:

from keel_llm_reliability import InProcessBreaker, ResilientClient

# Default: zero-config in-process breaker (wraps keel-circuit-breaker).
client = ResilientClient(adapters)                       # InProcessBreaker()

# At scale: swap in a Redis-backed breaker/limiter (same protocol) for cross-worker
# state — the orchestrator code doesn't change.
client = ResilientClient(adapters, breaker=my_redis_breaker, limiter=my_redis_limiter)

The protocols are async precisely so a Redis-backed implementation (which does network I/O) can satisfy them — the in-process default just returns immediately.
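As a rough illustration of why an async protocol lets implementations be swapped, consider the sketch below. The method names (`allow`, `record_failure`, `record_success`) are assumptions invented for this example and may not match the package's actual Breaker protocol; check the source before implementing a real one.

```python
import asyncio
from typing import Protocol


class AsyncBreaker(Protocol):
    """Hypothetical protocol shape; the real method names may differ."""
    async def allow(self, key: str) -> bool: ...
    async def record_failure(self, key: str) -> None: ...
    async def record_success(self, key: str) -> None: ...


class InMemoryBreaker:
    """Keeps state in a dict and returns immediately, like an in-process default."""
    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold
        self.failures: dict[str, int] = {}

    async def allow(self, key: str) -> bool:
        return self.failures.get(key, 0) < self.threshold

    async def record_failure(self, key: str) -> None:
        self.failures[key] = self.failures.get(key, 0) + 1

    async def record_success(self, key: str) -> None:
        self.failures.pop(key, None)


async def call_guarded(breaker: AsyncBreaker, key: str, ok: bool) -> str:
    """Caller code depends only on the protocol, not on where state lives."""
    if not await breaker.allow(key):
        return "preempted_open"
    if ok:
        await breaker.record_success(key)
        return "success"
    await breaker.record_failure(key)
    return "failed"


async def demo() -> list[str]:
    breaker = InMemoryBreaker(threshold=2)
    return [await call_guarded(breaker, "m", ok) for ok in (False, False, True)]


results = asyncio.run(demo())
print(results)
```

A Redis-backed class with the same three coroutines could replace `InMemoryBreaker` without touching `call_guarded`; its methods would simply await network calls instead of returning at once.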

Status

0.1.2 — quorum semantics are grounded in a production multi-model deployment; failover serves the single-answer broad base. 0.x while the API stabilizes through year one (breaking changes possible at minor bumps, documented in the CHANGELOG; pin exact versions).

The Keel toolkit

Composable, vendor-neutral LLM reliability libraries on PyPI: keel-llm-reliability · keel-llm-protocol · keel-llm-adapter-openai · keel-llm-adapter-anthropic · keel-llm-adapter-google · keel-circuit-breaker

MIT licensed.

Release history

0.1.2 (this version): May 23, 2026
0.1.1: May 22, 2026
0.1.0: May 22, 2026

Download files

Source Distribution

keel_llm_reliability-0.1.2.tar.gz (12.3 kB), uploaded May 23, 2026, Source

Built Distribution

keel_llm_reliability-0.1.2-py3-none-any.whl (11.0 kB), uploaded May 23, 2026, Python 3

File details

Details for the file keel_llm_reliability-0.1.2.tar.gz.

File metadata

Download URL: keel_llm_reliability-0.1.2.tar.gz
Upload date: May 23, 2026
Size: 12.3 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for keel_llm_reliability-0.1.2.tar.gz
Algorithm	Hash digest
SHA256	`f949d852b696a867e67659c9ccaee33958d2e64726977104437f05e4fcbf9267`
MD5	`ad92d57b21059ce63ccc38ed036391c0`
BLAKE2b-256	`81fd022160155aaba82e459e565a29ee45498c0b4cf35009e37925b3c265a560`
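
A downloaded file can be checked against the SHA256 digest listed above with the standard library alone. This is a minimal sketch: the digest constant is copied from the table, while `sha256_of`, `verify`, and the chunk size are illustrative choices, and the file path is whatever location you downloaded the sdist to.

```python
import hashlib
from pathlib import Path

# SHA256 for keel_llm_reliability-0.1.2.tar.gz, copied from the table above.
EXPECTED_SHA256 = "f949d852b696a867e67659c9ccaee33958d2e64726977104437f05e4fcbf9267"


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Hash the file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: Path, expected: str = EXPECTED_SHA256) -> bool:
    """Return True only if the file's SHA256 matches the expected digest."""
    return sha256_of(path) == expected.lower()
```

Calling `verify(Path("keel_llm_reliability-0.1.2.tar.gz"))` on a local copy returns True only for a byte-identical file. pip's hash-checking mode (`--require-hashes` with a pinned requirements file) performs the same comparison at install time.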

Provenance

The following attestation bundles were made for keel_llm_reliability-0.1.2.tar.gz:

Publisher: publish-py.yml on keelplatform/keel

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: keel_llm_reliability-0.1.2.tar.gz
- Subject digest: f949d852b696a867e67659c9ccaee33958d2e64726977104437f05e4fcbf9267
- Sigstore transparency entry: 1610076351
- Sigstore integration time: May 23, 2026
Source repository:
- Permalink: keelplatform/keel@2d54c903ad991fb68652520570a6571c69c477fb
- Branch / Tag: refs/tags/py-llm-reliability-v0.1.2
- Owner: https://github.com/keelplatform
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-py.yml@2d54c903ad991fb68652520570a6571c69c477fb
- Trigger Event: push

File details

Details for the file keel_llm_reliability-0.1.2-py3-none-any.whl.

File metadata

Download URL: keel_llm_reliability-0.1.2-py3-none-any.whl
Upload date: May 23, 2026
Size: 11.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for keel_llm_reliability-0.1.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`85197cc70d06e87015a858c09edf7f1bee316a2c6dd1ff21820663ab1123337a`
MD5	`2258a9e8313e403b2c346f3accb944ec`
BLAKE2b-256	`c4e5ca99b890102023667f86fa5a127adaea6ffaa92bff691e771011ecd47163`

Provenance

The following attestation bundles were made for keel_llm_reliability-0.1.2-py3-none-any.whl:

Publisher: publish-py.yml on keelplatform/keel

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: keel_llm_reliability-0.1.2-py3-none-any.whl
- Subject digest: 85197cc70d06e87015a858c09edf7f1bee316a2c6dd1ff21820663ab1123337a
- Sigstore transparency entry: 1610076654
- Sigstore integration time: May 23, 2026
Source repository:
- Permalink: keelplatform/keel@2d54c903ad991fb68652520570a6571c69c477fb
- Branch / Tag: refs/tags/py-llm-reliability-v0.1.2
- Owner: https://github.com/keelplatform
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-py.yml@2d54c903ad991fb68652520570a6571c69c477fb
- Trigger Event: push
