keel-circuit-breaker

Open/closed/half-open circuit breaker keyed by any string. Zero dependencies, async-friendly, observable. Skip a failing target during a cooldown, probe for recovery after.

Part of Keel — a portfolio of small, vendor-neutral libraries. This is the first one, extracted from 1.5+ years of production use across 5 distinct LLM provider adapters.

Why it exists

Every product that calls a flaky external service (LLM providers, third-party APIs, internal microservices) re-implements the same pattern: after N consecutive failures, stop hammering the failing target for a while; after a cooldown, let a request through to see if it recovered. keel-circuit-breaker is that pattern, battle-tested and keyed by any string — so one instance serves per-model, per-tenant, per-endpoint, or per-anything use cases.

Install

pip install keel-circuit-breaker     # or: uv add keel-circuit-breaker

Zero runtime dependencies (stdlib only).

Three worked examples

1. Guard an HTTP/LLM call (manual lifecycle)

from keel_circuit_breaker import CircuitBreaker

breaker = CircuitBreaker(failure_threshold=3, cooldown_seconds=120.0)

async def call_model(model_key: str, prompt: str) -> str:
    if not breaker.is_available(model_key):
        raise RuntimeError(f"{model_key} circuit open — skip it")
    try:
        result = await some_async_http_call(prompt)
        breaker.record_success(model_key)
        return result
    except Exception:
        breaker.record_failure(model_key)   # YOU decide this is a failure
        raise
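
The except branch above is where your failure policy lives. As a minimal sketch of one such policy, assuming you have an HTTP status code (or a timeout) in hand: should_count_as_failure is a hypothetical helper, not part of keel-circuit-breaker, and the rules are illustrative rather than recommended for every provider.

```python
from typing import Optional


def should_count_as_failure(status_code: Optional[int], timed_out: bool = False) -> bool:
    """Hypothetical policy helper: should this outcome be passed to record_failure()?"""
    if timed_out:
        return True            # no response in time: treat the target as failing
    if status_code is None:
        return True            # connection error, no status code available
    if status_code == 429:
        return False           # rate-limited: the target is up, we are just too fast
    return status_code >= 500  # server errors count; other responses do not


print(should_count_as_failure(500))  # True
print(should_count_as_failure(429))  # False
```

With a helper like this, the caller would invoke breaker.record_failure(model_key) only when it returns True, and leave the failure count untouched otherwise.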

2. The call() convenience wrapper

from keel_circuit_breaker import CircuitBreaker, CircuitOpenError

breaker = CircuitBreaker()

async def main() -> None:
    # await is only valid inside a coroutine, so the call lives in an async function
    try:
        result = await breaker.call(some_async_http_call, key="model-x", prompt="hi")
    except CircuitOpenError as e:
        ...  # e.key tells you which key is open; fall back to another target
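
Falling back to another target usually means picking the next key whose circuit is not open. The sketch below shows one way to do that selection; first_available is a hypothetical helper (not part of the library), and a lambda stands in for breaker.is_available so the block runs on its own.

```python
from typing import Callable, Iterable, Optional


def first_available(
    is_available: Callable[[str], bool], candidates: Iterable[str]
) -> Optional[str]:
    """Return the first candidate key that may be called, or None if all are open."""
    for key in candidates:
        if is_available(key):
            return key
    return None


# Stand-in for breaker.is_available: pretend "model-x" is currently open.
open_keys = {"model-x"}
chosen = first_available(lambda k: k not in open_keys, ["model-x", "model-y", "model-z"])
print(chosen)  # model-y
```

In real code you would pass breaker.is_available itself as the first argument and order the candidates by preference.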

3. Multi-tenant rate-isolation (keyed by tenant, not model)

breaker = CircuitBreaker(failure_threshold=5, cooldown_seconds=60.0)

# The key is any string — here, a tenant id. One tenant tripping the breaker
# does not affect another tenant's availability.
if breaker.is_available(tenant_id):
    ...
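
When keys come from outside your application (tenant ids, user ids), it may be worth bounding the key space before it reaches the breaker. A minimal sketch, assuming a known set of tenants; make_key_bounder and the "__other__" bucket name are hypothetical, not library features.

```python
from typing import Callable, Iterable


def make_key_bounder(
    known_keys: Iterable[str], overflow_key: str = "__other__"
) -> Callable[[str], str]:
    """Return a function that maps arbitrary input onto a fixed, bounded key set."""
    allowed = frozenset(known_keys)

    def bound(raw_key: str) -> str:
        # Unknown keys share one overflow bucket, so the number of distinct keys
        # can never exceed len(allowed) + 1, whatever callers send.
        return raw_key if raw_key in allowed else overflow_key

    return bound


bound_key = make_key_bounder({"tenant-a", "tenant-b"})
print(bound_key("tenant-a"))      # tenant-a
print(bound_key("attacker-123"))  # __other__
```

The trade-off is that every unknown key shares one circuit, so a failure burst from one unknown tenant also affects the others in that bucket.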

API

CircuitBreaker(
    failure_threshold: int = 3,             # consecutive failures before opening
    cooldown_seconds: float = 120.0,        # time open before probing (half-open)
    logger: StructuredLogger | None = None, # default: logging.getLogger("keel_circuit_breaker")
)

breaker.is_available(key) -> bool           # should this target be called?
breaker.record_success(key) -> None         # fully resets the failure count
breaker.record_failure(key) -> None         # increments; opens at threshold
breaker.get_status(key) -> "closed" | "open" | "half_open"
await breaker.call(fn, key, *args, **kwargs) # convenience wrapper; raises CircuitOpenError

StructuredLogger is a small protocol — info(event, **fields) / warning(event, **fields). structlog's BoundLogger satisfies it directly (the original production logger). The default is a stdlib-logging adapter that routes fields into extra=.
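
Any object with those two methods should fit. The sketch below is an in-memory logger that can be handy in tests. StructuredLoggerLike is a local mirror of the protocol shape described above, written here as an assumption rather than imported from the package, and ListLogger is a hypothetical name.

```python
from typing import Any, Dict, List, Protocol, Tuple


class StructuredLoggerLike(Protocol):
    """Local mirror of the described protocol shape (an assumption, not an import)."""

    def info(self, event: str, **fields: Any) -> None: ...

    def warning(self, event: str, **fields: Any) -> None: ...


class ListLogger:
    """Collects (level, event, fields) tuples in memory instead of emitting them."""

    def __init__(self) -> None:
        self.records: List[Tuple[str, str, Dict[str, Any]]] = []

    def info(self, event: str, **fields: Any) -> None:
        self.records.append(("info", event, fields))

    def warning(self, event: str, **fields: Any) -> None:
        self.records.append(("warning", event, fields))


logger = ListLogger()
typed: StructuredLoggerLike = logger  # structural typing: no inheritance needed
typed.warning("circuit_opened", model_key="model-x", failures=3)
print(logger.records)
```

Such an object would be passed as CircuitBreaker(logger=...). Which level the breaker uses for each event is not stated above, so a test should match on the event name rather than the level.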

Design notes (read before changing behavior)

These are deliberate, load-bearing decisions carried over from production. They look like small choices; they aren't.

  1. Monotonic clock, never wall-clock. Cooldowns use time.monotonic(). Wall-clock time (time.time()) can jump forward or backward under NTP corrections or an operator running date: a forward jump would end a cooldown prematurely, and a backward jump could stretch it into next week. Don't "simplify" to time.time().

  2. Permissive half-open. Once the cooldown elapses, all concurrent is_available() callers see True until one calls record_success. This is not classic single-shot half-open (which lets exactly one probe through). It's intentional: when the caller bounds concurrency by other means (e.g., per-key rate limiting), a small burst of probes gives faster recovery. If you need single-shot half-open, wrap it or open an issue — a future probe_concurrency option may add it as a non-breaking choice.

  3. Single-event-loop only. State is a plain dict, not lock-guarded. The breaker is designed for one event loop per process (the common async deployment). Concurrent mutation from multiple threads, or from event loops running in different threads of the same process, can race and corrupt state. Multi-thread safety is out of scope for 0.x.

  4. Success fully resets the failure count (no decay). One success forgives all accumulated failures, even at threshold-minus-one. Intentional: pick a healthy target back up immediately. A "decay over time" model would keep skipping a recovered target after transient errors.

  5. The state dict is unbounded by key. Not a leak when the key space is bounded and application-controlled (a fixed set of models/tenants). If you pass user-supplied keys, this is a memory-growth / DoS vector — bound the key space yourself. LRU eviction may arrive at 1.0 if a real consumer needs it.

  6. The caller decides what counts as a failure. The breaker never inspects responses or status codes. You call record_failure() explicitly. This is deliberate: an HTTP 429 (rate-limited) usually should not open the circuit; a 500, a timeout, or malformed output usually should. That policy varies per provider and belongs to you, not the breaker. Don't wrap this in "smart" auto-detection.

  7. The defaults are tuned, not arbitrary. failure_threshold=3 skips flaky targets quickly while absorbing 1–2 transient errors. cooldown_seconds=120.0 matches typical provider rate-limit windows. They're calibrated for free-tier LLM provider behavior — override them for your own SLAs.

  8. Log field names are a contract. Each state transition emits a structured record with both legacy field names (model_key, failures, cooldown_seconds; events circuit_opened / circuit_closed / circuit_half_open) and canonical namespaced fields (keel.lib.name, keel.primitive, keel.event, keel.key, keel.failure_count, keel.cooldown_seconds). Dual emission is preserved throughout the entire 0.x lifecycle; the legacy aliases are dropped at 1.0.0. Update any log-based dashboards to the keel.* fields before then.
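
Notes 1, 2, and 4 can be seen together in a toy model. This is an illustrative sketch only, not the keel-circuit-breaker implementation: ToyBreaker, its injectable clock parameter, and the choice to restart the cooldown on a failure at or past the threshold are all assumptions made for the example.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class ToyBreaker:
    """Illustrative model only; NOT the keel-circuit-breaker implementation."""

    def __init__(
        self,
        failure_threshold: int = 3,
        cooldown_seconds: float = 120.0,
        clock: Callable[[], float] = time.monotonic,  # monotonic by default (note 1)
    ) -> None:
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self._clock = clock
        # key -> (consecutive failures, clock reading when the circuit opened)
        self._state: Dict[str, Tuple[int, Optional[float]]] = {}

    def get_status(self, key: str) -> str:
        _, opened_at = self._state.get(key, (0, None))
        if opened_at is None:
            return "closed"
        if self._clock() - opened_at >= self.cooldown_seconds:
            return "half_open"
        return "open"

    def is_available(self, key: str) -> bool:
        # Permissive half-open (note 2): every caller passes once the cooldown ends.
        return self.get_status(key) != "open"

    def record_success(self, key: str) -> None:
        self._state.pop(key, None)  # full reset, no decay (note 4)

    def record_failure(self, key: str) -> None:
        failures, opened_at = self._state.get(key, (0, None))
        failures += 1
        if failures >= self.failure_threshold:
            opened_at = self._clock()  # assumed: (re)start the cooldown
        self._state[key] = (failures, opened_at)


# Drive the model with a fake clock so the example needs no real waiting.
now = [0.0]
toy = ToyBreaker(failure_threshold=3, cooldown_seconds=120.0, clock=lambda: now[0])
toy.record_failure("m")
toy.record_failure("m")
toy.record_success("m")            # forgives both failures
for _ in range(3):
    toy.record_failure("m")
status_open = toy.get_status("m")  # "open"
now[0] = 120.0                     # cooldown elapsed
status_half = toy.get_status("m")  # "half_open"
probes = [toy.is_available("m") for _ in range(3)]  # all True: permissive
toy.record_success("m")
status_closed = toy.get_status("m")  # "closed"
print(status_open, status_half, probes, status_closed)
```

Injecting the clock is a testing convenience of this sketch; the design point it preserves is that the default source is time.monotonic(), never time.time().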

Status

0.1.0 — first release. Keel stays in 0.x through its first year (breaking changes possible at minor bumps, always documented in the CHANGELOG; pin exact versions). Source lives in the Keel monorepo.

License

MIT — see LICENSE.

