keel-circuit-breaker
Open/closed/half-open circuit breaker keyed by any string. Zero dependencies, async-friendly, observable. Skip a failing target during a cooldown, probe for recovery after.
Part of Keel — a portfolio of small, vendor-neutral libraries. This is the first one, extracted from 1.5+ years of production use across 5 distinct LLM provider adapters.
Why it exists
Every product that calls a flaky external service (LLM providers, third-party APIs, internal microservices) re-implements the same pattern: after N consecutive failures, stop hammering the failing target for a while; after a cooldown, let a request through to see if it recovered. keel-circuit-breaker is that pattern, battle-tested and keyed by any string — so one instance serves per-model, per-tenant, per-endpoint, or per-anything use cases.
Install
pip install keel-circuit-breaker # or: uv add keel-circuit-breaker
Zero runtime dependencies (stdlib only).
Three worked examples
1. Guard an HTTP/LLM call (manual lifecycle)
    from keel_circuit_breaker import CircuitBreaker

    breaker = CircuitBreaker(failure_threshold=3, cooldown_seconds=120.0)

    async def call_model(model_key: str, prompt: str) -> str:
        # Skip the target entirely while its circuit is open.
        if not breaker.is_available(model_key):
            raise RuntimeError(f"{model_key} circuit open; skip it")
        try:
            # some_async_http_call is a placeholder for your own client call.
            result = await some_async_http_call(prompt)
            breaker.record_success(model_key)
            return result
        except Exception:
            breaker.record_failure(model_key)  # YOU decide this is a failure
            raise
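Example 1 treats every exception as a failure, but which outcomes count is a caller-side policy. The sketch below shows one possible policy, assuming the caller can see an HTTP-style status code. The function `should_record_failure`, its parameters, and the choice to treat a missing response as a failure are hypothetical and not part of keel-circuit-breaker.

```python
from __future__ import annotations


def should_record_failure(status_code: int | None, timed_out: bool = False) -> bool:
    """Hypothetical caller-side policy; not part of keel-circuit-breaker.

    Returns True when the outcome should be reported via record_failure().
    """
    if timed_out:
        return True  # timeouts usually indicate an unhealthy target
    if status_code is None:
        return True  # assumption: no response at all counts as a failure
    if status_code == 429:
        return False  # rate-limited: the target is up, it is only throttling
    return status_code >= 500  # server errors count; 2xx-4xx do not
```

In the manual lifecycle, the error-handling branch could consult a policy like this before deciding whether to call breaker.record_failure(model_key).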
2. The call() convenience wrapper
    from keel_circuit_breaker import CircuitBreaker, CircuitOpenError

    breaker = CircuitBreaker()

    async def main() -> None:
        # await is only valid inside a coroutine, so the call lives in one.
        try:
            result = await breaker.call(some_async_http_call, key="model-x", prompt="hi")
        except CircuitOpenError as e:
            ...  # e.key tells you which key is open; fall back to another target
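Falling back to another target can be as simple as walking a preference-ordered list of keys and taking the first one whose circuit is not open. This is a minimal sketch that relies only on the documented is_available(key) method. The names `AvailabilityCheck` and `first_available` are hypothetical helpers, not part of the library.

```python
from __future__ import annotations

from typing import Iterable, Protocol


class AvailabilityCheck(Protocol):
    """Structural type for the one breaker method this helper needs."""

    def is_available(self, key: str) -> bool: ...


def first_available(breaker: AvailabilityCheck, candidates: Iterable[str]) -> str | None:
    """Return the first candidate key whose circuit is not open, else None."""
    for key in candidates:
        if breaker.is_available(key):
            return key
    return None
```

A caller might pass its models in order of preference and treat a None result as "every target is cooling down".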
3. Multi-tenant rate-isolation (keyed by tenant, not model)
    breaker = CircuitBreaker(failure_threshold=5, cooldown_seconds=60.0)

    # The key is any string; here, a tenant id. One tenant tripping the breaker
    # does not affect another tenant's availability.
    if breaker.is_available(tenant_id):
        ...
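Breaker state is kept per key, so keys that come from user input are worth bounding before they reach the breaker. One way to do that, sketched under the assumption that the set of real tenants is known up front, is to map anything unrecognised onto a single shared key. `KNOWN_TENANTS`, `OVERFLOW_KEY`, and `breaker_key` are hypothetical names used for illustration only.

```python
from __future__ import annotations

KNOWN_TENANTS = frozenset({"acme", "globex"})  # hypothetical tenant ids
OVERFLOW_KEY = "tenant:unknown"  # shared bucket for everything else


def breaker_key(tenant_id: str) -> str:
    """Map a possibly user-supplied tenant id onto a bounded set of breaker keys."""
    if tenant_id in KNOWN_TENANTS:
        return f"tenant:{tenant_id}"
    return OVERFLOW_KEY
```

The trade-off is that all unrecognised tenants share one circuit, so failures from one of them can make the shared key unavailable for the others.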
API
    CircuitBreaker(
        failure_threshold: int = 3,              # consecutive failures before opening
        cooldown_seconds: float = 120.0,         # time open before probing (half-open)
        logger: StructuredLogger | None = None,  # default: logging.getLogger("keel_circuit_breaker")
    )

    breaker.is_available(key) -> bool    # should this target be called?
    breaker.record_success(key) -> None  # fully resets the failure count
    breaker.record_failure(key) -> None  # increments; opens at threshold
    breaker.get_status(key) -> "closed" | "open" | "half_open"
    await breaker.call(fn, key, *args, **kwargs)  # convenience wrapper; raises CircuitOpenError
StructuredLogger is a small protocol — info(event, **fields) / warning(event, **fields). structlog's BoundLogger satisfies it directly (the original production logger). The default is a stdlib-logging adapter that routes fields into extra=.
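To make the shape of that protocol concrete, here is a rough sketch of what a two-method protocol and a stdlib adapter could look like. The exact signatures are assumptions inferred from the description above, and `StdlibAdapter` is a hypothetical name; the library's own definitions may differ.

```python
from __future__ import annotations

import logging
from typing import Any, Protocol


class StructuredLogger(Protocol):
    """Assumed shape: two methods taking an event name plus keyword fields."""

    def info(self, event: str, **fields: Any) -> None: ...

    def warning(self, event: str, **fields: Any) -> None: ...


class StdlibAdapter:
    """Hypothetical adapter that forwards keyword fields to logging via extra=."""

    def __init__(self, logger: logging.Logger) -> None:
        self._logger = logger

    def info(self, event: str, **fields: Any) -> None:
        # Each field becomes an attribute on the emitted LogRecord.
        self._logger.info(event, extra=fields)

    def warning(self, event: str, **fields: Any) -> None:
        self._logger.warning(event, extra=fields)
```

One caveat with this approach: stdlib logging raises KeyError when a key in extra collides with an existing LogRecord attribute such as message or asctime, so field names need to avoid those.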
Design notes (read before changing behavior)
These are deliberate, load-bearing decisions carried over from production. They look like small choices; they aren't.
- Monotonic clock, never wall-clock. Cooldowns use time.monotonic(). Wall-clock time (time.time()) can jump backward or forward under NTP adjustments or an operator running date, which would end a cooldown prematurely or stretch it into next week. Don't "simplify" to time.time().
- Permissive half-open. Once the cooldown elapses, all concurrent is_available() callers see True until one calls record_success. This is not classic single-shot half-open (which lets exactly one probe through). It's intentional: when the caller bounds concurrency by other means (e.g., per-key rate limiting), a small burst of probes gives faster recovery. If you need single-shot half-open, wrap it or open an issue; a future probe_concurrency option may add it as a non-breaking choice.
- Single-event-loop only. State is a plain dict, not lock-guarded. Designed for one event loop per process (the common async deployment). Concurrent mutation from multiple threads or event loops in the same process will corrupt state. Multi-thread safety is out of scope for 0.x.
- Success fully resets the failure count (no decay). One success forgives all accumulated failures, even at threshold-minus-one. Intentional: pick a healthy target back up immediately. A "decay over time" model would keep skipping a recovered target after transient errors.
- The state dict is unbounded by key. Not a leak when the key space is bounded and application-controlled (a fixed set of models/tenants). If you pass user-supplied keys, this is a memory-growth / DoS vector, so bound the key space yourself. LRU eviction may arrive at 1.0 if a real consumer needs it.
- The caller decides what counts as a failure. The breaker never inspects responses or status codes. You call record_failure() explicitly. This is deliberate: an HTTP 429 (rate-limited) usually should not open the circuit; a 500, a timeout, or malformed output usually should. That policy varies per provider and belongs to you, not the breaker. Don't wrap this in "smart" auto-detection.
- The defaults are tuned, not arbitrary. failure_threshold=3 skips flaky targets quickly while absorbing 1–2 transient errors. cooldown_seconds=120.0 matches typical provider rate-limit windows. They're calibrated for free-tier LLM provider behavior, so override them for your own SLAs.
- Log field names are a contract. Each state transition emits a structured record with both legacy field names (model_key, failures, cooldown_seconds; events circuit_opened / circuit_closed / circuit_half_open) and canonical namespaced fields (keel.lib.name, keel.primitive, keel.event, keel.key, keel.failure_count, keel.cooldown_seconds). Dual emission is preserved throughout the entire 0.x lifecycle; the legacy aliases are dropped at 1.0.0. Update any log-based dashboards to the keel.* fields before then.
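Several of these notes (monotonic clock, permissive half-open, full reset on success, per-key dict state) can be seen together in a small state machine. The class below is an illustrative sketch only, not the keel-circuit-breaker implementation. `SketchBreaker` and its injectable `clock` parameter are hypothetical, and because the notes above do not say what a failed probe does, the sketch assumes it restarts the cooldown.

```python
from __future__ import annotations

import time
from typing import Callable


class SketchBreaker:
    """Illustrative only; not the keel-circuit-breaker implementation."""

    def __init__(
        self,
        failure_threshold: int = 3,
        cooldown_seconds: float = 120.0,
        clock: Callable[[], float] = time.monotonic,  # injectable for tests (assumption)
    ) -> None:
        self._threshold = failure_threshold
        self._cooldown = cooldown_seconds
        self._clock = clock
        self._failures: dict[str, int] = {}  # plain dicts, no locks: one event loop only
        self._opened_at: dict[str, float] = {}

    def get_status(self, key: str) -> str:
        opened_at = self._opened_at.get(key)
        if opened_at is None:
            return "closed"
        if self._clock() - opened_at >= self._cooldown:
            return "half_open"  # permissive: every caller sees this until a success
        return "open"

    def is_available(self, key: str) -> bool:
        return self.get_status(key) != "open"

    def record_success(self, key: str) -> None:
        # One success forgives everything: full reset, no decay.
        self._failures.pop(key, None)
        self._opened_at.pop(key, None)

    def record_failure(self, key: str) -> None:
        count = self._failures.get(key, 0) + 1
        self._failures[key] = count
        if count >= self._threshold:
            # Assumption: a failure at or past the threshold (re)starts the cooldown.
            self._opened_at[key] = self._clock()
```

Passing a fake clock makes the transitions easy to exercise in tests without sleeping, and it keeps the time source in one place so that it is not swapped for time.time() by accident.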
Status
0.1.0 — first release. Keel stays in 0.x through its first year (breaking changes possible at minor bumps, always documented in the CHANGELOG; pin exact versions). Source lives in the Keel monorepo.
License
MIT — see LICENSE.
File details
Details for the file keel_circuit_breaker-0.1.0.tar.gz.
File metadata
- Download URL: keel_circuit_breaker-0.1.0.tar.gz
- Upload date:
- Size: 9.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f0e88705a3863b5424597c85ec73a69997f71d83a47738e04fcd2aea00713d34 |
| MD5 | b045d0a75e267269d0f111c11742ad52 |
| BLAKE2b-256 | 2d1cc872fe46c210df89d12c52e4dce8592507b081e4fd85270679fc45a2b566 |
Provenance
The following attestation bundles were made for keel_circuit_breaker-0.1.0.tar.gz:
Publisher: publish-py.yml on keelplatform/keel
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: keel_circuit_breaker-0.1.0.tar.gz
  - Subject digest: f0e88705a3863b5424597c85ec73a69997f71d83a47738e04fcd2aea00713d34
- Sigstore transparency entry: 1597813626
- Sigstore integration time:
- Permalink: keelplatform/keel@683132ec433bcb6a816fbcf21ba3f4c579b27209
- Branch / Tag: refs/tags/py-circuit-breaker-v0.1.0
- Owner: https://github.com/keelplatform
- Access: private
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-py.yml@683132ec433bcb6a816fbcf21ba3f4c579b27209
- Trigger Event: push
File details
Details for the file keel_circuit_breaker-0.1.0-py3-none-any.whl.
File metadata
- Download URL: keel_circuit_breaker-0.1.0-py3-none-any.whl
- Upload date:
- Size: 7.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | bf2842357d16a9c6afa33935b6a73d218b95e5345f0c10c7d57820b0f8dec5a8 |
| MD5 | 4dd00140255c81805ece592ba44c8eba |
| BLAKE2b-256 | b076666b0d3f17213c94708827d3a8c65b3dba5848f62bcd4c989e5188fef6c5 |
Provenance
The following attestation bundles were made for keel_circuit_breaker-0.1.0-py3-none-any.whl:
Publisher: publish-py.yml on keelplatform/keel
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: keel_circuit_breaker-0.1.0-py3-none-any.whl
  - Subject digest: bf2842357d16a9c6afa33935b6a73d218b95e5345f0c10c7d57820b0f8dec5a8
- Sigstore transparency entry: 1597813670
- Sigstore integration time:
- Permalink: keelplatform/keel@683132ec433bcb6a816fbcf21ba3f4c579b27209
- Branch / Tag: refs/tags/py-circuit-breaker-v0.1.0
- Owner: https://github.com/keelplatform
- Access: private
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-py.yml@683132ec433bcb6a816fbcf21ba3f4c579b27209
- Trigger Event: push