Adaptive retry SDK with persistent failure memory
resilient-sdk
Adaptive retry SDK + CLI for production Python applications.
Most retry logic is written ad hoc: fixed attempt counts, guessed backoff values, and no observability. resilient-sdk replaces that with a zero-config decorator that learns from your failure history and surfaces actionable insights through a CLI.
import openai
from resilient import retry

@retry.auto
def call_openai(prompt: str):
    return openai.chat.completions.create(...)
$ resilient report
$ resilient explain openai
$ resilient anomalies
How it works
- @retry.auto - wraps any function (sync or async), classifies exceptions automatically, and applies exponential backoff with jitter
- Postgres persistence - every retry event is written to resilient.events, multi-pod safe
- CLI - queries that data and gives you plain-English reports powered by Gemini
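To make "exponential backoff with jitter" concrete, here is a minimal sketch of the general technique. It is not the SDK's implementation: the names backoff_delays and retry_with_backoff, the "full jitter" variant, and the base and cap defaults are all assumptions chosen for illustration.

```python
import functools
import random
import time


def backoff_delays(attempts, base=0.5, cap=30.0, rng=random.random):
    """Yield one 'full jitter' delay per retry: uniform in [0, min(cap, base * 2**n)]."""
    for n in range(attempts - 1):
        yield rng() * min(cap, base * (2 ** n))


def retry_with_backoff(attempts=4, base=0.5, cap=30.0, sleep=time.sleep):
    """Hypothetical decorator: retry on any Exception, sleeping between attempts."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            delays = backoff_delays(attempts, base, cap)
            while True:
                try:
                    return func(*args, **kwargs)
                except Exception:
                    delay = next(delays, None)
                    if delay is None:
                        raise  # attempts exhausted, surface the last error
                    sleep(delay)
        return wrapper
    return decorator
```

The jitter spreads retries from many pods across the backoff window, so they do not all hit a recovering service at the same instant.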
Installation
Python SDK
pip install resilient-sdk-core
Requires Python 3.10+ and a Postgres database.
CLI
Download the binary from GitHub Releases or install with Go:
go install github.com/abhishekgit03/resilient-sdk/cli@latest
Setup
1. Configure
resilient init --dsn postgresql://user:pass@host/dbname --gemini-key AIza...
This writes ~/.resilient/config.toml. The SDK reads the same file.
2. Use the decorator
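import openai
import stripe
from resilient import retry

# Works with any external call - HTTP, DB, queue
@retry.auto
def call_stripe():
    return stripe.PaymentIntent.create(...)

# Async calls need the async client; the module-level openai client is synchronous
client = openai.AsyncOpenAI()

@retry.auto
async def call_openai(prompt: str):
    return await client.chat.completions.create(...)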
3. Optional - Circuit Breaker
Pair with the circuit breaker to stop retrying a service that's fully down:
from resilient import retry
from resilient.circuit import CircuitBreaker

cb = CircuitBreaker(failure_threshold=5, recovery_timeout=30)

@retry.auto
@cb.protect
def call_openai(prompt: str):
    ...
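For readers unfamiliar with the pattern, the sketch below shows what a circuit breaker with these two parameters typically does. It is an illustration only, not the code in resilient.circuit: the class name SketchCircuitBreaker, the CircuitOpenError exception, and the half-open behaviour are assumptions.

```python
import functools
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open (hypothetical name)."""


class SketchCircuitBreaker:
    def __init__(self, failure_threshold=5, recovery_timeout=30, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout  # seconds
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def protect(self, func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if self.opened_at is not None:
                if self.clock() - self.opened_at < self.recovery_timeout:
                    raise CircuitOpenError(f"{func.__name__}: circuit open")
                # Timeout elapsed: let one trial call through (half-open).
            try:
                result = func(*args, **kwargs)
            except Exception:
                self.failures += 1
                if self.failures >= self.failure_threshold:
                    self.opened_at = self.clock()  # open, or re-open after a failed trial
                raise
            # Success closes the circuit and resets the failure count.
            self.failures = 0
            self.opened_at = None
            return result
        return wrapper
```

With @retry.auto as the outer decorator, as in the example above, every retry attempt passes through the breaker, so an open circuit rejects attempts immediately instead of calling the failing service.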
CLI Commands
| Command | Description |
|---|---|
| resilient init --dsn <dsn> --gemini-key <key> | One-time setup |
| resilient report | Failure summary, last 24h |
| resilient report --app openai --last 7d | Scoped report |
| resilient explain <service> | AI-powered analysis |
| resilient explain <service> --last 7d | Scoped explanation |
| resilient anomalies | Services that spiked vs yesterday |
| resilient top | Worst offenders in the last hour |
Example output
$ resilient explain openai
Analysing openai (last 7d)...
OpenAI calls are failing at 4.2% over the last 7 days, up from 1.1% the week
before. Failures cluster between 14:00–16:00 UTC. The rate_limit errors suggest
you are retrying inside the same rate-limit window. Recommendation: add a 60s
cooldown after 3 consecutive 429s and consider request batching during peak hours.
Error Classification
The SDK classifies exceptions automatically - no configuration needed.
| HTTP Status | Error Type | Strategy |
|---|---|---|
| 429 | rate_limit | Exponential backoff + jitter, 5 attempts |
| 500/502/503/504 | server_error | Backoff + jitter, 4 attempts |
| 400/401/403/404 | client_fault | No retry - fail immediately |
| Timeout exceptions | transient | Short jitter, 3 attempts |
Works with any HTTP library (httpx, requests, aiohttp) without importing them.
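One way such classification can work without importing any HTTP library is duck typing on the exception object. The sketch below mirrors the table above but is an assumption about the approach, not the SDK's code: Strategy, status_of, and classify are hypothetical names, and treating any exception whose class name contains "timeout" as transient is a guess.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strategy:
    error_type: str
    max_attempts: int  # 1 means "no retry"


RATE_LIMIT = Strategy("rate_limit", 5)
SERVER_ERROR = Strategy("server_error", 4)
CLIENT_FAULT = Strategy("client_fault", 1)
TRANSIENT = Strategy("transient", 3)


def status_of(exc):
    """Look for an integer HTTP status on the exception or its .response attribute."""
    response = getattr(exc, "response", None)
    for obj in (exc, response):
        for attr in ("status_code", "status"):
            value = getattr(obj, attr, None)
            if isinstance(value, int):
                return value
    return None


def classify(exc):
    status = status_of(exc)
    if status == 429:
        return RATE_LIMIT
    if status in (500, 502, 503, 504):
        return SERVER_ERROR
    if status in (400, 401, 403, 404):
        return CLIENT_FAULT
    if isinstance(exc, TimeoutError) or "timeout" in type(exc).__name__.lower():
        return TRANSIENT
    return None  # not covered by the table; the real SDK's behaviour here is unknown
```

This covers the common shapes: requests and httpx errors expose response.status_code, while aiohttp's ClientResponseError exposes status directly.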
Database Schema
Auto-created on first run:
resilient.events -- one row per retry attempt
resilient.stats -- aggregated windows (populated by CLI queries)
Compatible with any existing Postgres instance. Uses a dedicated resilient schema to avoid conflicts.
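To illustrate how per-attempt rows can roll up into aggregated windows, here is a small in-memory sketch. The actual column names and window sizes are not documented here, so the Event fields (service, occurred_at, succeeded) and the one-hour bucket are assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Event:
    # Hypothetical shape of one row in resilient.events
    service: str
    occurred_at: datetime
    succeeded: bool


def hourly_failure_rates(events):
    """Return {(service, hour_start): failure_rate} for the given events."""
    totals = defaultdict(lambda: [0, 0])  # key -> [failures, attempts]
    for e in events:
        hour = e.occurred_at.replace(minute=0, second=0, microsecond=0)
        bucket = totals[(e.service, hour)]
        bucket[0] += 0 if e.succeeded else 1
        bucket[1] += 1
    return {key: failures / attempts for key, (failures, attempts) in totals.items()}


if __name__ == "__main__":
    t = datetime(2025, 1, 1, 14, 5, tzinfo=timezone.utc)
    sample = [Event("openai", t, False), Event("openai", t.replace(minute=40), True)]
    print(hourly_failure_rates(sample))
```

Comparing the same window across days is the kind of query that could back a command like resilient anomalies, though how the CLI actually computes spikes is not stated.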
Tech Stack
| Layer | Technology |
|---|---|
| SDK | Python + Poetry |
| CLI | Go + Cobra |
| Storage | PostgreSQL |
| AI | Gemini 2.5 Flash |
License
MIT
Download files
File details
Details for the file resilient_sdk_core-0.1.1.tar.gz.
File metadata
- Download URL: resilient_sdk_core-0.1.1.tar.gz
- Upload date:
- Size: 6.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.4.1 CPython/3.12.13 Linux/6.17.0-1018-azure
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1b9db1c12a81040a55fac8b4abe64358e15efa8d89d4cdbb59ddf7322d236a3a |
| MD5 | df22ce5ff1e1e96e563e631d232da0fa |
| BLAKE2b-256 | abaf352bedbb50d7a77c8f8b0f96d9d019469cfa7e855742bef9c22dffb4e0cf |
File details
Details for the file resilient_sdk_core-0.1.1-py3-none-any.whl.
File metadata
- Download URL: resilient_sdk_core-0.1.1-py3-none-any.whl
- Upload date:
- Size: 8.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.4.1 CPython/3.12.13 Linux/6.17.0-1018-azure
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 12a7bb58b315a3c7fa63747e7932d7d4353efce4ff918bfe762cdc2911403c11 |
| MD5 | 92810b03961eb19242129bab2dabf401 |
| BLAKE2b-256 | 8eeef50f968f1765b5df0cdb30269300a020b8c7d81e5a5e872f09b5ece2fda3 |