Adaptive retry SDK with persistent failure memory

resilient-sdk

Adaptive retry SDK + CLI for production Python applications.

Most retry logic is written ad hoc: fixed attempt counts, guessed backoff values, and no observability. resilient-sdk replaces that with a zero-config decorator that learns from your failure history and surfaces actionable insights through a CLI.

import openai
from resilient import retry

@retry.auto
def call_openai(prompt: str):
    return openai.chat.completions.create(...)

$ resilient report
$ resilient explain openai
$ resilient anomalies

How it works

  • @retry.auto - wraps any function (sync or async), classifies exceptions automatically, and applies exponential backoff with jitter
  • Postgres persistence - every retry event is written to resilient.events, and writes are safe when multiple pods share one database
  • CLI - queries that data and gives you plain-English reports powered by Gemini
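The backoff step in the first bullet can be sketched as follows. This is a minimal illustration of exponential backoff with "full jitter", not the SDK's actual implementation; the names backoff_delays and call_with_retry and the base and cap values are assumptions made for the example.

```python
import random
import time


def backoff_delays(attempts: int, base: float = 0.5, cap: float = 30.0, rng=random.random):
    """Yield one 'full jitter' delay per retry: uniform in [0, min(cap, base * 2**n)]."""
    for n in range(attempts):
        ceiling = min(cap, base * (2 ** n))  # exponential growth, bounded by cap
        yield rng() * ceiling


def call_with_retry(fn, attempts: int = 4, sleep=time.sleep):
    """Call fn; on any exception, sleep for a jittered delay and try again."""
    delays = list(backoff_delays(attempts - 1))
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(delays[attempt])
```

Randomising each delay spreads retries from many clients over time, which avoids synchronised retry bursts against a service that is already struggling.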

Installation

Python SDK

pip install resilient-sdk-core

Requires Python 3.10+ and a Postgres database.

CLI

Download the binary from GitHub Releases or install with Go:

go install github.com/abhishekgit03/resilient-sdk/cli@latest

Setup

1. Configure

resilient init --dsn postgresql://user:pass@host/dbname --gemini-key AIza...

This writes ~/.resilient/config.toml. The SDK reads the same file.
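As a rough idea of what reading such a file involves, the sketch below parses TOML text and checks for two settings. The key names dsn and gemini_key are assumptions inferred from the init flags; the layout that resilient init actually writes may differ, and load_config is a hypothetical helper, not part of the SDK.

```python
try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:
    import tomli as tomllib  # third-party backport with the same API, for Python 3.10

# Hypothetical file contents: the real key names may differ.
EXAMPLE_CONFIG = """
dsn = "postgresql://user:pass@host/dbname"
gemini_key = "AIza..."
"""


def load_config(text: str) -> dict:
    """Parse TOML text and check that both assumed settings are present."""
    config = tomllib.loads(text)
    missing = [key for key in ("dsn", "gemini_key") if key not in config]
    if missing:
        raise KeyError(f"missing config keys: {missing}")
    return config


config = load_config(EXAMPLE_CONFIG)
```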

2. Use the decorator

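import openai
import stripe
from resilient import retry

# Works with any external call - HTTP, DB, queue
@retry.auto
def call_stripe():
    return stripe.PaymentIntent.create(...)

# Async functions need an awaitable call, so this uses the async OpenAI client
async_client = openai.AsyncOpenAI()

@retry.auto
async def call_openai(prompt: str):
    return await async_client.chat.completions.create(...)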

3. Optional - Circuit Breaker

Pair with the circuit breaker to stop retrying a service that's fully down:

from resilient import retry
from resilient.circuit import CircuitBreaker

cb = CircuitBreaker(failure_threshold=5, recovery_timeout=30)

@retry.auto
@cb.protect
def call_openai(prompt: str):
    ...
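
For readers unfamiliar with the pattern, the sketch below shows what a circuit breaker with these two parameters typically does. It is an illustration only and not the code behind resilient.circuit.CircuitBreaker; SimpleBreaker, CircuitOpenError, and the injectable clock are hypothetical names introduced for the example.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class SimpleBreaker:
    def __init__(self, failure_threshold: int = 5, recovery_timeout: float = 30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.recovery_timeout:
                raise CircuitOpenError("circuit open; call skipped")
            # Timeout elapsed: let one trial call through (half-open state)
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open, or re-open after a failed trial
            raise
        # Success closes the circuit and clears the failure count
        self.failures = 0
        self.opened_at = None
        return result
```

With the retry decorator outermost, as in the example above, each attempt passes through the breaker, so consecutive failed attempts count toward the threshold.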

CLI Commands

Command                                         Description
resilient init --dsn <dsn> --gemini-key <key>   One-time setup
resilient report                                Failure summary, last 24h
resilient report --app openai --last 7d         Scoped report
resilient explain <service>                     AI-powered analysis
resilient explain <service> --last 7d           Scoped explanation
resilient anomalies                             Services that spiked vs yesterday
resilient top                                   Worst offenders in the last hour

Example output

$ resilient explain openai

Analysing openai (last 7d)...

OpenAI calls are failing at 4.2% over the last 7 days, up from 1.1% the week
before. Failures cluster between 14:00–16:00 UTC. The rate_limit errors suggest
you are retrying inside the same rate-limit window. Recommendation: add a 60s
cooldown after 3 consecutive 429s and consider request batching during peak hours.

Error Classification

The SDK classifies exceptions automatically - no configuration needed.

HTTP status or exception   Error type     Strategy
429                        rate_limit     Exponential backoff + jitter, 5 attempts
500/502/503/504            server_error   Backoff + jitter, 4 attempts
400/401/403/404            client_fault   No retry - fail immediately
Timeout exceptions         transient      Short jitter, 3 attempts

Works with any HTTP library (httpx, requests, aiohttp) without importing them.
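
One way to classify errors without importing an HTTP library is duck typing: look for a status attribute on the exception or on its response. The sketch below maps the table above onto that idea. It is an assumption about the approach, not the SDK's code; Policy, status_of, classify, and the "unknown" fallback are hypothetical, and only the built-in TimeoutError is treated as a timeout here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Policy:
    error_type: str
    max_attempts: int  # 1 means "do not retry"


def status_of(exc: BaseException) -> Optional[int]:
    """Find an HTTP status by duck typing, so no HTTP library has to be imported."""
    response = getattr(exc, "response", None)
    for holder in (response, exc):
        # httpx and requests expose response.status_code; aiohttp exposes exc.status
        for attr in ("status_code", "status"):
            value = getattr(holder, attr, None)
            if isinstance(value, int):
                return value
    return None


def classify(exc: BaseException) -> Policy:
    status = status_of(exc)
    if status == 429:
        return Policy("rate_limit", 5)
    if status in (500, 502, 503, 504):
        return Policy("server_error", 4)
    if status in (400, 401, 403, 404):
        return Policy("client_fault", 1)
    if isinstance(exc, TimeoutError):
        return Policy("transient", 3)
    return Policy("unknown", 1)  # assumption: unrecognised errors are not retried
```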


Database Schema

Auto-created on first run:

resilient.events  -- one row per retry attempt
resilient.stats   -- aggregated windows (populated by CLI queries)

Compatible with any existing Postgres instance. Uses a dedicated resilient schema to avoid conflicts.


Tech Stack

Layer     Technology
SDK       Python + Poetry
CLI       Go + Cobra
Storage   PostgreSQL
AI        Gemini 2.5 Flash

License

MIT
