Adaptive retry SDK with persistent failure memory

resilient-sdk

Adaptive retry SDK + CLI for production Python applications.

Most retry logic is written ad hoc: fixed attempt counts, guessed backoff values, and no observability. resilient-sdk replaces that with a zero-config decorator that learns from your failure history and surfaces actionable insights via a CLI.

import openai
from resilient import retry

@retry.auto
def call_openai(prompt: str):
    return openai.chat.completions.create(...)

$ resilient report
$ resilient explain openai
$ resilient anomalies

How it works

  • @retry.auto - wraps any function (sync or async), classifies exceptions automatically, and applies exponential backoff with jitter
  • Postgres persistence - every retry event is written to resilient.events, and multiple pods can safely write to the same database
  • CLI - queries that data and gives you plain-English reports powered by Gemini
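
To make "exponential backoff with jitter" concrete, here is a minimal sketch of one common variant (capped exponential backoff with full jitter). It is not taken from the SDK's source; the function name backoff_delays and the base and cap defaults are assumptions chosen for illustration.

```python
import random


def backoff_delays(attempts: int, base: float = 0.5, cap: float = 30.0, rng=None):
    """Yield one sleep duration per retry (hypothetical helper, not the SDK's API)."""
    rng = rng or random.Random()
    for attempt in range(attempts):
        # The upper bound doubles on each attempt and is capped so delays stay bounded.
        ceiling = min(cap, base * (2 ** attempt))
        # Full jitter: draw uniformly from [0, ceiling] so concurrent clients spread out.
        yield rng.uniform(0.0, ceiling)


if __name__ == "__main__":
    for i, delay in enumerate(backoff_delays(5), start=1):
        print(f"retry {i}: sleep up to {delay:.2f}s")
```

Randomising the delay matters most when many workers fail at the same moment: without jitter they would all retry in lockstep and hit the recovering service together.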

Installation

Python SDK

pip install resilient-sdk-core

Requires Python 3.10+ and a Postgres database.

CLI

Download the binary from GitHub Releases or install with Go:

go install github.com/abhishekgit03/resilient-sdk/cli@latest

Setup

1. Configure

resilient init --dsn postgresql://user:pass@host/dbname --gemini-key AIza...

This writes ~/.resilient/config.toml. The SDK reads the same file.

2. Use the decorator

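import stripe
from openai import AsyncOpenAI
from resilient import retry

client = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment

# Works with any external call - HTTP, DB, queue
@retry.auto
def call_stripe():
    return stripe.PaymentIntent.create(...)

@retry.auto
async def call_openai(prompt: str):
    # Awaiting requires the async client; the module-level openai client is synchronous
    return await client.chat.completions.create(...)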
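
One way a single decorator can support both sync and async functions is to inspect the wrapped callable and return a matching wrapper. The sketch below illustrates that pattern only; simple_retry and its attempts and delay parameters are hypothetical and do not describe how @retry.auto is implemented.

```python
import asyncio
import functools
import inspect
import time


def simple_retry(attempts: int = 3, delay: float = 0.0):
    """Hypothetical decorator that retries sync and async callables alike."""
    def decorate(func):
        if inspect.iscoroutinefunction(func):
            @functools.wraps(func)
            async def async_wrapper(*args, **kwargs):
                for attempt in range(1, attempts + 1):
                    try:
                        return await func(*args, **kwargs)
                    except Exception:
                        if attempt == attempts:
                            raise  # out of attempts: surface the last error
                        await asyncio.sleep(delay)  # never block the event loop
            return async_wrapper

        @functools.wraps(func)
        def sync_wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == attempts:
                        raise
                    time.sleep(delay)
        return sync_wrapper
    return decorate
```

The async branch uses asyncio.sleep rather than time.sleep so that waiting between attempts does not stall other coroutines.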

3. Optional - Circuit Breaker

Pair with the circuit breaker to stop retrying a service that's fully down:

from resilient import retry
from resilient.circuit import CircuitBreaker

cb = CircuitBreaker(failure_threshold=5, recovery_timeout=30)

@retry.auto
@cb.protect
def call_openai(prompt: str):
    ...
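
For readers unfamiliar with the pattern, the sketch below shows what failure_threshold and recovery_timeout usually mean in a circuit breaker. It is a generic illustration, not the code behind resilient.circuit.CircuitBreaker; the SimpleBreaker and CircuitOpenError names and the injectable clock are assumptions.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class SimpleBreaker:
    """Hypothetical minimal breaker: closed -> open -> half-open -> closed."""

    def __init__(self, failure_threshold=5, recovery_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.recovery_timeout:
                raise CircuitOpenError("circuit open; call skipped")
            # Timeout elapsed: fall through and allow one trial call (half-open).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # A failed trial call, or reaching the threshold, (re)opens the circuit.
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Any success closes the circuit and resets the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

With the decorator order shown in the example above, the retry wrapper is outermost, so every retry attempt passes through the breaker before reaching the service.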

CLI Commands

| Command | Description |
| --- | --- |
| resilient init --dsn <dsn> --gemini-key <key> | One-time setup |
| resilient report | Failure summary, last 24h |
| resilient report --app openai --last 7d | Scoped report |
| resilient explain <service> | AI-powered analysis |
| resilient explain <service> --last 7d | Scoped explanation |
| resilient anomalies | Services that spiked vs yesterday |
| resilient top | Worst offenders in the last hour |

Example output

$ resilient explain openai

Analysing openai (last 7d)...

OpenAI calls are failing at 4.2% over the last 7 days, up from 1.1% the week
before. Failures cluster between 14:00–16:00 UTC. The rate_limit errors suggest
you are retrying inside the same rate-limit window. Recommendation: add a 60s
cooldown after 3 consecutive 429s and consider request batching during peak hours.

Error Classification

The SDK classifies exceptions automatically - no configuration needed.

| HTTP status or exception | Error type | Strategy |
| --- | --- | --- |
| 429 | rate_limit | Exponential backoff + jitter, 5 attempts |
| 500/502/503/504 | server_error | Backoff + jitter, 4 attempts |
| 400/401/403/404 | client_fault | No retry - fail immediately |
| Timeout exceptions | transient | Short jitter, 3 attempts |

Works with any HTTP library (such as httpx, requests, or aiohttp); the SDK itself does not import them.
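
Classifying errors without importing an HTTP library is typically done by duck typing: looking for a status code on the exception or on an attached response object. The sketch below applies the classification table that way. The function names, the attribute lookup order, and the "unknown" fallback are assumptions, not the SDK's actual logic.

```python
def status_of(exc):
    """Best-effort status code lookup without importing any HTTP library."""
    # requests and httpx errors carry a response object with .status_code;
    # aiohttp's ClientResponseError exposes .status on the exception itself.
    response = getattr(exc, "response", None)
    for obj, attr in ((response, "status_code"), (exc, "status_code"), (exc, "status")):
        code = getattr(obj, attr, None)
        if isinstance(code, int):
            return code
    return None


def classify(exc):
    """Return (error_type, attempts) following the classification table."""
    code = status_of(exc)
    if code == 429:
        return "rate_limit", 5
    if code in (500, 502, 503, 504):
        return "server_error", 4
    if code in (400, 401, 403, 404):
        return "client_fault", 1  # a single attempt, i.e. no retry
    # Match the built-in TimeoutError and library classes such as ReadTimeout by name.
    if isinstance(exc, TimeoutError) or "timeout" in type(exc).__name__.lower():
        return "transient", 3
    return "unknown", 1  # assumption: unrecognised errors are not retried
```

Matching on attribute names keeps the classifier free of hard dependencies, at the cost of relying on each library's naming conventions staying stable.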


Database Schema

Auto-created on first run:

resilient.events  -- one row per retry attempt
resilient.stats   -- aggregated windows (populated by CLI queries)

Compatible with any existing Postgres instance. Uses a dedicated resilient schema to avoid conflicts.


Tech Stack

| Layer | Technology |
| --- | --- |
| SDK | Python + Poetry |
| CLI | Go + Cobra |
| Storage | PostgreSQL |
| AI | Gemini 2.5 Flash |

License

MIT
