Exponential backoff with full jitter for LLM API calls. Sync + async. Built-in retryable-code presets for Anthropic, OpenAI, Bedrock, Gemini. Zero runtime deps.
llm-retry-py
Most retry libraries are framework middleware or async-only. This one is
a small function you wrap around any callable. Sync and async loops
share the same RetryPolicy. Built-in retryable-code sets for Anthropic,
OpenAI, AWS Bedrock, and Google Gemini ship in llm_retry.predicates.
Sibling to the Rust crate llm-retry.
Install
pip install llm-retry-py
Use
Sync:
from llm_retry import retry, RetryPolicy, predicates

def call_anthropic():
    return client.messages.create(model="claude-sonnet-4-5", ...)

resp = retry(
    call_anthropic,
    policy=RetryPolicy(),
    should_retry=lambda e: predicates.is_anthropic_retryable(str(e)),
)
Async (works with any asyncio code, no runtime lock-in beyond asyncio.sleep):
from llm_retry import retry_async, RetryPolicy, predicates

async def call_openai():
    return await client.chat.completions.create(...)

resp = await retry_async(
    call_openai,
    policy=RetryPolicy(),
    should_retry=lambda e: predicates.is_openai_retryable(str(e)),
)
Custom predicate (BYO matcher):
import httpx
from llm_retry import retry, RetryPolicy, predicates

def should_retry(e: Exception) -> bool:
    # Timeouts are transient: always retry.
    if isinstance(e, httpx.TimeoutException):
        return True
    # For HTTP errors, defer to the status-code preset.
    if isinstance(e, httpx.HTTPStatusError):
        return predicates.is_http_status_retryable(e.response.status_code)
    return False

resp = retry(call, policy=RetryPolicy(), should_retry=should_retry)
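The same pattern works without httpx. The sketch below is a stdlib-only predicate, included to show that any function from exception to bool fits the `should_retry` slot. `ApiStatusError` and `TRANSIENT_STATUSES` are hypothetical names introduced for illustration, and the status set is an assumption, not the list behind `is_http_status_retryable`.

```python
class ApiStatusError(Exception):
    """Hypothetical error type that carries an HTTP status code."""

    def __init__(self, status_code):
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


# Assumed set of transient statuses; not taken from llm_retry.predicates.
TRANSIENT_STATUSES = frozenset({408, 429, 500, 502, 503, 504})


def should_retry_stdlib(exc):
    # Built-in timeout and connection errors are treated as transient.
    if isinstance(exc, (TimeoutError, ConnectionError)):
        return True
    # Status-bearing errors are retried only for the assumed transient codes.
    if isinstance(exc, ApiStatusError):
        return exc.status_code in TRANSIENT_STATUSES
    return False
```

A predicate like this could be passed as `should_retry=should_retry_stdlib` in the same way as the httpx version.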
RetryPolicy
Defaults: 6 attempts, 500ms base delay, 30s cap, full jitter.
from llm_retry import RetryPolicy, Jitter

policy = RetryPolicy(
    max_attempts=8,
    base_delay_ms=250,
    max_delay_ms=60_000,
    jitter=Jitter.FULL,  # FULL (default, AWS-recommended), EQUAL, or NONE
)
Backoff for attempt i (0-indexed) is capped = min(base_delay_ms * 2**i, max_delay_ms), then jittered.
| Jitter | Resulting delay |
|---|---|
| NONE | capped |
| EQUAL | capped/2 + uniform(0, capped/2) |
| FULL | uniform(0, capped) (recommended) |
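The formula and table above can be sketched in a few lines. This is a standalone illustration under the documented defaults (500 ms base, 30 s cap), not the library's own code; `backoff_delay_ms` and its string-valued `jitter` argument are hypothetical names used only here.

```python
import random


def backoff_delay_ms(attempt, base_delay_ms=500, max_delay_ms=30_000,
                     jitter="full", rng=random):
    """Illustrative delay in milliseconds for a 0-indexed attempt."""
    # Exponential growth, clamped to the cap.
    capped = min(base_delay_ms * 2 ** attempt, max_delay_ms)
    if jitter == "none":
        return float(capped)
    if jitter == "equal":
        # Half fixed, half random.
        return capped / 2 + rng.uniform(0, capped / 2)
    if jitter == "full":
        # Anywhere between zero and the capped value.
        return rng.uniform(0, capped)
    raise ValueError(f"unknown jitter mode: {jitter!r}")


# Un-jittered schedule for the first eight attempts, in milliseconds.
schedule = [backoff_delay_ms(i, jitter="none") for i in range(8)]
# schedule == [500.0, 1000.0, 2000.0, 4000.0, 8000.0, 16000.0, 30000.0, 30000.0]
```

The cap takes effect at attempt 6, where 500 * 2**6 = 32000 exceeds 30000. With full jitter each of those values becomes an upper bound rather than the delay itself.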
Presets
from llm_retry import predicates
predicates.is_anthropic_retryable("rate_limit_error") # True
predicates.is_openai_retryable("server_error") # True
predicates.is_bedrock_retryable("ThrottlingException") # True
predicates.is_gemini_retryable("RESOURCE_EXHAUSTED") # True
predicates.is_http_status_retryable(503) # True
The underlying lists are public constants you can extend:
from llm_retry.predicates import ANTHROPIC_RETRYABLE, contains_any
my_codes = ANTHROPIC_RETRYABLE + ("my_custom_transient_code",)
should_retry = lambda e: contains_any(str(e), my_codes)
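To make the extension pattern concrete, here is a minimal sketch that assumes `contains_any` performs a plain substring check against each code. That behavior is an assumption, as is the content of `BASE_CODES`, which stands in for `ANTHROPIC_RETRYABLE` and lists only a code shown in the examples above.

```python
def contains_any_sketch(text, needles):
    """True if any needle occurs as a substring of text (assumed behavior)."""
    return any(needle in text for needle in needles)


# Hypothetical stand-in for ANTHROPIC_RETRYABLE; the real tuple is not reproduced here.
BASE_CODES = ("rate_limit_error",)
MY_CODES = BASE_CODES + ("my_custom_transient_code",)


def should_retry(exc):
    # Match on the stringified exception, as the README examples do.
    return contains_any_sketch(str(exc), MY_CODES)
```

Because the match runs on `str(e)`, an error message that merely mentions a code would also count as retryable; a stricter predicate would inspect a structured error field instead.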
Error type
A retry that does not succeed raises RetryExhausted:
from llm_retry import retry, RetryPolicy, RetryExhausted

try:
    resp = retry(call, policy=RetryPolicy(), should_retry=lambda e: True)
except RetryExhausted as exc:
    exc.attempts    # how many tries ran
    exc.last_error  # the final exception raised by `call`
If the predicate returns False for the first error, that error is re-raised unchanged (no wrapping).
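The two outcomes described above can be illustrated with a small standalone loop. This is a sketch of the described behavior, not the library's implementation: `retry_sketch` and `ExhaustedSketch` are hypothetical names, and it assumes a non-retryable error is re-raised on any attempt and that no sleep follows the final attempt.

```python
import random
import time


class ExhaustedSketch(Exception):
    """Hypothetical stand-in for RetryExhausted with the same two fields."""

    def __init__(self, attempts, last_error):
        super().__init__(f"gave up after {attempts} attempts: {last_error!r}")
        self.attempts = attempts
        self.last_error = last_error


def retry_sketch(fn, should_retry, max_attempts=6, base_delay_ms=500,
                 max_delay_ms=30_000, sleep=time.sleep):
    last_error = None
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            if not should_retry(exc):
                raise  # non-retryable: propagate unchanged, no wrapping
            last_error = exc
            if attempt + 1 < max_attempts:
                # Full jitter: sleep a random time up to the capped backoff.
                capped = min(base_delay_ms * 2 ** attempt, max_delay_ms)
                sleep(random.uniform(0, capped) / 1000)
    raise ExhaustedSketch(max_attempts, last_error)


def always_fails():
    raise ConnectionError("boom")


# Retryable error on every attempt: ends in the exhausted error.
try:
    retry_sketch(always_fails, lambda e: True, max_attempts=3, sleep=lambda s: None)
except ExhaustedSketch as exc:
    outcome = (exc.attempts, type(exc.last_error).__name__)

# Non-retryable error: the original exception comes straight through.
try:
    retry_sketch(always_fails, lambda e: False, sleep=lambda s: None)
except ConnectionError as exc:
    passthrough = str(exc)
```

Passing `sleep=lambda s: None` keeps the demonstration instant; it is a convenience of this sketch and not a documented parameter of `retry`.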
What it does NOT do
- No HTTP client. Wrap any callable that raises.
- No circuit breaker. Layer one on top if you want.
- No deadline (stop after N seconds total). Combine with your own asyncio.wait_for or signal.alarm.
- No structured logging hooks. Add them in your callable.
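One way to add the missing deadline on the async side is to wrap the whole retry loop in `asyncio.wait_for`. The sketch below uses a stand-in coroutine, `slow_retry_loop`, in place of `await retry_async(...)` so it runs without the library; `call_with_deadline` is a hypothetical helper name.

```python
import asyncio


async def slow_retry_loop():
    """Stand-in for `await retry_async(...)`; sleeps far past the deadline."""
    await asyncio.sleep(10)
    return "response"


async def call_with_deadline(coro_fn, total_seconds):
    """Cancel the wrapped coroutine once total_seconds have elapsed."""
    try:
        return await asyncio.wait_for(coro_fn(), timeout=total_seconds)
    except asyncio.TimeoutError:
        return None  # the caller decides what a missed deadline means


result = asyncio.run(call_with_deadline(slow_retry_loop, total_seconds=0.05))
# result is None because the deadline fired before the loop finished
```

`asyncio.wait_for` cancels the inner coroutine on timeout, so any backoff sleep in progress is interrupted rather than left running.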
License
MIT