Exponential backoff with full jitter for LLM API calls. Sync + async. Built-in retryable-code presets for Anthropic, OpenAI, Bedrock, Gemini. Zero runtime deps.

llm-retry-py

Most retry libraries are framework middleware or async-only. This one is a small function you wrap around any callable. Sync and async loops share the same RetryPolicy. Built-in retryable-code sets for Anthropic, OpenAI, AWS Bedrock, and Google Gemini ship in llm_retry.predicates.

Sibling to the Rust crate llm-retry.

Install

pip install llm-retry-py

Use

Sync:

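from anthropic import Anthropic  # assumes the official Anthropic SDK
from llm_retry import retry, RetryPolicy, predicates

client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def call_anthropic():
    return client.messages.create(model="claude-sonnet-4-5", ...)

# Retry with the default policy; match on the error text.
resp = retry(
    call_anthropic,
    policy=RetryPolicy(),
    should_retry=lambda e: predicates.is_anthropic_retryable(str(e)),
)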

Async (works with any asyncio code, no runtime lock-in beyond asyncio.sleep):

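from openai import AsyncOpenAI  # assumes the official OpenAI SDK
from llm_retry import retry_async, RetryPolicy, predicates

client = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment

async def call_openai():
    return await client.chat.completions.create(...)

# Must run inside a coroutine (or an async REPL).
resp = await retry_async(
    call_openai,
    policy=RetryPolicy(),
    should_retry=lambda e: predicates.is_openai_retryable(str(e)),
)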

Custom predicate (BYO matcher):

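import httpx
from llm_retry import retry, RetryPolicy, predicates

def should_retry(e: Exception) -> bool:
    # Timeouts are always worth another attempt.
    if isinstance(e, httpx.TimeoutException):
        return True
    # For HTTP errors, defer to the status-code preset.
    if isinstance(e, httpx.HTTPStatusError):
        return predicates.is_http_status_retryable(e.response.status_code)
    return False

# `call` is any zero-argument callable that raises on failure.
resp = retry(call, policy=RetryPolicy(), should_retry=should_retry)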

RetryPolicy

Defaults: 6 attempts, 500ms base delay, 30s cap, full jitter.

from llm_retry import RetryPolicy, Jitter

policy = RetryPolicy(
    max_attempts=8,
    base_delay_ms=250,
    max_delay_ms=60_000,
    jitter=Jitter.FULL,  # FULL (default, AWS-recommended), EQUAL, or NONE
)

Backoff for attempt i (0-indexed) is capped = min(base_delay_ms * 2**i, max_delay_ms). Jitter is then applied to capped:

Jitter	Resulting delay
`NONE`	`capped`
`EQUAL`	`capped/2 + uniform(0, capped/2)`
`FULL`	`uniform(0, capped)` (recommended)
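
The table above can be sketched in a few lines of plain Python. This is an illustration of the formulas, not the library's source: backoff_delay_ms and its rng parameter are hypothetical names, and Jitter is redefined locally so the snippet runs without llm_retry installed.

```python
import random
from enum import Enum


class Jitter(Enum):
    """Local stand-in for llm_retry.Jitter (assumption, for this sketch only)."""
    NONE = "none"
    EQUAL = "equal"
    FULL = "full"


def backoff_delay_ms(attempt, base_delay_ms=500, max_delay_ms=30_000,
                     jitter=Jitter.FULL, rng=random):
    """Delay in milliseconds to wait after the 0-indexed `attempt` fails."""
    # Exponential growth, clamped to the cap.
    capped = min(base_delay_ms * 2 ** attempt, max_delay_ms)
    if jitter is Jitter.NONE:
        return float(capped)
    if jitter is Jitter.EQUAL:
        # Keep half of the delay fixed and randomize the other half.
        return capped / 2 + rng.uniform(0, capped / 2)
    # FULL: anywhere between zero and the capped delay.
    return rng.uniform(0, capped)
```

With the documented defaults (500 ms base, 30 s cap) and Jitter.NONE, attempts 0 through 5 wait 500, 1000, 2000, 4000, 8000, and 16000 ms; from attempt 6 onward the 30000 ms cap applies.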

Presets

from llm_retry import predicates

predicates.is_anthropic_retryable("rate_limit_error")     # True
predicates.is_openai_retryable("server_error")            # True
predicates.is_bedrock_retryable("ThrottlingException")    # True
predicates.is_gemini_retryable("RESOURCE_EXHAUSTED")      # True
predicates.is_http_status_retryable(503)                  # True

The underlying lists are public constants you can extend:

from llm_retry.predicates import ANTHROPIC_RETRYABLE, contains_any

my_codes = ANTHROPIC_RETRYABLE + ("my_custom_transient_code",)
should_retry = lambda e: contains_any(str(e), my_codes)
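
To make the matching behavior concrete, here is a rough sketch of what a substring matcher like contains_any could look like. It assumes plain, case-sensitive substring matching, which may differ from the library's actual implementation, and EXAMPLE_RETRYABLE is a hypothetical tuple rather than the real ANTHROPIC_RETRYABLE constant.

```python
def contains_any(text, codes):
    """Return True if any entry of `codes` occurs as a substring of `text`."""
    return any(code in text for code in codes)


# Hypothetical code list for illustration only (not the library's constant).
EXAMPLE_RETRYABLE = ("rate_limit_error",)
my_codes = EXAMPLE_RETRYABLE + ("my_custom_transient_code",)


def should_retry(e):
    """Retry when the stringified exception mentions a known transient code."""
    return contains_any(str(e), my_codes)
```

Matching on str(e) keeps the predicate independent of any particular SDK's exception classes, at the cost of being sensitive to how the error message is worded.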

Error type

A retry that does not succeed raises RetryExhausted:

from llm_retry import retry, RetryPolicy, RetryExhausted

try:
    # `call` is any zero-argument callable that raises on failure.
    resp = retry(call, policy=RetryPolicy(), should_retry=lambda e: True)
except RetryExhausted as exc:
    exc.attempts        # how many tries ran
    exc.last_error      # the final exception raised by `call`

If the predicate returns False for the first error, that error is re-raised unchanged (no wrapping).
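
The two failure paths can be illustrated with a small self-contained loop. This is a sketch of the described behavior, not the library's code: retry_sketch and its sleep parameter are hypothetical, RetryExhausted is redefined locally, and the sketch assumes a False predicate stops the loop on any attempt, not only the first.

```python
import random
import time


class RetryExhausted(Exception):
    """Local stand-in for llm_retry.RetryExhausted (assumption, for this sketch)."""

    def __init__(self, attempts, last_error):
        super().__init__(f"gave up after {attempts} attempts: {last_error!r}")
        self.attempts = attempts
        self.last_error = last_error


def retry_sketch(fn, should_retry, max_attempts=6, base_delay_ms=500,
                 max_delay_ms=30_000, sleep=time.sleep):
    """Call `fn` until it succeeds, the predicate declines, or attempts run out."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            if not should_retry(exc):
                raise  # non-retryable: propagate the original error unchanged
            if attempt == max_attempts - 1:
                # Out of attempts: wrap the final error.
                raise RetryExhausted(max_attempts, exc) from exc
            capped = min(base_delay_ms * 2 ** attempt, max_delay_ms)
            sleep(random.uniform(0, capped) / 1000)  # full jitter, in seconds
```

Passing sleep as a parameter is only a convenience for tests, where it can be replaced with a no-op so the loop runs instantly.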

What it does NOT do

No HTTP client. Wrap any callable that raises.
No circuit breaker. Layer one on top if you want.
No deadline (stop after N seconds total). Combine with your own asyncio.wait_for or signal.alarm.
No structured logging hooks. Add them in your callable.
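
As an example of layering a deadline on top, the async case can be wrapped in asyncio.wait_for. The sketch below is one possible arrangement under stated assumptions: with_deadline is a hypothetical helper, and slow_retry_loop is a placeholder standing in for a retry_async(...) call so the snippet runs without llm_retry installed.

```python
import asyncio


async def with_deadline(make_coro, seconds):
    """Await make_coro(), but give up once `seconds` have elapsed in total."""
    # asyncio.wait_for cancels the inner task and raises TimeoutError on expiry.
    return await asyncio.wait_for(make_coro(), timeout=seconds)


async def slow_retry_loop():
    # Placeholder for `retry_async(call, policy=..., should_retry=...)`.
    await asyncio.sleep(10)
    return "done"


async def main():
    try:
        return await with_deadline(slow_retry_loop, seconds=0.05)
    except asyncio.TimeoutError:
        return "deadline exceeded"
```

Because wait_for cancels the wrapped coroutine, a backoff sleep that is in progress when the deadline hits is interrupted rather than run to completion.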

License

MIT

Project details

This version

0.1.0

May 24, 2026

Download files

Source Distribution

llm_retry_py-0.1.0.tar.gz (8.9 kB)

Uploaded May 24, 2026 Source

Built Distribution

llm_retry_py-0.1.0-py3-none-any.whl (7.9 kB)

Uploaded May 24, 2026 Python 3

File details

Details for the file llm_retry_py-0.1.0.tar.gz.

File metadata

Download URL: llm_retry_py-0.1.0.tar.gz
Upload date: May 24, 2026
Size: 8.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.5

File hashes

Hashes for llm_retry_py-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`54f2549812a73ba28b3248f775399721d02e5c01fb7649bbea64121fcc1ba7ca`
MD5	`4a597a83f87da8fc211b4d1d45549dc0`
BLAKE2b-256	`e8d080ddd01a6e4946612470c03ebf08584ae8529a74e9ca3dfbebec4daa6593`

File details

Details for the file llm_retry_py-0.1.0-py3-none-any.whl.

File metadata

Download URL: llm_retry_py-0.1.0-py3-none-any.whl
Upload date: May 24, 2026
Size: 7.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.5

File hashes

Hashes for llm_retry_py-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`20859fdbce5ad9057cc4bca56fe828cdb8650d4710a932ed86d81985c718fa5a`
MD5	`4be21d81a74830a235610ab1d1ddc4d0`
BLAKE2b-256	`dae88ca34f9ef423861818a33c29e49570bbd7363804313cf4983f66937dac44`
