Production-grade fault tolerance for AI agents — circuit breakers, LLM-aware retry, idempotency, loop detection, fallback chains, async support, health monitoring, and budget enforcement for LangChain, AutoGen, CrewAI, and any LLM pipeline
agentguard — Production-Grade Fault Tolerance for AI Agents
agentguard is a production-ready Python library that adds circuit breakers, LLM-aware retry logic, idempotency, loop detection, and timeout enforcement to any AI agent or LLM pipeline.
Unguarded AI agents fail in production far more often than they should. agentguard stops that.
The Problem
AI agents built with LangChain, AutoGen, CrewAI, or custom LLM pipelines fail catastrophically in production due to:
- Infinite loops — agents repeat the same tool calls indefinitely
- Silent failures — LLM errors swallowed without retry or alerting
- Duplicate actions — the same expensive LLM call fires multiple times
- Rate limit crashes — no intelligent backoff for 429/503 errors
- Token limit blindness — agents don't know when to stop or summarize
- No circuit breaking — one bad model call cascades into total failure
General-purpose retry libraries such as tenacity are not LLM-aware, and agent frameworks such as LangGraph and CrewAI do not address these agent-specific failure modes out of the box.
Features
- Circuit Breaker — Automatically opens after N failures, protecting downstream LLM APIs from cascading overload
- LLM-Aware Retry — Classifies errors (rate limit, token limit, provider outage, hallucinated tool call) and applies appropriate backoff
- Idempotency — Caches results by key to prevent duplicate expensive LLM executions
- Loop Detection — Detects and halts infinite agent action loops before they run up your API bill
- Timeout Enforcement — Hard timeouts on any agent step, with clean error propagation
- Zero Dependencies — Pure Python standard library only; works with any LLM framework
- Full Observability — Built-in stats, logging at every layer, structured error types
Installation
pip install agentguard-llm
Quick Start
GuardedAgent — Full Protection in One Wrapper
from agentguard import GuardedAgent
agent = GuardedAgent(
    name="my_llm_agent",
    max_retries=3,
    circuit_threshold=5,
    timeout=30.0,
    loop_detection=True,
    max_repeated_actions=3,
)
def call_llm(prompt: str) -> str:
    # Your actual LLM call here (OpenAI, Anthropic, etc.)
    return f"Response to: {prompt}"
result = agent.run(call_llm, "What is the capital of France?", action_label="llm_call")
print(result)
print(agent.get_stats())
# {'name': 'my_llm_agent', 'total_calls': 1, 'total_failures': 0, 'circuit': {...}}
Circuit Breaker — Standalone
from agentguard import CircuitBreaker
cb = CircuitBreaker(failure_threshold=3, recovery_timeout=60.0, name="openai")
try:
    response = cb.call(call_llm, "Hello")
except Exception as e:
    print(f"Protected from cascading failure: {e}")
print(cb.get_stats())
# {'name': 'openai', 'state': 'closed', 'failure_count': 0, ...}
LLM-Aware Retry — Decorator Style
from agentguard import llm_retry
@llm_retry(max_attempts=3)
def my_agent_step(query: str) -> str:
    # Automatically retries on rate limits (429) and provider outages (503);
    # stops immediately on token-limit errors (non-retryable)
    return call_llm(query)
result = my_agent_step("Summarize this document")
LLM-Aware Retry — Programmatic
from agentguard import LLMRetry
retry = LLMRetry(max_attempts=5, on_retry=lambda attempt, ftype, err: print(f"Retry {attempt}: {ftype}"))
result = retry.execute(call_llm, "Hello")
Idempotency — Prevent Duplicate Executions
from agentguard import IdempotentAgent
agent = IdempotentAgent(ttl=3600.0)
# First call executes the function
result1 = agent.run(call_llm, "Summarize report", idempotency_key="report-summary-v1")
# Second call with same key returns cached result instantly
result2 = agent.run(call_llm, "Summarize report", idempotency_key="report-summary-v1")
assert result1 == result2 # True — function only ran once
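The mechanics behind this are a keyed result cache with a time-to-live. Here is a minimal, self-contained sketch of how such a TTL-based idempotency store can work; it is an illustration of the technique, not agentguard's actual implementation:

```python
import time

class TTLIdempotencyStore:
    """Caches results by key so repeated calls skip re-execution until the TTL expires."""

    def __init__(self, ttl: float = 3600.0):
        self.ttl = ttl
        self._cache = {}  # key -> (result, stored_at)

    def run(self, func, *args, idempotency_key=None, **kwargs):
        if idempotency_key is not None:
            entry = self._cache.get(idempotency_key)
            if entry is not None:
                result, stored_at = entry
                if time.monotonic() - stored_at < self.ttl:
                    return result  # cache hit: function is not re-executed
        result = func(*args, **kwargs)
        if idempotency_key is not None:
            self._cache[idempotency_key] = (result, time.monotonic())
        return result
```

Note that the key is chosen by the caller; a stable, version-suffixed key (like "report-summary-v1" above) lets you invalidate the cache deliberately by bumping the version.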
Failure Classifier — Understand What Went Wrong
from agentguard import FailureClassifier, FailureType
fc = FailureClassifier()
err = Exception("429 Too Many Requests: rate limit exceeded")
ftype = fc.classify(err)
# FailureType.RATE_LIMIT
print(fc.is_retryable(ftype)) # True
print(fc.get_retry_delay(ftype, attempt=1)) # 10.0 seconds
Architecture
agentguard/
├── agent_wrapper.py # GuardedAgent — orchestrates all protections
├── circuit_breaker.py # CircuitBreaker — CLOSED/OPEN/HALF_OPEN state machine
├── retry.py # LLMRetry + llm_retry decorator — intelligent backoff
├── idempotency.py # IdempotentAgent + IdempotencyStore — result caching
├── failure_classifier.py # FailureClassifier — LLM error pattern recognition
└── exceptions.py # Typed exception hierarchy
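The circuit breaker's CLOSED/OPEN/HALF_OPEN state machine noted above follows the standard pattern: count failures while closed, reject calls while open, and allow a probe call after the recovery timeout. The following is a minimal sketch of that pattern in pure Python, not agentguard's actual code:

```python
import time

class MiniBreaker:
    """Toy circuit breaker: closed -> open after N failures -> half_open after a timeout."""

    def __init__(self, failure_threshold=5, recovery_timeout=60.0):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.state = "closed"
        self.failure_count = 0
        self.opened_at = 0.0

    def call(self, func, *args, **kwargs):
        if self.state == "open":
            if time.monotonic() - self.opened_at >= self.recovery_timeout:
                self.state = "half_open"  # allow one probe call through
            else:
                raise RuntimeError("circuit open: call rejected")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failure_count += 1
            # A failed probe, or too many failures, (re)opens the circuit
            if self.state == "half_open" or self.failure_count >= self.failure_threshold:
                self.state = "open"
                self.opened_at = time.monotonic()
            raise
        else:
            self.state = "closed"
            self.failure_count = 0
            return result
```

The key property: once open, downstream LLM APIs receive no traffic at all until the recovery window elapses, which is what prevents a degraded provider from being hammered into a full outage.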
Data Flow
Agent call
│
├─► Loop Detector (infinite loop check)
│
├─► Circuit Breaker (OPEN? → reject immediately)
│
├─► Idempotency Store (seen this key? → return cached)
│
├─► LLM Retry (classify failure → smart backoff → retry)
│
└─► Timeout Thread (hard deadline enforcement)
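The last step, thread-based hard timeouts, can be sketched with the standard library's concurrent.futures. This is one plausible way to implement a "timeout thread" and is an assumption for illustration, not agentguard's actual implementation:

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

def run_with_timeout(func, timeout, *args, **kwargs):
    """Run func in a worker thread; raise TimeoutError if it misses the deadline."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(func, *args, **kwargs)
    try:
        return future.result(timeout=timeout)
    except FutureTimeout:
        raise TimeoutError(f"agent step exceeded {timeout}s deadline")
    finally:
        # Don't block waiting for the worker; a timed-out thread is abandoned,
        # not killed (Python threads cannot be forcibly terminated).
        pool.shutdown(wait=False)
```

The caveat in the final comment is inherent to any thread-based timeout in Python: the deadline bounds how long the *caller* waits, not how long the underlying LLM call runs.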
Failure Classification Logic
| Error Pattern | Classified As | Retryable | Backoff |
|---|---|---|---|
| 429, rate limit, quota exceeded | RATE_LIMIT | Yes | Exponential (5s base, max 60s) |
| 503, service unavailable, overloaded | PROVIDER_OUTAGE | Yes | Exponential (10s base, max 120s) |
| context length, token limit | TOKEN_LIMIT | No | — |
| tool not found, invalid tool | HALLUCINATED_TOOL_CALL | No | — |
| anything else | UNKNOWN | Yes | Exponential (1s base, max 30s) |
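A classifier like this typically works by substring-matching the error message, then looking up a capped exponential schedule. The sketch below illustrates that technique; the delay formula (base × 2^attempt, capped) is an assumption chosen to be consistent with the documented get_retry_delay(RATE_LIMIT, attempt=1) == 10.0 example, and none of this is agentguard's actual code:

```python
PATTERNS = {
    "RATE_LIMIT": ("429", "rate limit", "quota exceeded"),
    "PROVIDER_OUTAGE": ("503", "service unavailable", "overloaded"),
    "TOKEN_LIMIT": ("context length", "token limit"),
    "HALLUCINATED_TOOL_CALL": ("tool not found", "invalid tool"),
}
BACKOFF = {  # failure type -> (base seconds, cap seconds); absence = not retryable
    "RATE_LIMIT": (5.0, 60.0),
    "PROVIDER_OUTAGE": (10.0, 120.0),
    "UNKNOWN": (1.0, 30.0),
}

def classify(error: Exception) -> str:
    msg = str(error).lower()
    for ftype, needles in PATTERNS.items():
        if any(n in msg for n in needles):
            return ftype
    return "UNKNOWN"

def is_retryable(ftype: str) -> bool:
    return ftype in BACKOFF

def retry_delay(ftype: str, attempt: int) -> float:
    base, cap = BACKOFF[ftype]
    return min(base * 2 ** attempt, cap)
```

String matching on error messages is fragile across providers, which is exactly why centralizing it in one classifier (rather than scattering ad-hoc checks through agent code) pays off.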
API Reference
GuardedAgent
GuardedAgent(
    name: str = "agent",
    max_retries: int = 3,
    circuit_threshold: int = 5,
    circuit_recovery: float = 60.0,
    timeout: Optional[float] = None,
    loop_detection: bool = True,
    max_repeated_actions: int = 3,
    idempotency_ttl: float = 3600.0,
    enable_idempotency: bool = True,
)
Methods:
- run(func, *args, action_label=None, idempotency_key=None, **kwargs) — Execute with all protections
- get_stats() — Returns dict with call counts, failure counts, circuit state
- reset_loop_detector() — Clear the loop detection history
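The loop-detection behavior implied by max_repeated_actions and reset_loop_detector() can be illustrated with a small repeated-action tracker: record each action label and halt when the same label occurs N times in a row. This is a hypothetical sketch of the technique, not agentguard's internals:

```python
from collections import deque

class LoopDetector:
    """Raises when the same action label repeats max_repeated_actions times in a row."""

    def __init__(self, max_repeated_actions: int = 3):
        self.max_repeated = max_repeated_actions
        self.history = deque(maxlen=max_repeated_actions)

    def record(self, action_label: str) -> None:
        self.history.append(action_label)
        # Window full and every entry identical => the agent is looping
        if len(self.history) == self.max_repeated and len(set(self.history)) == 1:
            raise RuntimeError(
                f"loop detected: {action_label!r} repeated {self.max_repeated} times"
            )

    def reset(self) -> None:
        self.history.clear()
```

This is why passing a meaningful action_label to run() matters: the label is what the detector compares, so a generic label across unrelated steps would trigger false positives.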
CircuitBreaker
CircuitBreaker(
    failure_threshold: int = 5,
    recovery_timeout: float = 60.0,
    half_open_max_calls: int = 1,
    name: str = "default",
)
Methods:
- call(func, *args, **kwargs) — Protected function call
- get_stats() — Returns state, failure count, last failure time
LLMRetry
LLMRetry(
    max_attempts: int = 3,
    classifier: Optional[FailureClassifier] = None,
    on_retry: Optional[Callable] = None,
)
Methods:
- execute(func, *args, **kwargs) — Execute with retry logic
llm_retry decorator
@llm_retry(max_attempts=3, classifier=None)
def my_func(): ...
IdempotentAgent
IdempotentAgent(store=None, ttl=3600.0)
Methods:
- run(func, *args, idempotency_key=None, **kwargs) — Execute with deduplication
FailureClassifier
Methods:
- classify(error: Exception) -> FailureType
- is_retryable(failure_type: FailureType) -> bool
- get_retry_delay(failure_type: FailureType, attempt: int) -> float
Exceptions
| Exception | When Raised |
|---|---|
| AgentGuardError | Base class for all agentguard errors |
| CircuitOpenError | Circuit breaker is OPEN, call rejected |
| MaxRetriesExceededError | All retry attempts exhausted |
| IdempotencyError | Idempotency store conflict |
| AgentTimeoutError | Agent exceeded timeout limit |
Real-World Integration Examples
With OpenAI
import openai
from agentguard import GuardedAgent
agent = GuardedAgent(name="openai-agent", max_retries=3, timeout=30.0)
def gpt_call(prompt: str) -> str:
    response = openai.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
result = agent.run(gpt_call, "Explain quantum computing", action_label="gpt_call")
With LangChain
from agentguard import llm_retry
@llm_retry(max_attempts=3)
def run_chain(input_text: str):
    return my_langchain_chain.invoke({"input": input_text})
With CrewAI / AutoGen
from agentguard import GuardedAgent
guard = GuardedAgent(name="crew-agent", loop_detection=True, max_repeated_actions=5)
# Wrap any crew task execution
result = guard.run(crew.kickoff, action_label="crew_task")
Contributing
Contributions are welcome. Please open an issue first to discuss what you would like to change.
- Fork the repository
- Create your feature branch (git checkout -b feature/your-feature)
- Run tests: pytest tests/ -v
- Submit a pull request
License
MIT License. See LICENSE for details.
File details
Details for the file agentguard_llm-0.2.0.tar.gz.
File metadata
- Download URL: agentguard_llm-0.2.0.tar.gz
- Upload date:
- Size: 20.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1ce33503e99c9dc00141d8777f6367d09cb97f1976729d193f1582bcec5cf901 |
| MD5 | 7a15d8a8cf8bb5c004665a2759b868f4 |
| BLAKE2b-256 | 50958e8f9df31607a9ac6f09022ec2487dfb0de1cb8d85810707365873cd63d0 |
File details
Details for the file agentguard_llm-0.2.0-py3-none-any.whl.
File metadata
- Download URL: agentguard_llm-0.2.0-py3-none-any.whl
- Upload date:
- Size: 18.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f638568349048355d88e08e9a4f06cd8ff3378a0b3e07da4fa6ef4aee0ea43f6 |
| MD5 | 458d7df41e67a584f06b3fe5e0a54785 |
| BLAKE2b-256 | 4565c1d72051d67da0af81a53adcd4b2b29ba3502564fd71914ad8228fcbea2d |