Chaos engineering for AI agents — inject realistic production failures (tool timeouts, malformed responses, cost spirals, prompt injection) and find out what breaks before your users do.

Maintainers

SubhashPavan

Project description

agentfuzz

Chaos engineering for AI agents.

Your agent works in the demo. In production it breaks because a tool times out, an API returns garbage JSON, a user injects a prompt, or it spirals into an infinite tool-call loop burning $200 in tokens. agentfuzz finds those failures before your users do.

Why this exists

Netflix built Chaos Monkey because cloud apps that passed unit tests still went down in production — the failures were in the seams between systems, not the systems themselves. AI agents have the same problem, with a worse blast radius:

- A flaky tool returns malformed JSON → your agent hallucinates plausible-looking arguments and writes them to your database.
- A user pastes a "translate this" prompt that's actually IGNORE PREVIOUS INSTRUCTIONS → your support agent emails the customer your system prompt.
- A model upgrade changes how the agent retries a 429 → the agent enters an infinite loop and burns through your monthly token budget in 40 minutes.

These failures don't show up in unit tests because unit tests assume the seams work. agentfuzz deliberately breaks the seams.
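To make "breaking the seams" concrete, here is a minimal sketch of the general technique: a proxy that sits between the agent and a tool and fails a configurable fraction of calls. It is not agentfuzz's implementation; `with_timeout_fault`, `InjectedTimeout`, and `lookup_order` are hypothetical names invented for this illustration.

```python
import random
from typing import Any, Callable


class InjectedTimeout(TimeoutError):
    """Raised by the proxy in place of a real tool timeout."""


def with_timeout_fault(
    tool: Callable[..., Any], rate: float, seed: int = 0
) -> Callable[..., Any]:
    """Return a proxy that raises InjectedTimeout on roughly `rate` of calls."""
    rng = random.Random(seed)  # seeded so a failing run can be replayed

    def proxy(*args: Any, **kwargs: Any) -> Any:
        if rng.random() < rate:
            raise InjectedTimeout(f"injected timeout in {tool.__name__}")
        return tool(*args, **kwargs)

    return proxy


def lookup_order(order_id: str) -> dict:
    # Hypothetical tool used only for this illustration.
    return {"order_id": order_id, "status": "shipped"}


flaky_lookup = with_timeout_fault(lookup_order, rate=0.10, seed=42)

failures = 0
for i in range(1000):
    try:
        flaky_lookup(f"A{i}")
    except InjectedTimeout:
        failures += 1
print(f"{failures} of 1000 calls timed out")
```

Seeding the random generator is the design choice that matters here: a fault that only appears on one run in twenty is hard to fix unless the same sequence of failures can be reproduced.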

What it does

Wrap your agent. Pick a fault profile. Run. Get a report.

from agentfuzz import Harness, faults

harness = Harness(my_agent)

harness.add(faults.ToolTimeout(rate=0.10))
harness.add(faults.MalformedToolResponse(rate=0.05))
harness.add(faults.PromptInjection.suite("owasp-llm01"))
harness.add(faults.CostSpiral(max_tokens=50_000))
harness.add(faults.LatencyJitter(p99_ms=8000))
harness.add(faults.PartialToolFailure())

report = harness.run(scenarios="tau-bench-airline", iterations=200)
report.html("./report.html")

You get:

- Pass rate per fault category: "your agent survives malformed JSON 78% of the time but only 12% of timeout cases."
- Cost blast radius: "fault X caused token usage to spike 14×."
- Tool-call failure modes: hallucinated arguments, retry storms, infinite loops.
- Prompt-injection survival: OWASP LLM01 suite results.
- Replay traces: the exact transcript that broke your agent, so you can fix it.
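The first two report metrics are simple aggregations, and the sketch below shows one way they could be computed from per-run records. The `RunRecord` fields and both function names are assumptions made for this example, not agentfuzz's report schema, and the sample numbers are invented.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RunRecord:
    """One fuzzed run. Field names are hypothetical, not agentfuzz's schema."""
    fault: str
    passed: bool
    tokens: int


def pass_rate_by_fault(records: list[RunRecord]) -> dict[str, float]:
    """Fraction of runs that passed, grouped by injected fault."""
    totals: dict[str, int] = defaultdict(int)
    passes: dict[str, int] = defaultdict(int)
    for r in records:
        totals[r.fault] += 1
        passes[r.fault] += r.passed  # True counts as 1
    return {fault: passes[fault] / totals[fault] for fault in totals}


def cost_multiplier(
    records: list[RunRecord], baseline_tokens: float
) -> dict[str, float]:
    """Mean tokens per fault divided by the fault-free baseline."""
    token_sums: dict[str, int] = defaultdict(int)
    counts: dict[str, int] = defaultdict(int)
    for r in records:
        token_sums[r.fault] += r.tokens
        counts[r.fault] += 1
    return {f: token_sums[f] / counts[f] / baseline_tokens for f in counts}


# Invented sample data for illustration only.
records = [
    RunRecord("MalformedToolResponse", True, 1200),
    RunRecord("MalformedToolResponse", True, 1000),
    RunRecord("MalformedToolResponse", False, 1400),
    RunRecord("MalformedToolResponse", True, 1200),
    RunRecord("ToolTimeout", False, 9000),
    RunRecord("ToolTimeout", True, 3000),
]

print(pass_rate_by_fault(records))
print(cost_multiplier(records, baseline_tokens=1000))
```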

Install

pip install agentfuzz                       # core
pip install "agentfuzz[langgraph]"          # + LangGraph adapter
pip install "agentfuzz[crewai]"             # + CrewAI adapter
pip install "agentfuzz[autogen]"            # + AutoGen adapter
pip install "agentfuzz[all]"                # everything

60-second example

from agentfuzz import Harness, faults
from my_app import build_agent

harness = Harness(build_agent())
harness.add(faults.MalformedToolResponse(rate=0.2))
harness.add(faults.ToolTimeout(rate=0.1))

result = harness.run(iterations=50)
print(result.summary())
# >>> agentfuzz: 32/50 passed (64%)
# >>>   MalformedToolResponse: 8 failures
# >>>     - 5× hallucinated arguments
# >>>     - 3× silent corruption
# >>>   ToolTimeout: 10 failures
# >>>     - 7× retry storm (avg 14 retries)
# >>>     - 3× infinite loop killed at max_tokens
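A "retry storm" finding like the one in the sample output usually points at an unbounded retry loop around the tool call. The following is a minimal sketch of one common mitigation, a fixed attempt budget with exponential backoff. It is independent of agentfuzz; `call_with_retry_budget` and `RetryBudgetExceeded` are hypothetical names, and the defaults are arbitrary.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RetryBudgetExceeded(RuntimeError):
    """Raised when a call still fails after the allowed attempts."""


def call_with_retry_budget(
    fn: Callable[[], T],
    max_attempts: int = 3,
    base_delay_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on TimeoutError at most max_attempts times in total."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except TimeoutError as exc:
            if attempt == max_attempts:
                raise RetryBudgetExceeded(
                    f"gave up after {max_attempts} attempts"
                ) from exc
            # Delay doubles each time: base, 2*base, 4*base, ...
            sleep(base_delay_s * 2 ** (attempt - 1))


calls = []


def always_times_out() -> str:
    calls.append(1)
    raise TimeoutError("tool hung")


delays: list[float] = []  # record delays instead of actually sleeping
try:
    call_with_retry_budget(always_times_out, max_attempts=4, sleep=delays.append)
except RetryBudgetExceeded as exc:
    print(exc, "| attempts:", len(calls), "| backoff:", delays)
```

Passing `sleep` as a parameter keeps the example fast and testable; production code would leave the default in place.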

Fault library

Fault	What it simulates
`ToolTimeout`	A downstream API hangs past the agent's patience
`MalformedToolResponse`	Garbage JSON, truncated payloads, wrong schema
`PartialToolFailure`	Tool returns 200 then errors mid-stream
`LatencyJitter`	Realistic p50 / p99 latency distribution
`CostSpiral`	Detects runaway token usage above a threshold
`PromptInjection`	OWASP LLM01 catalog of injection payloads
`PromptParaphrase`	Real users mangle messages — typos, filler, contractions
`RateLimitBurst`	Cascading 429s from upstream APIs
`SchemaDrift`	Tool API changed shape between dev and prod
`AuthExpiry`	401 / 403 — tests credential-refresh paths
`NetworkPartition`	Connection refused / TLS error — distinct from timeout

More planned — see the roadmap.
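As an illustration of what "garbage JSON, truncated payloads, wrong schema" can mean in practice, the sketch below corrupts a clean tool response in three ways. These mutators are assumptions written for this example and are not taken from agentfuzz's `MalformedToolResponse`; the payload is invented, and the truncation helper assumes the payload is a JSON object.

```python
import json
import random


def truncate(payload: str, rng: random.Random) -> str:
    """Cut the payload short, as if the connection dropped mid-body."""
    cut = rng.randint(1, max(1, len(payload) - 1))
    return payload[:cut]


def drop_field(payload: str, rng: random.Random) -> str:
    """Remove one top-level key, simulating a wrong or drifted schema."""
    data = json.loads(payload)
    if isinstance(data, dict) and data:
        del data[rng.choice(sorted(data))]
    return json.dumps(data)


def null_value(payload: str, rng: random.Random) -> str:
    """Replace one top-level value with null where a value was expected."""
    data = json.loads(payload)
    if isinstance(data, dict) and data:
        data[rng.choice(sorted(data))] = None
    return json.dumps(data)


MUTATORS = (truncate, drop_field, null_value)


def corrupt(payload: str, rng: random.Random) -> str:
    """Apply one randomly chosen mutation to a JSON payload."""
    return rng.choice(MUTATORS)(payload, rng)


rng = random.Random(7)  # seeded so the same corruption can be replayed
clean = json.dumps({"flight": "UA100", "seats": 3, "price_usd": 412.5})
for _ in range(3):
    print(corrupt(clean, rng))
```

The two schema-level mutations still parse as valid JSON, which is the harder case: an agent that only guards against parse errors will accept them silently.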

Supported agent frameworks

- ✅ LangChain create_agent (1.x): install agentfuzz[langgraph]. This is the modern entry point. Wrap your tools with wrap_tools(), then point LangGraphAdapter at the compiled graph.
- ✅ LangGraph create_react_agent (0.x): same adapter. Both APIs return a CompiledStateGraph, which the adapter handles uniformly. See examples/langgraph_react_agent.py.
- ✅ CrewAI: install agentfuzz[crewai]. wrap_tools() returns proxy crewai.tools.BaseTool instances; CrewAIAdapter(crew) drives the harness through crew.kickoff(). See examples/crewai_agent.py.
- ✅ AutoGen v0.4+: install agentfuzz[autogen]. wrap_tools() returns proxy autogen_core.tools.FunctionTool instances; AutoGenAdapter(agent) drives any agent or team that exposes async run(task=...). See examples/autogen_agent.py.
- ✅ Plain Python callables: any Callable[[State], State]. This is the simplest way to try the tool.
- 🚧 PydanticAI, OpenAI Swarm, LlamaIndex: planned.

The adapter interface is small (is_available() + wrap()); PRs welcome.
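For anyone considering a PR, the sketch below shows what a two-method adapter could look like. Only the method names `is_available()` and `wrap()` come from this README; the signatures, the `State` alias, the `.invoke(state)` call, and every class here are assumptions for illustration, so check the repository for the real base class before building on it.

```python
import importlib.util
from typing import Any, Callable, Protocol

State = dict[str, Any]  # assumed state shape for this sketch


class Adapter(Protocol):
    """Assumed shape of the interface; the real signatures may differ."""

    def is_available(self) -> bool: ...
    def wrap(self, agent: Any) -> Callable[[State], State]: ...


class CallableAdapter:
    """Adapter for plain Python callables; needs no extra dependency."""

    def is_available(self) -> bool:
        return True

    def wrap(self, agent: Any) -> Callable[[State], State]:
        if not callable(agent):
            raise TypeError("agent must be callable")
        return agent


class OptionalFrameworkAdapter:
    """Hypothetical adapter that is usable only if a package is installed."""

    def __init__(self, package: str) -> None:
        self.package = package  # top-level package name

    def is_available(self) -> bool:
        return importlib.util.find_spec(self.package) is not None

    def wrap(self, agent: Any) -> Callable[[State], State]:
        # Assumes the framework object exposes .invoke(state).
        return lambda state: agent.invoke(state)


def pick_adapter(adapters: list[Adapter]) -> Adapter:
    """Return the first adapter whose dependency is importable."""
    for adapter in adapters:
        if adapter.is_available():
            return adapter
    raise RuntimeError("no adapter available")


def echo_agent(state: State) -> State:
    return {**state, "reply": state["message"].upper()}


adapter = pick_adapter(
    [OptionalFrameworkAdapter("not_a_real_package_xyz"), CallableAdapter()]
)
run = adapter.wrap(echo_agent)
print(run({"message": "hello"}))
```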

Status

Alpha (0.x; the current release is 0.4.0). The API will change. Built and tested on Python 3.10–3.13. The fault catalog is informed by production multi-agent deployments at enterprise scale, but every codebase fails in its own way, so please file an issue when you find a fault we should ship.

Why I'm building this

I've spent the last decade architecting AI systems for enterprises — including multi-agent platforms running across 2,600+ production sites. The failures that hurt are almost never the ones the unit tests check for. They're the quiet, partial, half-degraded ones in the seams.

This is the tool I wish I'd had.

— Pavan Subhash Tirumalasetti

License

Apache 2.0. Use it commercially. Cite it in papers. Build a paid product on top. Just don't claim you wrote it.

Citing

If you use agentfuzz in research or production reports:

@software{agentfuzz,
  author  = {Tirumalasetti, Pavan Subhash},
  title   = {agentfuzz: Chaos engineering for AI agents},
  year    = {2026},
  url     = {https://github.com/SubhashPavan/agentfuzz},
}

Project details

Release history

- 0.4.0 (this version): May 18, 2026
- 0.3.0: May 18, 2026
- 0.2.0: May 18, 2026
- 0.1.0: May 18, 2026

Download files

Source Distribution

agentfuzz-0.4.0.tar.gz (34.2 kB)

Uploaded May 18, 2026 Source

Built Distribution

agentfuzz-0.4.0-py3-none-any.whl (49.5 kB)

Uploaded May 18, 2026 Python 3

File details

Details for the file agentfuzz-0.4.0.tar.gz.

File metadata

Download URL: agentfuzz-0.4.0.tar.gz
Upload date: May 18, 2026
Size: 34.2 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for agentfuzz-0.4.0.tar.gz
Algorithm	Hash digest
SHA256	`c4cb6fc8ff432f4a35a396d7b0554de52139bab3fea1c416d7fd420245ebfc64`
MD5	`36c3160902130b0e3d77162ce3cc7316`
BLAKE2b-256	`a0f3e6b09b1d9e13088a3126a590521c4419d66ba98123fcece56eceedb95435`
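A downloaded file can be checked against the SHA256 digest listed above with the standard library alone. This is a minimal sketch: the local path is an assumption about where the sdist was saved, and `sha256_of` and `verify` are helper names chosen for this example.

```python
import hashlib
import hmac
from pathlib import Path

# SHA256 digest published above for agentfuzz-0.4.0.tar.gz.
EXPECTED_SHA256 = "c4cb6fc8ff432f4a35a396d7b0554de52139bab3fea1c416d7fd420245ebfc64"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large downloads do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: Path, expected_hex: str) -> bool:
    """Return True if the file's SHA256 matches the expected hex digest."""
    return hmac.compare_digest(sha256_of(path), expected_hex.lower())


sdist = Path("agentfuzz-0.4.0.tar.gz")  # assumed download location
if sdist.exists():
    print("hash OK" if verify(sdist, EXPECTED_SHA256) else "HASH MISMATCH")
else:
    print(f"{sdist} not found; download it first")
```

pip can enforce the same check at install time when a requirements file pins hashes and is installed with `--require-hashes`.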

Provenance

The following attestation bundles were made for agentfuzz-0.4.0.tar.gz:

Publisher: publish.yml on SubhashPavan/agentfuzz

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: agentfuzz-0.4.0.tar.gz
- Subject digest: c4cb6fc8ff432f4a35a396d7b0554de52139bab3fea1c416d7fd420245ebfc64
- Sigstore transparency entry: 1568408053
- Sigstore integration time: May 18, 2026
Source repository:
- Permalink: SubhashPavan/agentfuzz@d4b65b12111358a3be5fad988aaf33d01387f0ab
- Branch / Tag: refs/tags/v0.4.0
- Owner: https://github.com/SubhashPavan
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@d4b65b12111358a3be5fad988aaf33d01387f0ab
- Trigger Event: release

File details

Details for the file agentfuzz-0.4.0-py3-none-any.whl.

File metadata

Download URL: agentfuzz-0.4.0-py3-none-any.whl
Upload date: May 18, 2026
Size: 49.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for agentfuzz-0.4.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`919661b9dd193b8f792ec3b3f189438395e88cb6fae0bb9a67c3822567528f69`
MD5	`d91adbbfc180b03bce932ff8bea82ee7`
BLAKE2b-256	`1efcae9418011a32031852a4571ef39a5596e29ab12fc47407e1f0f845689439`

Provenance

The following attestation bundles were made for agentfuzz-0.4.0-py3-none-any.whl:

Publisher: publish.yml on SubhashPavan/agentfuzz

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: agentfuzz-0.4.0-py3-none-any.whl
- Subject digest: 919661b9dd193b8f792ec3b3f189438395e88cb6fae0bb9a67c3822567528f69
- Sigstore transparency entry: 1568408106
- Sigstore integration time: May 18, 2026
Source repository:
- Permalink: SubhashPavan/agentfuzz@d4b65b12111358a3be5fad988aaf33d01387f0ab
- Branch / Tag: refs/tags/v0.4.0
- Owner: https://github.com/SubhashPavan
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@d4b65b12111358a3be5fad988aaf33d01387f0ab
- Trigger Event: release
