Chaos proxy for testing LLM agent resilience — inject faults like latency, errors, and bad responses. Supports OpenAI, Anthropic, and MCP.
AgentBreak
Your agent works great — until the LLM times out, returns garbage, or an MCP tool fails. AgentBreak lets you test for that before production.
It's a chaos proxy that sits between your agent and the real API, injecting faults like latency spikes, HTTP errors, and malformed responses so you can see how your agent actually handles failure.
Agent --> AgentBreak (localhost:5005) --> Real LLM / MCP server
                       ^
    injects faults based on your scenarios
Get started
pip install agentbreak
agentbreak init # creates .agentbreak/ with default configs
agentbreak serve # start the chaos proxy on port 5005
Point your agent at http://localhost:5005 instead of the real API:
# OpenAI
export OPENAI_BASE_URL=http://localhost:5005/v1
# Anthropic
export ANTHROPIC_BASE_URL=http://localhost:5005
Run your agent, then check how it did:
curl localhost:5005/_agentbreak/scorecard
That's it. No code changes needed — just swap the base URL.
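To make the base-URL swap concrete, here is a minimal Python sketch of how a client might resolve its endpoint from the environment. The helper `chat_completions_url` and the default upstream URL are illustrative assumptions, not part of AgentBreak or of any SDK; only the `OPENAI_BASE_URL` variable and the `/v1/chat/completions` path appear in this README.

```python
import os

# Assumed default, for illustration only; a real SDK defines its own.
DEFAULT_BASE_URL = "https://api.openai.com/v1"


def chat_completions_url(env=None):
    """Return the chat completions endpoint for the configured base URL."""
    env = os.environ if env is None else env
    # Strip a trailing slash so the joined URL never contains "//".
    base = env.get("OPENAI_BASE_URL", DEFAULT_BASE_URL).rstrip("/")
    return f"{base}/chat/completions"


if __name__ == "__main__":
    # With the export shown above, requests go to the proxy instead of the real API.
    print(chat_completions_url({"OPENAI_BASE_URL": "http://localhost:5005/v1"}))
```

Because the agent code only ever asks for the resolved URL, pointing it at the proxy is purely a configuration change.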
How it works
AgentBreak reads two files from .agentbreak/:
- application.yaml: what to proxy (LLM mode, MCP upstream, port)
- scenarios.yaml: what faults to inject
A scenario is just a target + a fault + a schedule:
scenarios:
  - name: slow-llm
    summary: Latency spike on completions
    target: llm_chat        # what to hit (llm_chat or mcp_tool)
    fault:
      kind: latency         # what goes wrong
      min_ms: 2000
      max_ms: 5000
    schedule:
      mode: random          # when it happens
      probability: 0.3
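One way to read the scenario above is as a per-request decision: with probability 0.3, delay the request by somewhere between 2000 and 5000 ms. The Python sketch below illustrates that reading only. The function `planned_delay_ms` is hypothetical, and the uniform choice of delay is an assumption; this is not AgentBreak's actual implementation.

```python
import random


def planned_delay_ms(scenario, rng):
    """Return the latency to inject for one request, or 0 for no fault."""
    fault = scenario["fault"]
    schedule = scenario["schedule"]
    # This sketch only understands the latency fault with a random schedule.
    if fault["kind"] != "latency" or schedule["mode"] != "random":
        return 0
    # Each request is hit independently with the configured probability.
    if rng.random() >= schedule["probability"]:
        return 0
    # Assumption: the delay is drawn uniformly between min_ms and max_ms.
    return rng.randint(fault["min_ms"], fault["max_ms"])


# The slow-llm scenario from the YAML above, as a plain dict.
slow_llm = {
    "name": "slow-llm",
    "target": "llm_chat",
    "fault": {"kind": "latency", "min_ms": 2000, "max_ms": 5000},
    "schedule": {"mode": "random", "probability": 0.3},
}

if __name__ == "__main__":
    rng = random.Random(0)  # seeded so repeated runs behave the same
    delays = [planned_delay_ms(slow_llm, rng) for _ in range(1000)]
    delayed = [d for d in delays if d > 0]
    print(f"{len(delayed)} of {len(delays)} requests delayed")
```

Roughly 30% of requests should come back delayed, which is the kind of intermittent slowness that timeouts and retry budgets need to tolerate.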
Don't want to write YAML? Use a preset:
preset: brownout
Available presets: standard, standard-mcp, standard-all, brownout, mcp-slow-tools, mcp-tool-failures, mcp-mixed-transient.
MCP testing
agentbreak inspect # discover tools from your MCP server
agentbreak serve # proxy both LLM and MCP traffic
Track resilience over time
# in .agentbreak/application.yaml
history:
  enabled: true
agentbreak serve --label "added retry logic"
agentbreak history compare 1 2 # diff two runs
Claude Code
AgentBreak works as a plugin for Claude Code:
pip install agentbreak
Then in Claude Code:
/plugin marketplace add mnvsk97/agentbreak
/plugin install agentbreak@mnvsk97-agentbreak
/reload-plugins
Three commands:
| Command | What it does |
|---|---|
| /agentbreak:init | Analyze codebase, configure mock/proxy mode |
| /agentbreak:create-tests | Generate project-specific chaos scenarios |
| /agentbreak:run-tests | Run tests, produce resilience report with fixes |
Update to latest:
/plugin marketplace add mnvsk97/agentbreak
/plugin install agentbreak@mnvsk97-agentbreak
/reload-plugins
Uninstall:
/plugin uninstall agentbreak@mnvsk97-agentbreak
/reload-plugins
What it actually measures
AgentBreak doesn't score you on whether faults happen — it injected those on purpose. It scores what your agent does after the fault.
Agent sends request → AgentBreak injects 500 error → Agent retries → Success
                                                     ^^^^^^^^^^^^^^^^^^^^^^^
                                                     This is what gets scored
- Agent retries and succeeds → recovery (+5)
- Agent gives up after one failure → upstream failure (-12)
- Agent retries the same thing 20 times → suspected loop (-10)
When you run through mock mode with direct curl, there's no agent in the loop — so there's nothing to evaluate beyond confirming faults fire. The real value comes from running in proxy mode through your actual agent, where its retry logic, error handling, and framework behavior all get exercised.
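The three outcomes listed above can be sketched as a small classifier. This is an illustration under stated assumptions, not AgentBreak's scoring code: the names `classify` and `points_for` are hypothetical, the cutoff of 20 retries is taken from the example in the list, and treating a failed run with fewer than 20 retries as an upstream failure is a guess. How the points combine into the final score is not described here, so the sketch stops at per-outcome points.

```python
# Point values quoted in the list above.
POINTS = {"recovery": 5, "upstream failure": -12, "suspected loop": -10}

# Assumed cutoff, taken from the "20 times" example above.
LOOP_THRESHOLD = 20


def classify(retries, succeeded):
    """Label what the agent did after a fault hit its first request.

    retries is the number of follow-up requests sent after the fault;
    succeeded says whether any of them eventually worked.
    """
    if retries == 0 and succeeded:
        raise ValueError("a faulted request cannot succeed without a retry")
    # Assumption: this many retries counts as a loop regardless of outcome.
    if retries >= LOOP_THRESHOLD:
        return "suspected loop"
    if succeeded:
        return "recovery"
    # Covers giving up immediately; failing after a few retries is assumed
    # to land here as well.
    return "upstream failure"


def points_for(retries, succeeded):
    """Return the point value attached to the classified outcome."""
    return POINTS[classify(retries, succeeded)]


if __name__ == "__main__":
    for retries, succeeded in [(1, True), (0, False), (20, False)]:
        label = classify(retries, succeeded)
        print(retries, succeeded, label, POINTS[label])
```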
CI/CD
Run chaos tests in your pipeline using mock mode — no API keys needed.
GitHub Actions:
- name: Chaos test
  run: |
    pip install agentbreak
    agentbreak init
    agentbreak serve &
    sleep 2
    # send test traffic
    pids=""
    for i in $(seq 1 10); do
      curl -s http://localhost:5005/v1/chat/completions \
        -H "Content-Type: application/json" \
        -H "Authorization: Bearer dummy" \
        -d "{\"model\":\"gpt-4o\",\"messages\":[{\"role\":\"user\",\"content\":\"test $i\"}]}" &
      pids="$pids $!"
    done
    # wait only for the curl requests; a bare `wait` would also block on the server
    wait $pids
    # check score; fail the build if below threshold
    SCORE=$(curl -s http://localhost:5005/_agentbreak/scorecard | python3 -c "import sys,json; print(json.load(sys.stdin)['score'])")
    echo "Resilience score: $SCORE"
    pkill -f "agentbreak serve" || true
    python3 -c "exit(0 if $SCORE >= 60 else 1)"
For proxy mode (real API traffic), set OPENAI_API_KEY or ANTHROPIC_API_KEY as a repository secret and configure .agentbreak/application.yaml with mode: proxy.
Commit your .agentbreak/application.yaml and .agentbreak/scenarios.yaml to the repo so CI uses the same config.
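If the inline `python3 -c` one-liners become hard to maintain, the threshold check can live in a small script instead. The sketch below assumes only what the workflow above already relies on: the scorecard is JSON with a numeric `score` key, and 60 is the cutoff. The function name `gate` and the sample payload are made up for illustration, and the real scorecard may contain more fields.

```python
import json

THRESHOLD = 60  # same cutoff as the workflow above


def gate(scorecard_json, threshold=THRESHOLD):
    """Return a process exit code: 0 if the score meets the threshold, else 1."""
    score = json.loads(scorecard_json)["score"]
    return 0 if score >= threshold else 1


if __name__ == "__main__":
    # Made-up payload; only the "score" key is taken from the workflow above.
    sample = '{"score": 72}'
    print(gate(sample))
```

In a pipeline, the JSON would come from the `/_agentbreak/scorecard` endpoint and the return value would be passed to `sys.exit`, so a low score fails the build.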
Full reference
For the full list of fault kinds, schedule modes, match filters, and config options, see the documentation.
Roadmap
- Security scenarios — prompt injection, data exfiltration attempts, and adversarial inputs
- MCP server chaos — intentional tool call validation, schema mismatches, and poisoned tool responses
- Pattern-based attacks — multi-step attack chains that exploit common agent reasoning patterns
- Skill-based attacks — target agent skills/capabilities with adversarial tool sequences
- Deprecated library injection — return responses referencing deprecated or vulnerable libraries
- Model deprecation simulation — simulate model sunset responses and version migration failures