Open-source red teaming toolkit for AI agents, RAG systems, MCP servers, and tool-using LLM applications.
AgentGuardian
Open-source red-team testing toolkit for agentic AI systems.
96 attack probes · 11 attacker agents · OWASP ASI 2026 (all 10), 11+ MITRE ATLAS v5.4.0 techniques (see coverage matrix for the exact set; ~85% of techniques are out of scope for a black-box agent scanner) and CSA Agentic-RT (all 12) mappings · SARIF + PDF reports · runs offline.
What it is
AgentGuardian is a testing toolkit that runs adversarial probes against your agent — LangGraph, CrewAI, OpenAI Agents SDK, AutoGen, ADK, Strands, or any HTTP endpoint — and produces a signed-style evidence bundle you can hand to security review.
It ships 96 attack probes organised against three public taxonomies:
- OWASP Top 10 for Agentic Applications 2026 (ASI01–ASI10, all 10 categories covered)
- MITRE ATLAS v5.4.0 (February 2026 release) — 11+ techniques covered at the agent's I/O surface; the remaining ~85% of the v5.4.0 catalogue (training-pipeline / ML-platform-internal attacks) is out of scope for a black-box agent scanner. See the framework-coverage matrix for the exact set.
- CSA Agentic AI Red Teaming Guide (Huang et al., 2025-05-28, all 12 categories)
It is deterministic in stub mode (no LLM key required), reproducible by seed, and emits SARIF + PDF + HTML + JSON.
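To make "reproducible by seed" concrete, here is a minimal sketch of seeded selection. It is a general illustration of the idea, not AgentGuardian's implementation; the function name `pick_probes` and the probe IDs used with it are hypothetical.

```python
import random


def pick_probes(probe_ids, sample_size, seed):
    """Select a reproducible subset: the same IDs and seed give the same list."""
    # A private Random instance keeps the result independent of global RNG state.
    rng = random.Random(seed)
    # Sorting first removes any dependence on the caller's input order.
    return rng.sample(sorted(probe_ids), sample_size)
```

Calling `pick_probes(ids, 3, seed=42)` twice, even with `ids` in a different order, returns the same three IDs in the same order, which is the property that lets two runs be compared line by line.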
Install
pip install agent-guardian
Requires Python 3.11–3.13 on Linux or macOS. Apache-2.0 licensed.
Heads up: default python3 on a current macOS box is 3.14, which AgentGuardian does not yet target. If pip install agent-guardian errors with No matching distribution found, install on Python 3.13 instead: python3.13 -m venv .venv && source .venv/bin/activate && pip install agent-guardian. The pinned-3.11 Docker image and the agentguardian-scan GitHub Action are insulated from this; only ad-hoc pip install is affected. Tracked for 3.14 support: see ROADMAP.md.
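If you want to check an interpreter before installing, a small version gate is enough. This sketch assumes only the supported range stated above (3.11 through 3.13); `is_supported` is an illustrative helper, not part of AgentGuardian.

```python
import sys

# Supported range taken from the install note above: Python 3.11 through 3.13.
MIN_SUPPORTED = (3, 11)
MAX_SUPPORTED = (3, 13)


def is_supported(version=None):
    """Return True if the (major, minor) pair falls inside the supported range."""
    if version is None:
        version = sys.version_info
    major_minor = (version[0], version[1])
    return MIN_SUPPORTED <= major_minor <= MAX_SUPPORTED
```

Run `is_supported()` with the interpreter you plan to install into; a `False` result on 3.14 matches the "No matching distribution found" case described above.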
How it compares
| Tool | Multi-agent swarm | Agentic-AI focus | Standards alignment | License |
|---|---|---|---|---|
| PyRIT | no | no | PyRIT risk taxonomy | MIT |
| garak | no | no | own taxonomy | Apache-2.0 |
| Promptfoo | no | no | OWASP LLM Top 10 + ATLAS + EU AI Act | MIT |
| Inspect | no | no | own taxonomy | MIT |
| DeepTeam | no | no | OWASP LLM Top 10 | Apache-2.0 |
| AgentGuardian | yes | yes | OWASP ASI 2026 + MITRE ATLAS v5.4.0 + CSA | Apache-2.0 |
Coverage by OWASP ASI 2026 category
All 96 attack probes are distributed across the ten OWASP ASI 2026 categories below. Each finding is triple-tagged with its ASI, MITRE ATLAS, and CSA Agentic-RT identifiers.
- ASI01 — Memory Poisoning
- ASI02 — Tool Misuse
- ASI03 — Privilege Compromise
- ASI04 — Supply Chain
- ASI05 — Code Execution
- ASI06 — Intent Breaking & Goal Manipulation
- ASI07 — Agent-to-Agent Compromise
- ASI08 — Cascading Failures
- ASI09 — Trust Exploitation
- ASI10 — Rogue Agents (drift)
Enumerate locally with agent-guardian list-probes; full catalogue lives in docs/attacks/overview.mdx.
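The triple tagging can be pictured as a record with one field per taxonomy. The sketch below is an assumption about shape only: the `Finding` class, its field names, and the example values are hypothetical and do not reflect the actual report schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    probe_id: str  # hypothetical probe identifier
    asi: str       # OWASP ASI 2026 category, e.g. "ASI02"
    atlas: str     # MITRE ATLAS technique identifier (placeholder values in examples)
    csa: str       # CSA Agentic-RT category (placeholder values in examples)


def group_by_asi(findings):
    """Return {ASI category: [probe IDs]} with categories in sorted order."""
    grouped = defaultdict(list)
    for finding in findings:
        grouped[finding.asi].append(finding.probe_id)
    return {asi: grouped[asi] for asi in sorted(grouped)}
```

Grouping by the ASI tag gives one bucket per category in the list above; the same pattern works for the ATLAS or CSA tag by switching the key.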
60-second quickstart
# 1. Sanity check the install (no API key, no network)
agent-guardian doctor
# 2. List the 96 shipped probes
agent-guardian list-probes
# 3. Run an offline scan against the built-in stub target
agent-guardian scan --target stub --mode fast --llm stub
# 4. Open the HTML report
open reports/latest/report.html
Stub mode requires no LLM API key, no network, no environment variables — it uses canned deterministic responses so you can verify the toolchain end-to-end before pointing it at a real target.
Scan a real agent
# Against an HTTP endpoint (any framework, any language)
agent-guardian scan \
  --target http://localhost:8000/chat \
  --framework http \
  --mode smart \
  --llm openai \
  --fail-under 80
# Against a LangGraph app
agent-guardian scan \
  --target ./my_graph.py:graph \
  --framework langgraph \
  --mode full
Exit code is non-zero if the posture score falls below --fail-under, so the same command works inside CI.
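If your pipeline is driven from Python rather than a shell step, the same gate can be wrapped with `subprocess`. This sketch uses only the CLI flags shown above; the helper names `build_scan_command` and `run_scan` are illustrative. It treats any non-zero exit as a failed gate, which is an assumption, since a non-zero code could also signal an unrelated error.

```python
import subprocess


def build_scan_command(target, framework, mode, fail_under=None, llm=None):
    """Assemble an argv list using only flags that appear in the examples above."""
    cmd = [
        "agent-guardian", "scan",
        "--target", target,
        "--framework", framework,
        "--mode", mode,
    ]
    if llm is not None:
        cmd += ["--llm", llm]
    if fail_under is not None:
        cmd += ["--fail-under", str(fail_under)]
    return cmd


def run_scan(cmd):
    """Run the scan; return True when the process exits with code 0."""
    completed = subprocess.run(cmd, check=False)
    return completed.returncode == 0
```

For example, `run_scan(build_scan_command("http://localhost:8000/chat", "http", "smart", fail_under=80, llm="openai"))` mirrors the first command above and returns `False` when the posture score falls below the threshold.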
Scan modes
| Mode | What it runs | Typical wall time |
|---|---|---|
| fast | High-signal probe subset, single attacker per family | ~2 min |
| smart | Curated coverage with adaptive attacker selection | ~10 min |
| full | Every probe, every applicable attacker, full mutation set | 30+ min |
Default mode is full. Pick fast for pre-commit / PR checks, smart for nightly runs.
Framework adapters
Shipped first-class adapters (pluggable via --framework):
- langgraph: LangGraph state graphs
- crewai: CrewAI crews
- openai-agents: OpenAI Agents SDK
- autogen: Microsoft AutoGen
- adk: Google ADK
- strands: AWS Strands
- http: any HTTP/JSON endpoint (works for FastAPI, Flask, Express, anything)
MCP servers and RAG pipelines are covered via the http adapter and worked examples under examples/ (examples/mcp_server, examples/rag_app, examples/fastapi_chatbot).
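To try the http adapter without a real agent, you can stand up a throwaway JSON endpoint with the standard library. This is a sketch under stated assumptions: the request and response field names (`message`, `reply`) are made up for illustration, and the payload format the adapter actually sends is not specified here, so adjust the handler to match your own endpoint's contract.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def build_reply(raw_body):
    """Turn a raw request body into (status_code, response_dict)."""
    try:
        payload = json.loads(raw_body or b"{}")
    except ValueError:  # covers invalid JSON and undecodable bytes
        return 400, {"error": "body must be JSON"}
    if not isinstance(payload, dict):
        return 400, {"error": "body must be a JSON object"}
    # Assumed field names: "message" in, "reply" out.
    return 200, {"reply": f"echo: {payload.get('message', '')}"}


class ChatHandler(BaseHTTPRequestHandler):
    """Toy /chat endpoint that echoes the incoming message."""

    def do_POST(self):
        if self.path != "/chat":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        status, reply = build_reply(self.rfile.read(length))
        body = json.dumps(reply).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Suppress per-request logging to keep the console quiet.
        pass


def make_server(port=8000):
    """Bind the handler on 127.0.0.1; port 0 lets the OS choose a free port."""
    return HTTPServer(("127.0.0.1", port), ChatHandler)
```

Calling `make_server().serve_forever()` exposes the endpoint at http://localhost:8000/chat, the same URL used in the HTTP scan example earlier in this README.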
Attacker swarm
The core swarm contains 11 attacker agents, each scoped to a distinct family of agent-stack failure modes:
| Agent | Targets |
|---|---|
| recon-agent | Surface mapping, tool discovery |
| goal-hijack-agent | Goal redirection, system-prompt override |
| tool-abuse-agent | Tool misuse, argument injection |
| privilege-agent | Privilege escalation, role confusion |
| supply-chain-agent | Tool/model/data supply-chain attacks |
| code-exec-agent | Sandbox escape, code execution |
| memory-poison-agent | Long-term memory poisoning |
| a2a-agent | Agent-to-agent trust exploits |
| cascade-agent | Cascading hallucination / cross-agent contagion |
| trust-exploit-agent | Operator/system trust boundary abuse |
| drift-agent | Behavioural drift, policy erosion over conversation |
Additional specialist classes (FuzzingAgent, OutputHandlingAgent, DenialOfWalletAgent, DetectionEvasionAgent, SecretExtractionAgent, IdentityLeakAgent, CriticAgent) ship as building blocks for custom swarms and are documented under docs/concepts/adversarial-swarm.mdx.
Reports & evidence
Every scan produces a timestamped bundle under reports/<run-id>/:
- report.html: interactive dashboard, drillable per-probe
- report.pdf: print-ready evidence (ReportLab)
- report.sarif: SARIF 2.1.0 for GitHub Code Scanning / Defender / Snyk ingest
- report.json: full machine-readable record
- evidence/: per-probe transcripts, prompts, responses, and verdicts
A sample HTML report lives at docs/_assets/sample-report.html.
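Because report.sarif follows SARIF 2.1.0, it can be summarised with a few lines of standard-library Python. The sketch below relies only on the standard `runs[].results[].level` layout; it makes no assumption about AgentGuardian-specific properties, and the helper names are illustrative.

```python
import json
from collections import Counter
from pathlib import Path


def load_sarif(path):
    """Parse a SARIF log from disk into a plain dict."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def summarize_sarif(sarif):
    """Count results per level across every run in a parsed SARIF log."""
    counts = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            # A missing level is counted as "warning", the SARIF 2.1.0 default
            # when no rule configuration overrides it.
            counts[result.get("level", "warning")] += 1
    return dict(counts)
```

For instance, `summarize_sarif(load_sarif("reports/latest/report.sarif"))` returns a mapping such as level name to count, which is handy for a one-line summary in a CI log.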
Signing: every scan.json already ships Ed25519 + HMAC-SHA256 signatures verifiable via agent-guardian verify <scan.json>. Sigstore-backed signing of the full evidence bundle is planned for v1.1, not shipped in 1.0.0; the output.sign_evidence config flag is accepted for forward compatibility but is a no-op today (a deprecation warning prints on config load if it is set). Until v1.1 the bundle ships unsigned; the SARIF / JSON / PDF are deterministic and hash-stable for external signing.
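For external signing, one common approach is to hash every file in the bundle and sign the resulting manifest with your own tooling. The following is a minimal sketch of that idea using `hashlib`; the manifest layout and helper names are assumptions for illustration, not a format AgentGuardian defines.

```python
import hashlib
from pathlib import Path


def sha256_file(path, chunk_size=65536):
    """Stream a file through SHA-256 so large evidence files stay out of memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(bundle_dir):
    """Map each file's bundle-relative POSIX path to its SHA-256 hex digest."""
    root = Path(bundle_dir)
    files = [p for p in root.rglob("*") if p.is_file()]
    # Sort by relative path so the manifest order is stable across platforms.
    files.sort(key=lambda p: p.relative_to(root).as_posix())
    return {p.relative_to(root).as_posix(): sha256_file(p) for p in files}
```

Running `build_manifest("reports/<run-id>")` twice on an unchanged bundle yields identical output, so the manifest itself can be signed and later compared against a fresh computation.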
Local dashboard
agent-guardian serve
# → http://localhost:7474
Browse historical runs, diff posture scores across releases, and download evidence bundles.
CI integration
# .github/workflows/agent-guardian.yml
- uses: actions/setup-python@v5
  with: { python-version: '3.11' }
- run: pip install agent-guardian
- run: agent-guardian scan --target ./agent.py:app --mode smart --fail-under 80 --output sarif
- uses: github/codeql-action/upload-sarif@v3
  with: { sarif_file: reports/latest/report.sarif }
Worked examples under examples/ci/.
Privacy & telemetry
No telemetry is collected. TelemetryConfig.enabled defaults to False. There is no phone-home, no analytics ping, no install tracker. Stub mode additionally requires no network access at all.
Standards mappings
Each probe is tagged with its ASI category, ATLAS technique, and CSA category. Run:
agent-guardian list-probes --by-standard owasp-asi
agent-guardian list-probes --by-standard mitre-atlas
agent-guardian list-probes --by-standard csa-agentic-rt
The honest, auto-generated coverage table lives at docs/reference/framework-coverage-matrix.md — it lists every ATLAS technique the shipped corpus actually cites, and marks zero-coverage CSA categories explicitly rather than hiding them.
Docs
- Quickstart: docs/quickstart.mdx
- Attack catalogue: docs/attacks/overview.mdx
- Adapter guides: docs/build-with/
- CLI reference: docs/reference/cli.md
- Hosted docs (preview): agentguardian.io
Project status
AgentGuardian 1.0.0 is the first stable release. Semver applies to: the public Python API, the CLI surface, the SARIF / JSON report schemas, and the probe IDs. Probe content (prompts, scoring) may evolve within a minor version.
See ROADMAP.md for what is next, CHANGELOG.md for what shipped, and governance.md for how decisions are made.
Contributing
We welcome new probes, new adapters, and new attacker classes. Start with CONTRIBUTING.md and the good first issue label.
All commits must be DCO-signed (git commit -s). The pre-commit hook will block unsigned commits.
By participating you agree to the Code of Conduct and the Ethics Policy — AgentGuardian is for testing systems you own or are authorised to test.
Community
Join us on Discord for real-time discussion — probe and adapter design, informal Q&A, and roadmap chat. For long-form questions, the full channel matrix lives at docs/community/support (GitHub Discussions is not enabled on this repo today).
Security
To report a vulnerability, see SECURITY.md. Please do not open public issues for security reports.
License
Apache-2.0. See LICENSE and NOTICE.
AgentGuardian is a trademark of Glacien Technologies — see TRADEMARKS.md for usage guidelines.
File details
Details for the file agent_guardian-1.0.0rc7.tar.gz.
File metadata
- Download URL: agent_guardian-1.0.0rc7.tar.gz
- Upload date:
- Size: 1.8 MB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 136a25ba04dd5c6eefa6dae7a7b378923cab7635c00d7f2428949a7fdfad6347 |
| MD5 | 21e4d2ef3f726955c08c0d954b9ab1d3 |
| BLAKE2b-256 | e1db184b1a4826ae6257e9236baf29900148205b3a388d05b23d674e8e067380 |
Provenance
The following attestation bundles were made for agent_guardian-1.0.0rc7.tar.gz:
Publisher: publish.yml on glacien-technologies/agent-guardian
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: agent_guardian-1.0.0rc7.tar.gz
- Subject digest: 136a25ba04dd5c6eefa6dae7a7b378923cab7635c00d7f2428949a7fdfad6347
- Sigstore transparency entry: 1714311042
- Sigstore integration time:
- Permalink: glacien-technologies/agent-guardian@cf55df44cd22367738425acb23cba11556eacafd
- Branch / Tag: refs/tags/v1.0.0rc7
- Owner: https://github.com/glacien-technologies
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@cf55df44cd22367738425acb23cba11556eacafd
- Trigger Event: push
File details
Details for the file agent_guardian-1.0.0rc7-py3-none-any.whl.
File metadata
- Download URL: agent_guardian-1.0.0rc7-py3-none-any.whl
- Upload date:
- Size: 1.5 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 86a24fd3ede13c4d48792d44aa05def62b6a387cbd095a7d98489dda124f94eb |
| MD5 | 6b7f05747d6a958762a3bc103078b375 |
| BLAKE2b-256 | cc8b8d299e6f8631cb3c34b66a5cddc830b23dfb3b380e30b23ee6f116b96209 |
Provenance
The following attestation bundles were made for agent_guardian-1.0.0rc7-py3-none-any.whl:
Publisher: publish.yml on glacien-technologies/agent-guardian
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: agent_guardian-1.0.0rc7-py3-none-any.whl
- Subject digest: 86a24fd3ede13c4d48792d44aa05def62b6a387cbd095a7d98489dda124f94eb
- Sigstore transparency entry: 1714311690
- Sigstore integration time:
- Permalink: glacien-technologies/agent-guardian@cf55df44cd22367738425acb23cba11556eacafd
- Branch / Tag: refs/tags/v1.0.0rc7
- Owner: https://github.com/glacien-technologies
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@cf55df44cd22367738425acb23cba11556eacafd
- Trigger Event: push