Open-source red teaming toolkit for AI agents, RAG systems, MCP servers, and tool-using LLM applications.

AgentGuardian

Open-source red-team testing toolkit for agentic AI systems.

96 attack probes · 11 attacker agents · OWASP ASI 2026, MITRE ATLAS v5.4.0, and CSA Agentic-RT mappings · SARIF + PDF reports · runs offline.

What it is

AgentGuardian is a testing toolkit that runs adversarial probes against your agent (LangGraph, CrewAI, OpenAI Agents SDK, AutoGen, ADK, Strands, or any HTTP endpoint) and produces an evidence bundle you can hand to security review. The bundle is not cryptographically signed in this release (see Reports & evidence).

It ships 96 attack probes organised against three public taxonomies:

OWASP Top 10 for Agentic Applications 2026 (ASI01–ASI10)
MITRE ATLAS v5.4.0 (February 2026 release)
CSA Agentic AI Red Teaming Guide (Huang et al., 2025-05-28, 12 categories)

It is deterministic in stub mode (no LLM key required), reproducible by seed, and emits SARIF + PDF + HTML + JSON.
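
To illustrate what "reproducible by seed" means in practice, here is a minimal sketch. It is not AgentGuardian's implementation: the probe names and the plan_run helper are hypothetical. It only shows that a private, seeded random generator produces the same selection and order on every run.

```python
import random

# Hypothetical probe identifiers, for illustration only.
PROBES = ["probe-a", "probe-b", "probe-c", "probe-d", "probe-e"]


def plan_run(seed: int, count: int) -> list[str]:
    """Pick `count` probes in an order fully determined by `seed`."""
    rng = random.Random(seed)  # private generator, no global state involved
    return rng.sample(PROBES, count)


first = plan_run(seed=42, count=3)
second = plan_run(seed=42, count=3)
print(first == second)  # True: same seed, same plan
```

Using a dedicated random.Random instance rather than the module-level functions keeps the plan independent of any other code that touches the global generator.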

Install

pip install agent-guardian

Requires Python 3.10+. Apache-2.0 licensed.

How it compares

Tool	Multi-agent swarm	Agentic-AI focus	Standards alignment	License
PyRIT	no	no	NIST AI RMF (partial)	MIT
garak	no	no	own taxonomy	Apache-2.0
Promptfoo	no	no	OWASP LLM Top 10 + ATLAS + EU AI Act	MIT
Inspect	no	no	own taxonomy	MIT
DeepTeam	no	no	OWASP LLM Top 10	Apache-2.0
AgentGuardian	yes	yes	OWASP ASI 2026 + MITRE ATLAS v5.4.0 + CSA	Apache-2.0

Coverage by OWASP ASI 2026 category

All 96 attack probes are distributed across the ten OWASP ASI 2026 categories below. Each finding is triple-tagged with its ASI, MITRE ATLAS, and CSA Agentic-RT identifiers.

ASI01 — Memory Poisoning
ASI02 — Tool Misuse
ASI03 — Privilege Compromise
ASI04 — Supply Chain
ASI05 — Code Execution
ASI06 — Intent Breaking & Goal Manipulation
ASI07 — Agent-to-Agent Compromise
ASI08 — Cascading Failures
ASI09 — Trust Exploitation
ASI10 — Rogue Agents (drift)
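
As a rough picture of triple-tagging, the sketch below models a finding with one identifier per taxonomy and groups findings by ASI category. The Finding class, the probe IDs, and the ATLAS and CSA values are all hypothetical placeholders, not the tool's schema or real technique IDs.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    probe_id: str
    asi: str    # OWASP ASI 2026 category, e.g. "ASI02"
    atlas: str  # MITRE ATLAS technique ID (placeholder values below)
    csa: str    # CSA Agentic-RT category (placeholder values below)


def group_by_asi(findings: list[Finding]) -> dict[str, list[str]]:
    """Return probe IDs grouped under their ASI category."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for finding in findings:
        grouped[finding.asi].append(finding.probe_id)
    return dict(grouped)


# Placeholder data: none of these identifiers come from the real catalogue.
SAMPLE = [
    Finding("probe-001", "ASI02", "ATLAS-PLACEHOLDER-1", "CSA-PLACEHOLDER-1"),
    Finding("probe-002", "ASI05", "ATLAS-PLACEHOLDER-2", "CSA-PLACEHOLDER-2"),
    Finding("probe-003", "ASI02", "ATLAS-PLACEHOLDER-3", "CSA-PLACEHOLDER-1"),
]

print(group_by_asi(SAMPLE))
```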

Enumerate locally with agent-guardian list-probes; full catalogue at agentguardian.io/attacks/overview.

60-second quickstart

# 1. Sanity check the install (no API key, no network)
agent-guardian doctor

# 2. List the 96 shipped probes
agent-guardian list-probes

# 3. Run an offline scan against the built-in stub target
agent-guardian scan --target stub --mode fast --llm stub

# 4. Open the HTML report (open is macOS; on Linux use xdg-open)
open reports/latest/report.html

Stub mode requires no LLM API key, no network, no environment variables — it uses canned deterministic responses so you can verify the toolchain end-to-end before pointing it at a real target.

Scan a real agent

# Against an HTTP endpoint (any framework, any language)
agent-guardian scan \
  --target http://localhost:8000/chat \
  --framework http \
  --mode smart \
  --llm openai \
  --fail-under 80

# Against a LangGraph app
agent-guardian scan \
  --target ./my_graph.py:graph \
  --framework langgraph \
  --mode full

Exit code is non-zero if the posture score falls below --fail-under, so the same command works inside CI.
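
The gating rule can be pictured with a small sketch. The exit_code_for function is hypothetical, and the value 1 is an assumption: the text only promises a non-zero exit code when the score falls below the threshold.

```python
def exit_code_for(score: float, fail_under: float) -> int:
    """Return 0 when the score meets the threshold, 1 otherwise.

    Assumption: "falls below" is read as strictly less than, so a score
    equal to the threshold passes. The value 1 is illustrative.
    """
    return 1 if score < fail_under else 0


print(exit_code_for(79.9, 80))  # 1: below the threshold, CI step fails
print(exit_code_for(80.0, 80))  # 0: at the threshold, CI step passes
```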

Scan modes

Mode	What it runs	Typical wall time
`fast`	High-signal probe subset, single attacker per family	~2 min
`smart`	Curated coverage with adaptive attacker selection	~10 min
`full`	Every probe, every applicable attacker, full mutation set	30+ min

Default mode is full. Pick fast for pre-commit / PR checks, smart for nightly runs.
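
One way to apply this guidance in a shared CI script is to derive the mode from the trigger. The sketch below is an assumption about how you might wire it, not part of the tool: pick_mode is hypothetical, while GITHUB_EVENT_NAME is the environment variable GitHub Actions sets for each workflow run.

```python
import os


def pick_mode(event_name: str) -> str:
    """Map a CI trigger to a scan mode, following the guidance above."""
    if event_name == "pull_request":
        return "fast"   # quick PR check
    if event_name == "schedule":
        return "smart"  # nightly run
    return "full"       # the documented default


# Outside GitHub Actions the variable is unset, so the default applies.
mode = pick_mode(os.environ.get("GITHUB_EVENT_NAME", ""))
print(f"--mode {mode}")
```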

Framework adapters

Shipped first-class adapters (pluggable via --framework):

langgraph — LangGraph state graphs
crewai — CrewAI crews
openai-agents — OpenAI Agents SDK
autogen — Microsoft AutoGen
adk — Google ADK
strands — AWS Strands
http — any HTTP/JSON endpoint (works for FastAPI, Flask, Express, anything)

MCP servers and RAG pipelines are covered via the http adapter and worked examples under examples/ (examples/mcp_server, examples/rag_app, examples/fastapi_chatbot).
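
For a sense of what a minimal HTTP/JSON target could look like, here is a standard-library sketch that serves /chat on port 8000, matching the URL in the scan example. The request and response shape ("message" in, "reply" out) is an assumption made for illustration; check the adapter guide for the schema the http adapter actually expects.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def handle_chat(payload: dict) -> dict:
    """Toy agent logic: echo the user's message back.

    Assumption: requests carry a "message" field, responses a "reply" field.
    """
    message = str(payload.get("message", ""))
    return {"reply": f"You said: {message}"}


class ChatHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        if self.path != "/chat":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
        except ValueError:  # covers malformed JSON and bad encodings
            self.send_error(400, "invalid JSON")
            return
        if not isinstance(payload, dict):
            self.send_error(400, "expected a JSON object")
            return
        body = json.dumps(handle_chat(payload)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(host: str = "127.0.0.1", port: int = 8000) -> None:
    """Block forever, serving POST /chat on the given address."""
    HTTPServer((host, port), ChatHandler).serve_forever()
```

Saved as, say, toy_target.py (a hypothetical file name), it can be started with python -c "import toy_target; toy_target.serve()" and then used as the --target URL.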

Attacker swarm

The core swarm contains 11 attacker agents, each scoped to a distinct family of agent-stack failure modes:

Agent	Targets
`recon-agent`	Surface mapping, tool discovery
`goal-hijack-agent`	Goal redirection, system-prompt override
`tool-abuse-agent`	Tool misuse, argument injection
`privilege-agent`	Privilege escalation, role confusion
`supply-chain-agent`	Tool/model/data supply-chain attacks
`code-exec-agent`	Sandbox escape, code execution
`memory-poison-agent`	Long-term memory poisoning
`a2a-agent`	Agent-to-agent trust exploits
`cascade-agent`	Cascading hallucination / cross-agent contagion
`trust-exploit-agent`	Operator/system trust boundary abuse
`drift-agent`	Behavioural drift, policy erosion over conversation

Additional specialist classes (FuzzingAgent, OutputHandlingAgent, DenialOfWalletAgent, DetectionEvasionAgent, SecretExtractionAgent, IdentityLeakAgent, CriticAgent) ship as building blocks for custom swarms and are documented under agentguardian.io/attackers.

Reports & evidence

Every scan produces a timestamped bundle under reports/<run-id>/:

report.html — interactive dashboard, drillable per-probe
report.pdf — print-ready evidence (ReportLab)
report.sarif — SARIF 2.1.0 for GitHub Code Scanning / Defender / Snyk ingest
report.json — full machine-readable record
evidence/ — per-probe transcripts, prompts, responses, and verdicts
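
Because report.sarif follows SARIF 2.1.0, it can be post-processed with a few lines of standard-library code. The sketch below counts results per level using only the fields the SARIF format defines (runs, results, level); count_by_level is a hypothetical helper, and nothing here depends on AgentGuardian-specific properties.

```python
import json
from collections import Counter
from pathlib import Path


def count_by_level(sarif: dict) -> Counter:
    """Count SARIF results per level across all runs."""
    counts: Counter = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            # Simplification: a result without an explicit level is
            # counted as "warning".
            counts[result.get("level", "warning")] += 1
    return counts


def count_file(path: Path) -> Counter:
    """Load a SARIF file from disk and count its results per level."""
    return count_by_level(json.loads(path.read_text(encoding="utf-8")))
```

For example, count_file(Path("reports/latest/report.sarif")) would summarise the most recent scan, assuming that path exists.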

A sample HTML report lives at docs/_assets/sample-report.html.

Signing: the sign_evidence config flag exists, but Sigstore-backed signing of the evidence bundle is planned and not shipped in 1.0.0. Until then the bundle ships unsigned; the SARIF, JSON, and PDF outputs are deterministic and hash-stable, so they can be signed externally.
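
Until built-in signing lands, one option is to hash the bundle yourself and sign the resulting manifest with whatever tool your organisation already uses. The following is a minimal sketch under that assumption: sha256_of and build_manifest are hypothetical helpers, not part of AgentGuardian.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(bundle_dir: Path) -> dict[str, str]:
    """Map each file's bundle-relative path to its SHA-256 digest."""
    return {
        path.relative_to(bundle_dir).as_posix(): sha256_of(path)
        for path in sorted(bundle_dir.rglob("*"))
        if path.is_file()
    }
```

Calling build_manifest(Path("reports/latest")) and writing the result to a JSON file gives a single artefact to sign; anyone holding the bundle can rebuild the manifest and compare digests.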

Local dashboard

agent-guardian serve
# → http://localhost:7474

Browse historical runs, diff posture scores across releases, and download evidence bundles.

CI integration

# .github/workflows/agent-guardian.yml
- uses: actions/setup-python@v5
  with: { python-version: '3.11' }
- run: pip install agent-guardian
- run: agent-guardian scan --target ./agent.py:app --mode smart --fail-under 80 --output sarif
- uses: github/codeql-action/upload-sarif@v3
  with: { sarif_file: reports/latest/report.sarif }

Worked examples under examples/ci/.

Privacy & telemetry

No telemetry is collected. TelemetryConfig.enabled defaults to False. There is no phone-home, no analytics ping, no install tracker. Stub mode additionally requires no network access at all.

Standards mappings

Each probe is tagged with its ASI category, ATLAS technique, and CSA category. Run:

agent-guardian list-probes --by-standard owasp-asi
agent-guardian list-probes --by-standard mitre-atlas
agent-guardian list-probes --by-standard csa-agentic-rt

Full mapping tables: agentguardian.io/standards.

Docs

Quickstart: agentguardian.io/quickstart
Attack catalogue: agentguardian.io/attacks/overview
Adapter guides: agentguardian.io/adapters
CLI reference: agentguardian.io/cli

Project status

AgentGuardian 1.0.0 is the first stable release. Semver applies to: the public Python API, the CLI surface, the SARIF / JSON report schemas, and the probe IDs. Probe content (prompts, scoring) may evolve within a minor version.

See ROADMAP.md for what is next, CHANGELOG.md for what shipped, and governance.md for how decisions are made.

Contributing

We welcome new probes, new adapters, and new attacker classes. Start with CONTRIBUTING.md and the good first issue label.

All commits must carry a DCO sign-off (git commit -s). The pre-commit hook will block commits that lack one.

By participating you agree to the Code of Conduct and the Ethics Policy — AgentGuardian is for testing systems you own or are authorised to test.

Security

To report a vulnerability, see SECURITY.md. Please do not open public issues for security reports.

License

Apache-2.0. See LICENSE and NOTICE.

AgentGuardian is a trademark of Glacien Technologies — see TRADEMARKS.md for usage guidelines.

Release history

1.0.0rc8 pre-release, Jun 5, 2026
1.0.0rc7 pre-release, Jun 4, 2026
1.0.0rc6 pre-release, Jun 4, 2026
1.0.0rc5 pre-release, Jun 3, 2026
1.0.0rc4 pre-release, Jun 3, 2026
1.0.0rc3 pre-release, Jun 2, 2026
1.0.0rc2 pre-release, Jun 1, 2026 (this version)
1.0.0rc1 pre-release, May 27, 2026

Download files

Source Distribution

agent_guardian-1.0.0rc2.tar.gz (1.6 MB)

Uploaded Jun 1, 2026 Source

Built Distribution

agent_guardian-1.0.0rc2-py3-none-any.whl (1.3 MB)

Uploaded Jun 1, 2026 Python 3

File details

Details for the file agent_guardian-1.0.0rc2.tar.gz.

File metadata

Download URL: agent_guardian-1.0.0rc2.tar.gz
Upload date: Jun 1, 2026
Size: 1.6 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.13

File hashes

Hashes for agent_guardian-1.0.0rc2.tar.gz
Algorithm	Hash digest
SHA256	`e52e4dbca8a284c8675db6c0f5c5dc129caf8ed0260f6805ac6a383677940cbf`
MD5	`42f8065ada437a8fad76e29c5b59e923`
BLAKE2b-256	`06b79754684ea1b8626b1bdb0f41a15e4b9507bd24294827c6ba24a3d60bfa2c`

File details

Details for the file agent_guardian-1.0.0rc2-py3-none-any.whl.

File metadata

Download URL: agent_guardian-1.0.0rc2-py3-none-any.whl
Upload date: Jun 1, 2026
Size: 1.3 MB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.13

File hashes

Hashes for agent_guardian-1.0.0rc2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`15da5cce8bc103f68d4be8636e8d69df02aa4756d46d801868436639ec6e26bf`
MD5	`f98066deddcf4c858a1b809b6bcdae1e`
BLAKE2b-256	`54bb42cf70681a9065d8dff20b27a3d7b8e0f330a21c2b8a8588410afd2f49fc`
