Open-source red teaming toolkit for AI agents, RAG systems, MCP servers, and tool-using LLM applications.
AgentGuardian
Red-team your AI agents before attackers do.
Docs · Quickstart · Try the demo agent · Attack library · CI/CD · Sample report
AgentGuardian is an open-source red-teaming toolkit for AI agents. It scans your agent, maps the attack surface, runs the relevant adversarial agents, and generates evidence-backed findings, so you can review and fix vulnerabilities before they reach production.
▶ Watch the demo to see how AgentGuardian finds vulnerabilities in a live scan.
Getting started
1. Install
pip install agent-guardian
or
uv tool install agent-guardian
2. Configure a model provider
AgentGuardian drives its attacks with an LLM. Export a key for your provider — Gemini, OpenAI, or Anthropic:
export GEMINI_API_KEY=... # or OPENAI_API_KEY / ANTHROPIC_API_KEY
For every supported provider and the full set of configuration options, see the configuration guide.
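As a quick pre-flight check before a scan, the sketch below reports which of the three key variables named above are exported in the current shell. It is an illustration only: the provider labels are chosen here for readability, and the check that ships with the tool is `agent-guardian doctor`.

```python
import os
from typing import Mapping, Optional

# Environment variable names come from the step above; the provider labels
# on the left are illustrative and not part of AgentGuardian's API.
PROVIDER_KEYS = {
    "gemini": "GEMINI_API_KEY",
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
}


def configured_providers(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the providers whose API key variable is set and non-empty."""
    env = os.environ if env is None else env
    return [name for name, var in PROVIDER_KEYS.items() if env.get(var, "").strip()]


if __name__ == "__main__":
    found = configured_providers()
    if found:
        print("Provider keys found for:", ", ".join(found))
    else:
        print("No provider key exported; set one of:", ", ".join(PROVIDER_KEYS.values()))
```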
3. Scan an agent
No agent of your own yet? Point it at the hosted demo target — a deliberately vulnerable "finbot" banking agent:
agent-guardian scan \
--endpoint https://agent-guardian-testbench-u6tm6gzysq-uc.a.run.app/finbot/chat \
--model gemini:gemini-3.5-flash \
--mode fast \
--output pdf --output-path report.pdf
To scan your own agent instead, replace the demo --endpoint with your own target: a hosted URL, a --system-prompt file, or a --framework reference (see What you can scan).
4. Review the findings
AgentGuardian opens a live dashboard while it runs (http://127.0.0.1:7474) and writes an evidence bundle — findings, transcripts, and your PDF report — under ~/.agentguardian/scans/<scan-id>/.
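To locate the newest evidence bundle from a script, a minimal sketch follows. It assumes each scan is stored as its own sub-directory of the path given above and that the newest one has the latest modification time; neither detail is confirmed by this README, and `agent-guardian scans list` is the supported way to enumerate scans.

```python
from pathlib import Path
from typing import Optional

# Default location stated above; one sub-directory per scan id is an assumption.
SCANS_ROOT = Path.home() / ".agentguardian" / "scans"


def latest_scan_dir(root: Path = SCANS_ROOT) -> Optional[Path]:
    """Return the most recently modified scan directory, or None if there is none."""
    if not root.is_dir():
        return None
    scan_dirs = [p for p in root.iterdir() if p.is_dir()]
    return max(scan_dirs, key=lambda p: p.stat().st_mtime, default=None)


if __name__ == "__main__":
    latest = latest_scan_dir()
    if latest is None:
        print(f"No scans found under {SCANS_ROOT}")
    else:
        print(f"Latest bundle: {latest}")
        for entry in sorted(latest.iterdir()):
            print("  ", entry.name)
```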
What you can scan
Scan an HTTP agent
agent-guardian scan \
--endpoint http://localhost:8000/chat \
--model gemini:gemini-3.5-flash \
--mode smart
Scan a system prompt
agent-guardian scan \
--system-prompt ./prompts/customer-support-agent.txt \
--model gemini:gemini-3.5-flash \
--mode fast
Scan a LangGraph agent
agent-guardian scan \
--framework langgraph \
--framework-ref my_app.graph:graph \
--model gemini:gemini-3.5-flash \
--mode smart
Scan a Google ADK agent
agent-guardian scan \
--framework adk \
--framework-ref my_agent.agent:root_agent \
--model gemini:gemini-3.5-flash \
--mode smart
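The --framework-ref values above look like the `module:attribute` convention used for Python entry points. Assuming AgentGuardian resolves them in that way (the README does not spell this out), the sketch below shows how such a reference maps to an object. It can help confirm that a reference is importable from your working directory before you start a scan.

```python
import importlib
from typing import Any


def resolve_ref(ref: str) -> Any:
    """Resolve a "package.module:attribute" reference to the object it names."""
    module_name, sep, attr_path = ref.partition(":")
    if not sep or not module_name or not attr_path:
        raise ValueError(f"expected 'module:attribute', got {ref!r}")
    obj = importlib.import_module(module_name)
    for part in attr_path.split("."):  # allow dotted attributes after the colon
        obj = getattr(obj, part)
    return obj


if __name__ == "__main__":
    # Standard-library stand-in for a reference such as my_app.graph:graph.
    print(resolve_ref("json:dumps"))
```

An ImportError or AttributeError here usually means the module is not on the Python path or the attribute name is misspelled.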
For more targets (MCP, OpenAPI, WebSocket, browser flows), see the target adapters guide.
What AgentGuardian catches
AgentGuardian tests agentic risks that normal prompt scanners miss:
- Prompt injection and goal hijack
- Unsafe tool calls and tool chaining
- Privilege abuse
- RAG poisoning and indirect prompt injection
- Memory and context poisoning
- Sensitive data leakage
- Agent-to-agent manipulation
- Cascading failures
- Trust exploitation and unsafe outputs
- Goal drift and untraceable behavior
Reports and evidence
Every scan writes a local evidence bundle under ~/.agentguardian/scans/<scan-id>/:
- scan.json: machine-readable findings, signed (HMAC-SHA256 + Ed25519)
- events.jsonl: the scan timeline
- probe/: per-probe requests, responses, verdicts, and evidence
- forensic_manifest.json: integrity manifest for the bundle
- a live local dashboard: browse findings, transcripts, and exports
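For readers unfamiliar with HMAC-SHA256, the generic sketch below shows what a keyed integrity tag provides: any change to the signed bytes makes verification fail. It does not reproduce AgentGuardian's signing scheme, key handling, or file layout, none of which are described here, and it leaves out the Ed25519 half because that needs a third-party library. The key and payload are placeholders.

```python
import hashlib
import hmac


def sign(payload: bytes, key: bytes) -> str:
    """Return the hex HMAC-SHA256 tag for payload."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify(payload: bytes, key: bytes, tag_hex: str) -> bool:
    """Check a tag in constant time; any change to payload or key fails."""
    return hmac.compare_digest(sign(payload, key), tag_hex)


if __name__ == "__main__":
    key = b"demo-key-not-a-real-secret"    # hypothetical key
    report = b'{"findings": []}'           # stand-in for the bytes of scan.json
    tag = sign(report, key)
    print("untouched payload verifies:", verify(report, key, tag))
    print("tampered payload verifies:", verify(report + b" ", key, tag))
```

To check a real bundle, use `agent-guardian verify` rather than a hand-rolled script.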
Generate shareable or CI-ready reports in any format on demand:
agent-guardian report SCAN_ID --output sarif --output-path scan.sarif # GitHub Security
agent-guardian report SCAN_ID --output md # Markdown
agent-guardian report SCAN_ID --output pdf --output-path report.pdf # shareable PDF
Formats: json · sarif · junit · md · gitlab · pdf. Stored evidence can be verified with agent-guardian verify.
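SARIF is a standard JSON format, so the exported file can be summarised with a few lines of Python. The sketch below counts results per `level` across all runs, relying only on the standard SARIF 2.1.0 layout (`runs[].results[].level`). Which levels AgentGuardian emits is not stated here, and counting a missing level as "warning" is a simplification.

```python
import json
from collections import Counter
from pathlib import Path


def count_by_level(sarif: dict) -> Counter:
    """Count SARIF results per level across all runs."""
    counts: Counter = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            # Simplification: a result without an explicit level counts as "warning".
            counts[result.get("level", "warning")] += 1
    return counts


if __name__ == "__main__":
    path = Path("scan.sarif")  # the --output-path used in the command above
    if path.exists():
        for level, n in count_by_level(json.loads(path.read_text())).most_common():
            print(f"{level}: {n}")
    else:
        print(f"{path} not found; generate it with the report command first")
```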
How it works
Every scan follows the same loop:
Target → surface mapping → adversarial agents → AIVSS-scored findings → evidence bundle
For the full workflow, see how AgentGuardian works.
Scan modes
- fast: quick local feedback
- smart: broader coverage for development and pull requests
- full: release gates and audit evidence
Use full when you need AIVSS-scored findings for CI/CD gates.
Commands
| Command | What it does |
|---|---|
| agent-guardian scan | Run an adversarial swarm scan against a target |
| agent-guardian report <id> --output FMT | Regenerate a report: json · sarif · junit · md · gitlab · pdf · badge |
| agent-guardian gate <id> --fail-under N | Apply pass/fail thresholds to a stored scan (CI exit codes) |
| agent-guardian serve | Start the local dashboard |
| agent-guardian scans list / delete | List or delete stored scans (delete --older-than 30d for bulk cleanup) |
| agent-guardian config show / init | Inspect the effective config / scaffold a config file |
| agent-guardian verify <report> | Verify the HMAC-SHA256 + Ed25519 signatures on a report |
| agent-guardian last-score | Print the AIVSS of the most recent scan |
| agent-guardian doctor | Verify the install, provider keys, and prerequisites |
| agent-guardian telemetry status | Manage opt-in telemetry (enable / disable) |
| agent-guardian version | Print the installed version |
Run any command with --help for its full options, or see the CLI reference.
CI/CD with GitHub Actions
The shipped composite action runs a scan, uploads SARIF to GitHub Code Scanning, and (optionally) posts a summary comment on the pull request:
name: AgentGuardian
on:
  pull_request:
  push:
    branches: [main]
jobs:
  red-team:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      security-events: write # upload SARIF to Code Scanning
      pull-requests: write # post the summary comment
    steps:
      - uses: actions/checkout@v4
      - uses: glacien-technologies/agent-guardian/.github/actions/agentguardian-scan@v1
        with:
          endpoint: http://localhost:8000/chat
          model: gemini:gemini-3.5-flash
          mode: full
          fail-under: "80"
          max-critical: "0"
          comment: "true"
        env:
          GEMINI_API_KEY: ${{ secrets.GEMINI_API_KEY }}
The job fails when the gate (fail-under / max-critical) is breached. For GitLab, Bitbucket, raw-CLI, and fleet/nightly setups, see the CI/CD guides — including the parallel suites guide for scanning many agents from one file.
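The gate can be pictured as two independent threshold checks. The sketch below is a reading of the option names, not the tool's implementation: it assumes a scan has a single numeric score where higher is better (so that falling under fail-under is a failure) and a count of critical findings. `ScanSummary` and its fields are hypothetical, and the exit codes the real command uses are not documented in this README.

```python
from dataclasses import dataclass


@dataclass
class ScanSummary:
    score: float          # hypothetical overall score for the scan
    critical_count: int   # hypothetical number of critical findings


def gate_exit_code(summary: ScanSummary, fail_under: float, max_critical: int) -> int:
    """Return 0 when both thresholds hold, 1 when either is breached."""
    if summary.score < fail_under:
        return 1
    if summary.critical_count > max_critical:
        return 1
    return 0


if __name__ == "__main__":
    # Thresholds mirror the workflow above: fail-under "80", max-critical "0".
    print(gate_exit_code(ScanSummary(score=91.0, critical_count=0), 80, 0))
    print(gate_exit_code(ScanSummary(score=91.0, critical_count=1), 80, 0))
    print(gate_exit_code(ScanSummary(score=72.5, critical_count=0), 80, 0))
```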
Standards and coverage
AgentGuardian maps its shipped probes to:
- OWASP Top 10 for Agentic Applications
- MITRE ATLAS
- CSA Agentic AI Red Teaming Guide
The exact agents and probes that ran against your target are enumerated in every scan report (coverage in scan.json). The full probe-to-standard mapping lives in the OWASP mapping and the framework coverage matrix.
Privacy & telemetry
Telemetry is opt-in and disabled by default. Out of the box AgentGuardian sends nothing — no analytics ping, no install ping, no scan counts. Telemetry only activates after you explicitly opt in. Once enabled, it sends anonymous operational counts (agents dispatched, attempts, findings) plus a locally generated, anonymous install id (a random UUID stored at ~/.agentguardian/install_id, with no link to your identity).
Manage it any time:
agent-guardian telemetry status # show current state
agent-guardian telemetry enable # opt in
agent-guardian telemetry disable # opt out
AgentGuardian never collects prompts, agent responses, target URLs, headers, secrets, API keys, transcripts, reports, evidence files, tool inputs or outputs, or customer data.
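To show why a locally generated random UUID carries no identity, here is a minimal sketch of a load-or-create install id. It is not AgentGuardian's code. The function name and the choice of a version 4 UUID are assumptions, and the demo writes to a temporary directory so the real ~/.agentguardian/install_id is left alone.

```python
import uuid
from pathlib import Path


def load_or_create_install_id(path: Path) -> str:
    """Return the UUID stored at path, creating a random one on first use."""
    if path.is_file():
        stored = path.read_text().strip()
        try:
            return str(uuid.UUID(stored))
        except ValueError:
            pass  # unreadable content: fall through and regenerate
    new_id = str(uuid.uuid4())  # random (version 4), derived from no user or host data
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(new_id + "\n")
    return new_id


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "install_id"
        first = load_or_create_install_id(target)
        print("stable across calls:", first == load_or_create_install_id(target))
```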
Run from source
To run AgentGuardian from a source checkout instead of the published package:
# clone
git clone https://github.com/glacien-technologies/agent-guardian.git
cd agent-guardian
# virtual environment
python3.11 -m venv .venv
source .venv/bin/activate
# install the checkout in editable mode
pip install -e ".[dev]"
# run it from source
agent-guardian doctor
agent-guardian scan \
--endpoint http://localhost:8000/chat \
--model gemini:gemini-3.5-flash \
--mode fast
For contribution guidelines, see the contribution guide.
Contributing
We welcome new probes, new adapters, and new attacker logic. Start with the contribution guide and the good first issue label.
All commits must be DCO-signed:
git commit -s
By participating you agree to CODE_OF_CONDUCT.md and the ethics policy. AgentGuardian is for testing systems you own or are explicitly authorised to test.
Community
Join us on Discord for quickstart help, probe design, adapter questions, and roadmap discussion. For longer-form support channels, see the support guide.
Security
To report a vulnerability, see SECURITY.md. Do not open public issues for security reports.
License
Apache-2.0. See LICENSE and NOTICE.
AgentGuardian is a trademark of Glacien Technologies. See TRADEMARKS.md for usage guidelines.
File details
Details for the file agent_guardian-1.0.0rc11.tar.gz.
File metadata
- Download URL: agent_guardian-1.0.0rc11.tar.gz
- Upload date:
- Size: 2.1 MB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c936f51f2cc6872c4e458a25c6016333c9696dec362432d27eb1f649db6368fd |
| MD5 | 3cf93bb0b892173abb630abfa4e8b679 |
| BLAKE2b-256 | 42d0083cc9b1a122e37ddb2f5f9ef579f3572ab7bd7c219760e3537c58c89b02 |
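A downloaded archive can be checked against the SHA256 digest listed above with the standard library alone. The sketch assumes the source distribution has been saved under its published file name in the current directory; the helper names are illustrative.

```python
import hashlib
import hmac
from pathlib import Path

# SHA256 digest listed above for agent_guardian-1.0.0rc11.tar.gz.
EXPECTED_SHA256 = "c936f51f2cc6872c4e458a25c6016333c9696dec362432d27eb1f649db6368fd"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, expected_hex: str) -> bool:
    """Compare the file's digest with the expected hex string, ignoring case."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())


if __name__ == "__main__":
    archive = Path("agent_guardian-1.0.0rc11.tar.gz")  # assumed local download
    if archive.exists():
        print("digest matches:", matches(archive, EXPECTED_SHA256))
    else:
        print(f"{archive} not found; download it first")
```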
Provenance
The following attestation bundles were made for agent_guardian-1.0.0rc11.tar.gz:
Publisher: publish.yml on glacien-technologies/agent-guardian
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: agent_guardian-1.0.0rc11.tar.gz
  - Subject digest: c936f51f2cc6872c4e458a25c6016333c9696dec362432d27eb1f649db6368fd
- Sigstore transparency entry: 1760644628
- Sigstore integration time:
- Permalink: glacien-technologies/agent-guardian@21d70afe856b855a1224eef0b0b664f9a8395c7e
- Branch / Tag: refs/tags/v1.0.0rc11
- Owner: https://github.com/glacien-technologies
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@21d70afe856b855a1224eef0b0b664f9a8395c7e
- Trigger Event: push
File details
Details for the file agent_guardian-1.0.0rc11-py3-none-any.whl.
File metadata
- Download URL: agent_guardian-1.0.0rc11-py3-none-any.whl
- Upload date:
- Size: 1.7 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1136c81337710e37211802e61a69e81e95b9c3a0f73d40dcd8983560df38128c |
| MD5 | 0f230c78413da94cc516ddc8501e0c35 |
| BLAKE2b-256 | bd755e529cc77e46a4bce769bd110a91eca07c937c860687328b3c69e65df733 |
Provenance
The following attestation bundles were made for agent_guardian-1.0.0rc11-py3-none-any.whl:
Publisher: publish.yml on glacien-technologies/agent-guardian
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: agent_guardian-1.0.0rc11-py3-none-any.whl
  - Subject digest: 1136c81337710e37211802e61a69e81e95b9c3a0f73d40dcd8983560df38128c
- Sigstore transparency entry: 1760644824
- Sigstore integration time:
- Permalink: glacien-technologies/agent-guardian@21d70afe856b855a1224eef0b0b664f9a8395c7e
- Branch / Tag: refs/tags/v1.0.0rc11
- Owner: https://github.com/glacien-technologies
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@21d70afe856b855a1224eef0b0b664f9a8395c7e
- Trigger Event: push