Open-source red teaming toolkit for AI agents, RAG systems, MCP servers, and tool-using LLM applications.

AgentGuardian

Red-team your AI agents before attackers do.

Docs · Quickstart · Try the demo agent · Attack library · CI/CD · Sample report


AgentGuardian is an open-source red-teaming toolkit for AI agents. It scans your agent, maps the attack surface, runs the relevant adversarial agents, and generates evidence-backed findings so you can review and fix vulnerabilities before they reach production.

Figure: AgentGuardian recon, OWASP ASI probe generation, findings, reports, and fix-rerun loop.

Watch the demo to see how AgentGuardian finds vulnerabilities in a live scan.

Getting started

1. Install

pip install agent-guardian

or

uv tool install agent-guardian

2. Configure a model provider

AgentGuardian drives its attacks with an LLM. Export a key for your provider — Gemini, OpenAI, or Anthropic:

export GEMINI_API_KEY=...        # or OPENAI_API_KEY / ANTHROPIC_API_KEY

For every supported provider and the full set of configuration options, see the configuration guide.
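Before a scan, it can help to confirm which provider key is actually exported in the current shell. The following is a minimal Python sketch, not part of AgentGuardian: the three variable names come from the step above, while the `PROVIDER_KEYS` mapping and the `configured_providers` helper are hypothetical. The tool's own check is `agent-guardian doctor`.

```python
import os

# Variable names are the ones listed in this README; the provider labels
# and this helper are illustrative assumptions, not AgentGuardian code.
PROVIDER_KEYS = {
    "gemini": "GEMINI_API_KEY",
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
}


def configured_providers(env=None):
    """Return the providers whose API key variable is set and non-empty."""
    env = os.environ if env is None else env
    return [name for name, var in PROVIDER_KEYS.items() if env.get(var)]


found = configured_providers()
print("configured providers:", ", ".join(found) or "none")
```

An empty result means none of the three variables is visible to the process, which is the usual cause when an export was made in a different shell session.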

3. Scan an agent

No agent of your own yet? Point it at the hosted demo target — a deliberately vulnerable "finbot" banking agent:

agent-guardian scan \
  --endpoint https://agent-guardian-testbench-u6tm6gzysq-uc.a.run.app/finbot/chat \
  --model gemini:gemini-3.5-flash \
  --mode fast \
  --output pdf --output-path report.pdf

To scan your own agent instead, replace the --endpoint value with your agent's URL, or use another target type such as a --system-prompt file or an in-process Python object (see What you can scan).

4. Review the findings

AgentGuardian opens a live dashboard while it runs (http://127.0.0.1:7474) and writes an evidence bundle — findings, transcripts, and your PDF report — under ~/.agentguardian/scans/<scan-id>/.

What you can scan

Scan an HTTP agent

agent-guardian scan \
  --endpoint http://localhost:8000/chat \
  --model gemini:gemini-3.5-flash \
  --mode smart
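
If you have nothing listening on http://localhost:8000/chat yet, a throwaway target can be sketched with the Python standard library. This is only an illustration: the request and response schema (`{"message": ...}` in, `{"response": ...}` out) is an assumption, and `reply_to`, `ChatHandler`, and `make_server` are hypothetical names. Check the documentation for the payload shape AgentGuardian actually sends.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def reply_to(message: str) -> str:
    """Placeholder agent logic; replace with a call into your real agent."""
    return f"You said: {message}"


class ChatHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/chat":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            self.send_error(400, "invalid JSON")
            return
        if not isinstance(payload, dict):
            self.send_error(400, "expected a JSON object")
            return
        # Assumed schema: {"message": "..."} in, {"response": "..."} out.
        answer = reply_to(str(payload.get("message", "")))
        body = json.dumps({"response": answer}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the console quiet during a scan


def make_server(host="127.0.0.1", port=8000):
    """Build (but do not start) the server for the /chat endpoint."""
    return ThreadingHTTPServer((host, port), ChatHandler)


# To serve on http://localhost:8000/chat, run:
#     make_server().serve_forever()
```

Binding to 127.0.0.1 keeps the toy target off the network, which matters because it has no authentication.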

Scan a system prompt

agent-guardian scan \
  --system-prompt ./prompts/customer-support-agent.txt \
  --model gemini:gemini-3.5-flash \
  --mode fast

Scan an in-process agent

agent-guardian scan my_app.agent:agent \
  --model gemini:gemini-3.5-flash \
  --mode smart

Point AgentGuardian at any importable Python callable or agent object (module:attr) and it runs in-process — useful for pre-deploy and CI, with nothing to host.
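
The module:attr form follows the common Python convention of naming an importable module, then an attribute inside it. The sketch below shows how such a string is conventionally resolved with `importlib`. It is not AgentGuardian's loader, and `resolve_target` is a hypothetical helper. It can be useful for checking that your target string imports cleanly before you start a scan.

```python
import importlib


def resolve_target(spec: str):
    """Resolve a 'module:attr' string to the Python object it names."""
    module_name, sep, attr_path = spec.partition(":")
    if not sep or not module_name or not attr_path:
        raise ValueError(f"expected 'module:attr', got {spec!r}")
    obj = importlib.import_module(module_name)
    # Allow dotted attributes such as 'pkg.mod:factory.agent'.
    for part in attr_path.split("."):
        obj = getattr(obj, part)
    return obj


# Standard-library example so the sketch runs anywhere.
join = resolve_target("os.path:join")
print(join("prompts", "agent.txt"))
```

For the command above, `resolve_target("my_app.agent:agent")` would need `my_app` to be importable from the directory where the scan is started.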

Roadmap — white-box agentic detection. Today's scans are black-box: AgentGuardian drives the agent adversarially and detects compromise from what is observable (the response, returned data, and any tool calls the API exposes) across the full OWASP ASI taxonomy. Framework-native modes (LangGraph, CrewAI, AutoGen, OpenAI Agents, ADK, Strands) and OpenTelemetry trace correlation are in progress — they will read the agent's own tool/sub-agent traces to catch internal tool-misuse a clean reply can hide. Follow #126.

What AgentGuardian catches

AgentGuardian tests agentic risks that normal prompt scanners miss:

  • Prompt injection and goal hijack
  • Unsafe tool calls and tool chaining
  • Privilege abuse
  • RAG poisoning and indirect prompt injection
  • Memory and context poisoning
  • Sensitive data leakage
  • Agent-to-agent manipulation
  • Cascading failures
  • Trust exploitation and unsafe outputs
  • Goal drift and untraceable behavior

Reports and evidence

Every scan writes a local evidence bundle under ~/.agentguardian/scans/<scan-id>/:

  • scan.json — machine-readable findings, signed (HMAC-SHA256 + Ed25519)
  • events.jsonl — the scan timeline
  • probe/ — per-probe requests, responses, verdicts, and evidence
  • forensic_manifest.json — integrity manifest for the bundle
  • a live local dashboard — browse findings, transcripts, and exports
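
Because the bundle is plain files on disk, it can be inspected with a few lines of Python. The sketch below assumes only the file names listed above. It makes no assumption about the schema inside scan.json or events.jsonl, and it treats the most recently modified directory as the latest scan, which is a guess (`agent-guardian scans list` is the authoritative source). The helper names are hypothetical.

```python
import json
from pathlib import Path

SCANS_ROOT = Path.home() / ".agentguardian" / "scans"


def latest_scan_dir(root: Path = SCANS_ROOT):
    """Return the most recently modified scan directory, or None."""
    if not root.is_dir():
        return None
    dirs = [p for p in root.iterdir() if p.is_dir()]
    return max(dirs, key=lambda p: p.stat().st_mtime, default=None)


def load_bundle(scan_dir: Path):
    """Load scan.json and events.jsonl without assuming their schema."""
    scan = json.loads((scan_dir / "scan.json").read_text(encoding="utf-8"))
    events = []
    events_path = scan_dir / "events.jsonl"
    if events_path.exists():
        with events_path.open(encoding="utf-8") as fh:
            # JSON Lines: one JSON document per non-empty line.
            events = [json.loads(line) for line in fh if line.strip()]
    return scan, events


scan_dir = latest_scan_dir()
if scan_dir is not None and (scan_dir / "scan.json").exists():
    scan, events = load_bundle(scan_dir)
    print(scan_dir.name, "events:", len(events))
```

Reading the files this way does not check their signatures; use agent-guardian verify for that.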

Generate shareable or CI-ready reports in any supported format on demand:

agent-guardian report SCAN_ID --output sarif --output-path scan.sarif   # GitHub Security
agent-guardian report SCAN_ID --output md                                # Markdown
agent-guardian report SCAN_ID --output pdf  --output-path report.pdf      # shareable PDF

Formats: json · sarif · junit · md · gitlab · pdf. Stored evidence can be verified with agent-guardian verify.
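
A SARIF export can be summarised without GitHub. The sketch below relies only on the public SARIF 2.1.0 layout (a top-level `runs` array, each run with a `results` array, each result with an optional `level`). Which levels and rule ids AgentGuardian emits is not stated here, so none are assumed. Treating a missing `level` as `warning` is a simplification of the SARIF default rules, and `count_sarif_levels` is a hypothetical helper.

```python
import json
from collections import Counter
from pathlib import Path


def count_sarif_levels(sarif: dict) -> Counter:
    """Count SARIF results per level across all runs."""
    counts = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results") or []:
            # Simplification: a result without "level" is counted as "warning".
            counts[result.get("level", "warning")] += 1
    return counts


path = Path("scan.sarif")  # file name from the report command above
if path.exists():
    totals = count_sarif_levels(json.loads(path.read_text(encoding="utf-8")))
    for level, n in sorted(totals.items()):
        print(f"{level}: {n}")
```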

How it works

Every scan follows the same loop:

Target → surface mapping → adversarial agents → AIVSS-scored findings → evidence bundle

For the full workflow, see how AgentGuardian works.

Scan modes

  • fast — quick local feedback
  • smart — broader coverage for development and pull requests
  • full — release gates and audit evidence

Use full when you need AIVSS-scored findings for CI/CD gates.

Commands

Command                                  What it does
agent-guardian scan                      Run an adversarial swarm scan against a target
agent-guardian report <id> --output FMT  Regenerate a report: json · sarif · junit · md · gitlab · pdf · badge
agent-guardian gate <id> --fail-under N  Apply pass/fail thresholds to a stored scan (CI exit codes)
agent-guardian serve                     Start the local dashboard
agent-guardian scans list / delete       List or delete stored scans (delete --older-than 30d for bulk cleanup)
agent-guardian config show / init        Inspect the effective config / scaffold a config file
agent-guardian verify <report>           Verify the HMAC-SHA256 + Ed25519 signatures on a report
agent-guardian last-score                Print the AIVSS of the most recent scan
agent-guardian doctor                    Verify the install, provider keys, and prerequisites
agent-guardian telemetry status          Manage opt-in telemetry (enable / disable)
agent-guardian version                   Print the installed version

Run any command with --help for its full options, or see the CLI reference.

CI/CD with GitHub Actions

The shipped composite action runs a scan, uploads SARIF to GitHub Code Scanning, and (optionally) posts a summary comment on the pull request:

name: AgentGuardian

on:
  pull_request:
  push:
    branches: [main]

jobs:
  red-team:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      security-events: write   # upload SARIF to Code Scanning
      pull-requests: write     # post the summary comment
    steps:
      - uses: actions/checkout@v4

      - uses: glacien-technologies/agent-guardian/.github/actions/agentguardian-scan@v1
        with:
          endpoint: http://localhost:8000/chat
          model: gemini:gemini-3.5-flash
          mode: full
          fail-under: "80"
          max-critical: "0"
          comment: "true"
        env:
          GEMINI_API_KEY: ${{ secrets.GEMINI_API_KEY }}

The job fails when the gate (fail-under / max-critical) is breached. For GitLab, Bitbucket, raw-CLI, and fleet/nightly setups, see the CI/CD guides — including the parallel suites guide for scanning many agents from one file.
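
The gate can be read as two independent thresholds. The sketch below is one plausible reading rather than the tool's implementation: it assumes the score compared against fail-under is higher-is-better, that max-critical is the largest tolerated number of critical findings, and that a breach maps to exit code 1. `gate_exit_code` is a hypothetical function, so confirm the real semantics with agent-guardian gate --help.

```python
def gate_exit_code(score: float, critical_findings: int,
                   fail_under: float = 80, max_critical: int = 0) -> int:
    """Return 0 when the gate passes and 1 when either threshold is breached.

    Assumptions: a higher score is better, and both limits are inclusive
    (score == fail_under passes; critical_findings == max_critical passes).
    """
    breached = score < fail_under or critical_findings > max_critical
    return 1 if breached else 0


# With the workflow's settings (fail-under 80, max-critical 0):
print(gate_exit_code(score=85, critical_findings=0))  # passes
print(gate_exit_code(score=85, critical_findings=1))  # breached by a critical finding
```

Keeping the two checks separate means a single critical finding fails the job even when the overall score is high.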

Standards and coverage

AgentGuardian maps its shipped probes to:

  • OWASP Top 10 for Agentic Applications
  • MITRE ATLAS
  • CSA Agentic AI Red Teaming Guide

The exact agents and probes that ran against your target are enumerated in every scan report (coverage in scan.json). The full probe-to-standard mapping lives in the OWASP mapping and the framework coverage matrix.

Privacy & telemetry

Telemetry is opt-in and disabled by default. Out of the box AgentGuardian sends nothing — no analytics ping, no install ping, no scan counts. Telemetry only activates after you explicitly opt in. Once enabled, it sends anonymous operational counts (agents dispatched, attempts, findings) plus a locally generated, anonymous install id (a random UUID stored at ~/.agentguardian/install_id, with no link to your identity).

Manage it any time:

agent-guardian telemetry status     # show current state
agent-guardian telemetry enable      # opt in
agent-guardian telemetry disable     # opt out

AgentGuardian never collects prompts, agent responses, target URLs, headers, secrets, API keys, transcripts, reports, evidence files, tool inputs or outputs, or customer data.

Run from source

To run AgentGuardian from a source checkout instead of the published package:

# clone
git clone https://github.com/glacien-technologies/agent-guardian.git
cd agent-guardian

# virtual environment
python3.11 -m venv .venv
source .venv/bin/activate

# install the checkout in editable mode
pip install -e ".[dev]"

# run it from source
agent-guardian doctor
agent-guardian scan \
  --endpoint http://localhost:8000/chat \
  --model gemini:gemini-3.5-flash \
  --mode fast

For contribution guidelines, see the contribution guide.

Contributing

We welcome new probes, new adapters, and new attacker logic. Start with the contribution guide and the good first issue label.

All commits must be DCO-signed:

git commit -s

By participating you agree to CODE_OF_CONDUCT.md and the ethics policy. AgentGuardian is for testing systems you own or are explicitly authorised to test.

Community

Join us on Discord for quickstart help, probe design, adapter questions, and roadmap discussion. For longer-form support channels, see the support guide.

Security

To report a vulnerability, see SECURITY.md. Do not open public issues for security reports.

License

Apache-2.0. See LICENSE and NOTICE.

AgentGuardian is a trademark of Glacien Technologies. See TRADEMARKS.md for usage guidelines.
