Open-source red teaming toolkit for AI agents, RAG systems, MCP servers, and tool-using LLM applications.

These details have not been verified by PyPI

Project description

AgentGuardian

Red-team your AI agents before attackers do.

Open-source, local-first adversarial security testing for AI agents, RAG systems, MCP servers, and tool-using LLM applications.

AgentGuardian points a swarm of adversarial probes at your target and gives you reproducible evidence you can use in engineering, security review, and CI: AIVSS scoring, signed JSON, SARIF, Markdown, JUnit, PDF, and per-probe transcripts.

Built for agentic systems, not just single-prompt chatbot evals.
Finds prompt injection, tool misuse, privilege abuse, memory poisoning, code-exec paths, trust exploits, and goal drift.
Runs locally, in CI, or offline in deterministic --model stub mode.

Demo

pip install agent-guardian
echo 'You are a helpful customer-support agent for ACME Bank.' > prompt.txt
agent-guardian scan --system-prompt prompt.txt --mode fast --model stub

That gives you:

a live local dashboard during interactive scans
a stored scan artifact under ~/.agentguardian/scans/<scan_id>/
exportable reports via agent-guardian report SCAN_ID --output sarif --output-path scan.sarif
a static reference rendering: docs/_assets/sample-report.html
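The report and verify commands both take a scan id. As a rough sketch, assuming each scan lives in its own subdirectory of ~/.agentguardian/scans/ and that the directory name is the scan id (inferred from the path pattern above, not confirmed by the docs), the most recent scan could be located like this. `latest_scan_id` is a hypothetical helper and not part of AgentGuardian:

```python
from pathlib import Path
from typing import Optional


def latest_scan_id(scans_root: Path) -> Optional[str]:
    """Return the name of the most recently modified scan directory, if any."""
    if not scans_root.is_dir():
        return None
    scan_dirs = [p for p in scans_root.iterdir() if p.is_dir()]
    if not scan_dirs:
        return None
    # Assumption: directory name == scan id, and mtime tracks recency.
    newest = max(scan_dirs, key=lambda p: p.stat().st_mtime)
    return newest.name
```

Calling `latest_scan_id(Path.home() / ".agentguardian" / "scans")` would then give a value to pass as SCAN_ID, if the layout assumption holds.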

--model stub requires no API key and no network. Swap in a real model such as gemini:gemini-2.5-flash or openai:gpt-4o when you want an authoritative judge.

Install

# pip
pip install agent-guardian

# pipx
pipx install agent-guardian

# uv
uv add agent-guardian
# or
uv tool install agent-guardian

Python 3.11–3.13 are supported. Linux and macOS are first-class; Windows is community-supported.

Heads up: current macOS often defaults python3 to 3.14, which AgentGuardian does not yet target. If pip install agent-guardian fails with No matching distribution found, use Python 3.13 instead:
python3.13 -m venv .venv
source .venv/bin/activate
pip install agent-guardian
Docker and the GitHub Action path are insulated from this. See docs/installation.mdx for the full install matrix.
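If you are unsure which interpreter a given environment resolves to, a small check like the one below can help before installing. It is only a sketch: the 3.11 to 3.13 bounds come from the support statement above, and `is_supported` is a hypothetical helper, not something AgentGuardian ships.

```python
import sys

# Bounds taken from the README's support statement (Python 3.11 to 3.13).
SUPPORTED_MIN = (3, 11)
SUPPORTED_MAX = (3, 13)


def is_supported(version: tuple) -> bool:
    """Return True if a (major, minor) version falls inside the supported range."""
    return SUPPORTED_MIN <= tuple(version[:2]) <= SUPPORTED_MAX


current = (sys.version_info.major, sys.version_info.minor)
print(f"Python {current[0]}.{current[1]} supported: {is_supported(current)}")
```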

60-second quickstart

# 1. Sanity-check the install
agent-guardian doctor

# 2. See the shipped probe corpus
agent-guardian list-probes

# 3. Run an offline scan
echo 'You are a helpful customer-support agent for ACME Bank.' > prompt.txt
agent-guardian scan --system-prompt prompt.txt --mode fast --model stub

# 4. Export a machine-readable report once you have a scan id
agent-guardian report SCAN_ID --output sarif --output-path scan.sarif

Interactive scans auto-serve a local dashboard. You can also browse results later with:

agent-guardian serve
# → http://127.0.0.1:7474

What you can scan

Prompt-only targets via --system-prompt PATH
Hosted HTTP agents via --endpoint URL
Framework-native objects via --framework KIND --framework-ref MODULE:ATTR
Custom Python entrypoints via the positional target argument (my_agent:run, path/to/app.py:graph)
Advanced contract-driven targets including MCP / OpenAPI / browser / WebSocket flows via the contract path documented in docs/concepts/target-adapters.mdx

Built-in framework kinds:

langgraph
crewai
openai_agents
autogen
adk
strands
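The MODULE:ATTR form used by --framework-ref and the positional target follows a common Python convention. How AgentGuardian resolves it internally is not documented here; the sketch below shows how such a reference is typically resolved with `importlib`. `resolve_ref` is a hypothetical name, and the file-path form (path/to/app.py:graph) is deliberately not handled.

```python
import importlib
from typing import Any


def resolve_ref(ref: str) -> Any:
    """Resolve a 'package.module:attr' reference to the object it names."""
    module_name, sep, attr_path = ref.partition(":")
    if not sep or not module_name or not attr_path:
        raise ValueError(f"expected MODULE:ATTR, got {ref!r}")
    obj = importlib.import_module(module_name)
    # Allow dotted attribute paths such as 'module:Class.method'.
    for name in attr_path.split("."):
        obj = getattr(obj, name)
    return obj
```

For example, `resolve_ref("json:dumps")` returns the standard library's `json.dumps` function. A reference like my_app.graph:graph would resolve the same way, provided `my_app` is importable from the working directory.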

What it catches

AgentGuardian ships 96 attack probes covering all ten OWASP Top 10 for Agentic Applications 2026 categories:

ASI01 — prompt injection / goal hijack
ASI02 — tool misuse
ASI03 — privilege compromise
ASI04 — supply chain / resource overload
ASI05 — code execution
ASI06 — memory poisoning
ASI07 — agent-to-agent compromise
ASI08 — cascading failures
ASI09 — trust exploitation / unsafe output handling
ASI10 — untraceability / goal drift

The corpus is also mapped to MITRE ATLAS v5.4.0 and the CSA Agentic AI Red Teaming Guide. See the exact set in docs/reference/framework-coverage-matrix.md.

What you get

Signed JSON evidence — scan.json ships with HMAC-SHA256 + Ed25519 signatures verifiable with agent-guardian verify
Exportable reports — json, sarif, junit, md, gitlab, and pdf
Per-probe transcripts — prompts, responses, verdicts, and evidence trails
AIVSS scoring — publishable in --mode full; trend-tracking in fast and smart
Local dashboard — browse historical scans, findings, and evidence bundles

To verify a stored report:

agent-guardian verify ~/.agentguardian/scans/SCAN_ID/scan.json
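For readers unfamiliar with what an HMAC-SHA256 check involves, here is a minimal standard-library sketch. It is not AgentGuardian's actual signing scheme: the canonical JSON encoding, the key handling, and the function names are all assumptions made for illustration, and the Ed25519 signature is not covered.

```python
import hashlib
import hmac
import json


def canonical_bytes(report: dict) -> bytes:
    # Assumption: sign a canonical encoding (sorted keys, compact separators)
    # so that logically equal reports produce the same bytes.
    return json.dumps(report, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(report: dict, key: bytes) -> str:
    """Return the hex HMAC-SHA256 tag of the report under the given key."""
    return hmac.new(key, canonical_bytes(report), hashlib.sha256).hexdigest()


def verify(report: dict, signature: str, key: bytes) -> bool:
    """Constant-time comparison of the expected tag against the supplied one."""
    return hmac.compare_digest(sign(report, key), signature)
```

Any change to the report after signing, even a single verdict field, makes `verify` return False, which is the property that lets a signed scan.json serve as evidence.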

Why AgentGuardian

Agent-first — built for tool-using, stateful, multi-step systems rather than single-turn prompt checks
Recon before attack — fingerprints the target surface and then runs only the relevant specialists
Evidence over vibes — reports are grounded in transcripts, structured findings, and signed artifacts
Local-first — no telemetry, no phone-home, and a fully offline stub-mode path
CI-ready — non-zero exit codes, SARIF export, and reusable GitHub Action patterns

For a deeper competitive breakdown, see docs/concepts/agent-guardian-vs.mdx.

How it works

Every scan follows the same four stages:

Plan — resolve the target type, budgets, models, and output format
Recon — black-box fingerprint the target: tools, memory, PII exposure, multi-agent handoffs, reachable systems
Red Teaming — dispatch ASI-aligned specialists against the observed surface
Findings — judge outcomes, compute AIVSS, and write signed reports

The recon fingerprint is the key difference: AgentGuardian decides which attacks matter for this target before it spends budget on them.
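The idea of choosing attacks from the recon fingerprint can be sketched as a simple subset test. Everything below is hypothetical: the probe ids, the surface feature names, and the `Probe` structure are invented for illustration and do not reflect AgentGuardian's internal data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Probe:
    probe_id: str         # hypothetical id, not a real AgentGuardian probe
    category: str         # ASI category the probe targets
    requires: frozenset   # surface features the probe needs to be meaningful


def select_probes(probes, surface):
    """Keep only probes whose required features were all observed during recon."""
    observed = frozenset(surface)
    return [p for p in probes if p.requires <= observed]


CORPUS = [
    Probe("demo-injection", "ASI01", frozenset()),
    Probe("demo-tool-misuse", "ASI02", frozenset({"tools"})),
    Probe("demo-memory-poison", "ASI06", frozenset({"memory"})),
    Probe("demo-handoff", "ASI07", frozenset({"tools", "multi_agent"})),
]

# A target that exposes tools but no memory and no agent handoffs.
selected = select_probes(CORPUS, {"tools"})
print([p.probe_id for p in selected])
```

With this surface, only the injection and tool-misuse probes run; the memory and handoff probes are skipped because recon saw nothing for them to attack.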

Scan a real target

# Hosted HTTP endpoint
agent-guardian scan \
  --endpoint http://localhost:8000/chat \
  --model gemini:gemini-2.5-flash \
  --mode smart

# LangGraph app
agent-guardian scan \
  --framework langgraph \
  --framework-ref my_app.graph:graph \
  --model gemini:gemini-2.5-flash

# Custom Python entrypoint
agent-guardian scan \
  my_agent:run \
  --model gemini:gemini-2.5-flash

Worked examples live under examples/, and the Try AgentGuardian guides live under docs/try/.
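To try the hosted-endpoint form without a real agent, a throwaway local server is enough. The sketch below uses only the standard library. The request and response shapes (a JSON body with a `message` field, a JSON reply with a `reply` field) are assumptions for illustration; the contract AgentGuardian actually expects from an endpoint is described in the target-adapter docs, not here.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def build_reply(payload: dict) -> dict:
    """Toy agent logic: echo the user's message inside a canned reply."""
    message = str(payload.get("message", ""))
    return {"reply": f"ACME Bank support here. You said: {message}"}


class ChatHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        if self.path != "/chat":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            payload = None
        if not isinstance(payload, dict):
            self.send_error(400, "body must be a JSON object")
            return
        body = json.dumps(build_reply(payload)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args) -> None:
        pass  # keep the console quiet during scans


def make_server(host: str = "127.0.0.1", port: int = 8000) -> ThreadingHTTPServer:
    """Build (but do not start) the demo server."""
    return ThreadingHTTPServer((host, port), ChatHandler)
```

Running `make_server().serve_forever()` in one terminal would expose http://localhost:8000/chat for the endpoint scan shown above.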

Scan modes

Mode	Typical use	Notes
`fast`	Dev loop, smoke checks, pre-push	Lowest cost and quickest feedback
`smart`	PR iteration, broader nightly coverage	Better signal than `fast`, still non-authoritative
`full`	Release gates, CI on `main`, audit evidence	Authoritative mode for AIVSS and `--fail-under`

Only --mode full produces an authoritative AIVSS suitable for hard release gating.
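One way to picture the gating rule is as a small decision function. This is a sketch of a possible policy, not the CLI's implementation: it assumes the score is on a scale where higher is better (as the name --fail-under suggests) and that non-full modes report without gating. `gate` is a hypothetical name.

```python
def gate(mode: str, score: float, fail_under: float) -> int:
    """Return a process exit code: 0 to pass, 1 to fail the pipeline."""
    if mode != "full":
        # Assumption: fast and smart scores are for trend tracking only,
        # so they never block a release on their own.
        return 0
    return 1 if score < fail_under else 0
```

Under these assumptions, a full-mode score of 79.9 against a threshold of 80 fails the build, while the same score from a fast scan does not.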

CI integration

name: AgentGuardian
on: [pull_request]

jobs:
  scan:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      security-events: write
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install agent-guardian
      - name: Red-team the agent
        env:
          GEMINI_API_KEY: ${{ secrets.GEMINI_API_KEY }}
        run: |
          agent-guardian scan \
            --endpoint http://localhost:8000/chat \
            --model gemini:gemini-2.5-flash \
            --mode full \
            --output sarif \
            --output-path agentguardian.sarif \
            --fail-under 80
      - uses: github/codeql-action/upload-sarif@v3
        if: always()
        with:
          sarif_file: agentguardian.sarif

See docs/ci-cd/overview.mdx and docs/ci-cd/github-actions.mdx for the fuller setup, thresholds, and composite-action path.
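Outside GitHub code scanning, the SARIF file can be summarized with a few lines of Python. The sketch relies only on the standard SARIF 2.1.0 layout (`runs`, `results`, `level`); the rule ids and any extra properties AgentGuardian writes are not assumed. `count_by_level` is a hypothetical helper.

```python
from collections import Counter


def count_by_level(sarif: dict) -> Counter:
    """Count SARIF results per severity level across all runs."""
    counts = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            # Simplification: a missing level is counted as "warning",
            # the SARIF 2.1.0 fallback; rule-level defaults are ignored.
            counts[result.get("level", "warning")] += 1
    return counts
```

Loading agentguardian.sarif with `json.load` and passing the result to `count_by_level` gives a quick per-level tally for a job summary or chat notification.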

Standards and coverage

Enumerate the corpus locally:

agent-guardian list-probes
agent-guardian list-probes --by-standard owasp-asi
agent-guardian list-probes --by-standard mitre-atlas
agent-guardian list-probes --by-standard csa-agentic-rt

Coverage today:

OWASP ASI 2026 — all 10 categories covered
MITRE ATLAS v5.4.0 — mapped where black-box agent testing can observe the technique at the target I/O surface
CSA Agentic AI Red Teaming Guide — mapped across the shipped corpus

The exact probe-to-standard mapping lives in docs/reports/owasp-mapping.mdx and docs/reference/framework-coverage-matrix.md.

Privacy & telemetry

No telemetry is collected. There is no analytics ping, install tracker, or phone-home path. Stub mode additionally works offline with no LLM key.

Project status

AgentGuardian 1.0.0 is the first stable release. Semantic versioning applies to the public Python API, CLI surface, report schemas, and probe IDs. Probe content and scoring may evolve within a minor release as coverage improves.

See ROADMAP.md, CHANGELOG.md, and governance.md.

Contributing

We welcome new probes, new adapters, and new attacker logic. Start with CONTRIBUTING.md and the good first issue label.

All commits must be DCO-signed:

git commit -s

By participating you agree to CODE_OF_CONDUCT.md and ETHICS.md. AgentGuardian is for testing systems you own or are explicitly authorised to test.

Community

Join us on Discord for probe design, adapter questions, and roadmap discussion. For longer-form support channels, see docs/community/support.mdx.

Security

To report a vulnerability, see SECURITY.md. Do not open public issues for security reports.

License

Apache-2.0. See LICENSE and NOTICE.

AgentGuardian is a trademark of Glacien Technologies. See TRADEMARKS.md for usage guidelines.

Project details

Release history

Version	Release date
1.0.0rc8 (this version, pre-release)	Jun 5, 2026
1.0.0rc7 (pre-release)	Jun 4, 2026
1.0.0rc6 (pre-release)	Jun 4, 2026
1.0.0rc5 (pre-release)	Jun 3, 2026
1.0.0rc4 (pre-release)	Jun 3, 2026
1.0.0rc3 (pre-release)	Jun 2, 2026
1.0.0rc2 (pre-release)	Jun 1, 2026
1.0.0rc1 (pre-release)	May 27, 2026

Download files

Source Distribution

agent_guardian-1.0.0rc8.tar.gz (2.0 MB)

Uploaded Jun 5, 2026 Source

Built Distribution

agent_guardian-1.0.0rc8-py3-none-any.whl (1.6 MB)

Uploaded Jun 5, 2026 Python 3

File details

Details for the file agent_guardian-1.0.0rc8.tar.gz.

File metadata

Download URL: agent_guardian-1.0.0rc8.tar.gz
Upload date: Jun 5, 2026
Size: 2.0 MB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for agent_guardian-1.0.0rc8.tar.gz
Algorithm	Hash digest
SHA256	`08a520ed25776fcb3493dcfc8358e55b69f72f5ccca50b00406ca72ad247dd01`
MD5	`317647a2fb5ab7c3de9852d8b0a46aac`
BLAKE2b-256	`ce5e200dcdc95d48752886f6c6f1130482fa1bf03a32a90bc03c0cac982747b9`
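A downloaded archive can be checked against the SHA256 digest listed above with the standard library alone. This is a generic sketch; `sha256_of` and `matches` are hypothetical helper names, and the file path is whatever location you downloaded the file to.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, expected_hex: str) -> bool:
    """Compare the file's digest with the published one, ignoring case and whitespace."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

Calling `matches(Path("agent_guardian-1.0.0rc8.tar.gz"), "<SHA256 from the table>")` should return True for an unmodified download.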

Provenance

The following attestation bundles were made for agent_guardian-1.0.0rc8.tar.gz:

Publisher: publish.yml on glacien-technologies/agent-guardian

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: agent_guardian-1.0.0rc8.tar.gz
- Subject digest: 08a520ed25776fcb3493dcfc8358e55b69f72f5ccca50b00406ca72ad247dd01
- Sigstore transparency entry: 1733546542
- Sigstore integration time: Jun 5, 2026
Source repository:
- Permalink: glacien-technologies/agent-guardian@8d3f6de57dbf7184f6db357429c6404053b358dc
- Branch / Tag: refs/tags/v1.0.0rc8
- Owner: https://github.com/glacien-technologies
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@8d3f6de57dbf7184f6db357429c6404053b358dc
- Trigger Event: push

File details

Details for the file agent_guardian-1.0.0rc8-py3-none-any.whl.

File metadata

Download URL: agent_guardian-1.0.0rc8-py3-none-any.whl
Upload date: Jun 5, 2026
Size: 1.6 MB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for agent_guardian-1.0.0rc8-py3-none-any.whl
Algorithm	Hash digest
SHA256	`eac3c5dfd43fb422003f89e3fe0f28f7fbb49acef505b0e4749d815ae967d930`
MD5	`5d59b9d14dc2b70d3f703c4cada7afa3`
BLAKE2b-256	`8e0cee42703c3b254a14e0d8789641f7839a99a3cc3e3377368d7a2444ede3b5`

Provenance

The following attestation bundles were made for agent_guardian-1.0.0rc8-py3-none-any.whl:

Publisher: publish.yml on glacien-technologies/agent-guardian

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: agent_guardian-1.0.0rc8-py3-none-any.whl
- Subject digest: eac3c5dfd43fb422003f89e3fe0f28f7fbb49acef505b0e4749d815ae967d930
- Sigstore transparency entry: 1733546646
- Sigstore integration time: Jun 5, 2026
Source repository:
- Permalink: glacien-technologies/agent-guardian@8d3f6de57dbf7184f6db357429c6404053b358dc
- Branch / Tag: refs/tags/v1.0.0rc8
- Owner: https://github.com/glacien-technologies
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@8d3f6de57dbf7184f6db357429c6404053b358dc
- Trigger Event: push
