AgentGuard

Autonomous security scanner for AI agents. Detects prompt injection, tool abuse, data exfiltration, and OWASP ASI Top 10 vulnerabilities in agent code.

PyPI: dfx-agentguard | Python 3.10+ | License: MIT


Why AgentGuard?

AI agents are being deployed at scale -- in coding tools, customer support, trading bots, and autonomous systems. Their code is rarely scanned for agent-specific security vulnerabilities.

Existing tools (Bandit, Semgrep, CodeQL) scan for traditional vulnerabilities. AgentGuard scans for agent-specific attack vectors that traditional SAST tools miss.

Comparison

Feature                            AgentGuard        Semgrep   CodeQL    Bandit
Prompt Injection (ASI01)           Yes + AST taint   No        No        No
Tool Abuse (ASI02)                 Yes               No        No        Partial
Data Exfiltration (ASI03)          Yes               No        No        No
Excessive Agency (ASI04)           Yes               No        No        No
Supply Chain (ASI05)               Yes               No        No        No
Insecure Output (ASI06)            Yes               No        No        No
Credential Exposure (ASI07)        Yes               Partial   Partial   Yes
Context Manipulation (ASI08)       Yes               No        No        No
Agent Loop Exploitation (ASI09)    Yes               No        No        No
Trust Boundary (ASI10)             Yes               No        No        No
AST Taint Tracking                 Yes               No        No        No
OWASP ASI Top 10 Coverage          10/10             1/10      1/10      2/10
MCP Server Mode                    Yes               No        No        No
SARIF Output                       Yes               Yes       Yes       No
Pre-commit Hook                    Yes               Yes       No        No
GitHub Action                      Yes               Yes       Yes       No

Quick Start

pip install dfx-agentguard

# Scan a directory
agentguard .

# JSON output for CI/CD
agentguard src/ --format json

# SARIF for GitHub Code Scanning
agentguard . --format sarif > results.sarif

# Only show HIGH and above
agentguard . --min-severity HIGH

# Include test files in scan
agentguard . --include-tests

CLI Usage

agentguard [OPTIONS] [TARGET]

Arguments:
  TARGET                   Directory or file to scan (default: current directory)

Options:
  --format [text|json|sarif]                       Output format (default: text)
  --exit-code / --no-exit-code                     Exit non-zero if findings found (default: on)
  --min-severity [CRITICAL|HIGH|MEDIUM|LOW|INFO]   Minimum severity to report
  --include-tests                                  Include test files in scan (default: skip)
  --help                                           Show help
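
When the CLI is driven from a script, it can help to assemble the argument list in one place. The following is a minimal sketch, not part of AgentGuard: `build_command` and `run_scan` are hypothetical helper names, and they use only the flags listed above.

```python
from __future__ import annotations

import subprocess

# Values copied from the option list above.
FORMATS = ("text", "json", "sarif")
SEVERITIES = ("CRITICAL", "HIGH", "MEDIUM", "LOW", "INFO")


def build_command(
    target: str = ".",
    fmt: str = "text",
    min_severity: str | None = None,
    include_tests: bool = False,
    exit_code: bool = True,
) -> list[str]:
    """Build an argv list for the agentguard CLI (hypothetical helper)."""
    if fmt not in FORMATS:
        raise ValueError(f"unknown format: {fmt}")
    if min_severity is not None and min_severity not in SEVERITIES:
        raise ValueError(f"unknown severity: {min_severity}")

    cmd = ["agentguard", target, "--format", fmt]
    if min_severity is not None:
        cmd += ["--min-severity", min_severity]
    if include_tests:
        cmd.append("--include-tests")
    if not exit_code:
        # The default is --exit-code, so only the negative flag is emitted.
        cmd.append("--no-exit-code")
    return cmd


def run_scan(**kwargs) -> subprocess.CompletedProcess:
    """Run the scan; check=False because a non-zero exit may only mean findings."""
    return subprocess.run(
        build_command(**kwargs), capture_output=True, text=True, check=False
    )
```

For example, `build_command("src/", fmt="json", min_severity="HIGH")` yields the same arguments as `agentguard src/ --format json --min-severity HIGH`.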

OWASP ASI Top 10 Coverage

ID      Vulnerability                              Status     Detection Method
ASI01   Prompt Injection                           Detected   f-string, .format(), messages array, context stuffing, tool description poisoning
ASI02   Tool Abuse / Unintended Tool Use           Detected   os.system, subprocess, shell tools, unrestricted registration
ASI03   Data Exfiltration                          Detected   External URLs, variable URL correlation, fetch/axios, subprocess curl, DNS exfil
ASI04   Unauthorized Actions / Excessive Agency    Detected   Auto-execute, no confirmation, autonomous actions
ASI05   Supply Chain / Untrusted Components        Detected   Dynamic import, unpinned deps, untrusted pip install
ASI06   Insecure Output Handling                   Detected   LLM output in HTML/JSX/DOM, innerHTML, document.write, markdown.render
ASI07   Credential / Secret Exposure               Detected   API keys (sk-, ghp_, AKIA, AIza, xox), private keys, passwords, connection strings
ASI08   Context Window Manipulation                Detected   Unbounded context, token stuffing, missing limits
ASI09   Agent Loop Exploitation                    Detected   Recursive calls without depth limit, while True, no max iterations
ASI10   Trust Boundary Violation                   Detected   Root access, host filesystem mounts, no sandbox, self-modification
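
To make the "f-string" detection method for ASI01 concrete, here is a rough sketch of how such a check could be written with Python's standard `ast` module. It is an illustration only and not AgentGuard's actual implementation: the function name `find_tainted_fstrings` is hypothetical, and it makes the simplifying assumption that every function parameter is untrusted input.

```python
import ast


def find_tainted_fstrings(source: str) -> list[tuple[int, str]]:
    """Return (line, parameter) pairs where a function parameter is
    interpolated into an f-string inside that function."""
    findings = []
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        # Simplifying assumption: positional parameters are the taint sources.
        params = {arg.arg for arg in func.args.args}
        for node in ast.walk(func):
            if not isinstance(node, ast.JoinedStr):  # JoinedStr is an f-string
                continue
            for part in node.values:
                if not isinstance(part, ast.FormattedValue):
                    continue
                for name in ast.walk(part.value):
                    if isinstance(name, ast.Name) and name.id in params:
                        findings.append((node.lineno, name.id))
    return findings


SAMPLE = '''
def build_prompt(user_input):
    system = "You are a helpful assistant."
    return f"{system} User says: {user_input}"
'''

findings = find_tainted_fstrings(SAMPLE)
print(findings)  # [(4, 'user_input')]
```

The local variable `system` is not reported because it is not a parameter. A real taint tracker would also follow assignments from source to sink, which this sketch does not attempt.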

CI/CD Integration

GitHub Action

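name: Security Scan
on: [push, pull_request]

permissions:
  contents: read
  security-events: write   # needed by upload-sarif to publish results

jobs:
  agentguard:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - run: pip install dfx-agentguard
      - run: agentguard . --format sarif > results.sarif
      # The scan exits non-zero when findings exist, so upload regardless
      - uses: github/codeql-action/upload-sarif@v3
        if: always()
        with:
          sarif_file: results.sarif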
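
Outside of GitHub Code Scanning, the same `results.sarif` file can be inspected with a few lines of Python. The sketch below relies only on the generic SARIF 2.1.0 layout (`runs[].results[].level`). How AgentGuard maps its severities to SARIF levels is not stated here, so the sample document is made up for illustration and `summarize_sarif` is a hypothetical helper.

```python
import json
from collections import Counter


def summarize_sarif(sarif: dict) -> Counter:
    """Count SARIF results per level across all runs."""
    counts: Counter = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results") or []:
            # Simplification: a result without an explicit level is
            # counted as "warning" instead of consulting the rule metadata.
            counts[result.get("level", "warning")] += 1
    return counts


# Made-up SARIF document standing in for a real results.sarif file.
SAMPLE_SARIF = json.loads("""
{
  "version": "2.1.0",
  "runs": [
    {
      "tool": {"driver": {"name": "example-scanner"}},
      "results": [
        {"ruleId": "R1", "level": "error", "message": {"text": "a"}},
        {"ruleId": "R2", "level": "error", "message": {"text": "b"}},
        {"ruleId": "R3", "message": {"text": "c"}}
      ]
    }
  ]
}
""")

summary = summarize_sarif(SAMPLE_SARIF)
print(dict(summary))  # {'error': 2, 'warning': 1}
```

To use it on a real scan, load the file with `json.load(open("results.sarif"))` and pass the resulting dictionary to `summarize_sarif`.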

Drop-in GitHub Action

- uses: dockfixlabs/agentguard@v0.4.0
  with:
    path: src/
    format: sarif

Pre-commit Hook

repos:
  - repo: https://github.com/dockfixlabs/agentguard
    rev: v0.4.0
    hooks:
      - id: agentguard
        args: ["--min-severity", "HIGH"]

Programmatic Usage

from agentguard.scanner import scan_directory

result = scan_directory("src/")

print(f"Found {len(result.findings)} issues")
print(f"Critical: {result.critical_count}")
print(f"High: {result.high_count}")

for finding in result.findings:
    print(f"  [{finding.severity}] {finding.rule_name} at {finding.file}:{finding.line}")
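
A common follow-up is to keep only findings at or above a threshold, mirroring `--min-severity`. The sketch below is independent of the library: `Finding` is a stand-in dataclass with the attributes used above, and it assumes `severity` is one of the upper-case strings from the CLI option list, which may not match the real result objects.

```python
from dataclasses import dataclass

# Most severe first, as listed for --min-severity.
SEVERITY_ORDER = ["CRITICAL", "HIGH", "MEDIUM", "LOW", "INFO"]


@dataclass
class Finding:
    """Stand-in for a scan finding (assumed shape, not the library class)."""
    severity: str
    rule_name: str
    file: str
    line: int


def at_least(findings: list[Finding], min_severity: str) -> list[Finding]:
    """Keep findings whose severity is min_severity or more severe."""
    threshold = SEVERITY_ORDER.index(min_severity)
    return [f for f in findings if SEVERITY_ORDER.index(f.severity) <= threshold]


sample = [
    Finding("CRITICAL", "prompt-injection", "agent.py", 12),
    Finding("LOW", "unbounded-context", "agent.py", 40),
    Finding("HIGH", "shell-tool", "tools.py", 7),
]

blocking = at_least(sample, "HIGH")
print([f.rule_name for f in blocking])  # ['prompt-injection', 'shell-tool']
```

The rule names in `sample` are invented for the example.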

MCP Server Mode

Scan agent code directly from Claude Code, Cursor, or any MCP-compatible client:

{
  "mcpServers": {
    "agentguard": {
      "command": "python3",
      "args": ["-m", "agentguard.mcp_server"]
    }
  }
}

Then ask Claude: "Scan my agent code for security vulnerabilities"
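
If the client already has other MCP servers configured, the entry has to be merged rather than pasted over the file. The following sketch shows one way to do that on an in-memory dictionary. `add_agentguard` is a hypothetical helper, and where the configuration file lives depends on the client, so no path is assumed.

```python
import json

# The server entry from the configuration shown above.
AGENTGUARD_SERVER = {"command": "python3", "args": ["-m", "agentguard.mcp_server"]}


def add_agentguard(config: dict) -> dict:
    """Return a copy of an MCP client config with the agentguard server added."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["agentguard"] = dict(AGENTGUARD_SERVER)
    merged["mcpServers"] = servers
    return merged


# "other-server" is a made-up pre-existing entry.
existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}
updated = add_agentguard(existing)
print(json.dumps(updated, indent=2))
```

The input dictionary is left unchanged, so the result can be reviewed before it is written back to disk.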

Benchmark Results

Tested against 28 code samples (26 vulnerable, 2 clean) + 8 real-world attack patterns. The table covers the 28 samples:

Category      Total   Detected     Rate    FP
ASI01             6          6     100%     0
ASI02             5          5     100%     0
ASI03             4          4     100%     0
ASI07             6          6     100%     0
ASI10             5          5     100%     0
clean             2          0       -      0
TOTAL            28         26    100%     0

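All 26 vulnerable samples were detected (100% detection rate), and neither clean sample was flagged (0 false positives). The benchmark covers five of the ten ASI categories (ASI01, ASI02, ASI03, ASI07, ASI10).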
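
The totals can be re-derived from the table rows: the five ASI rows sum to 26 vulnerable samples, and the 2 clean samples bring the total to 28. A short check of that arithmetic, with the numbers copied from the table:

```python
# (total samples, detected) per category, copied from the table above.
ROWS = {
    "ASI01": (6, 6),
    "ASI02": (5, 5),
    "ASI03": (4, 4),
    "ASI07": (6, 6),
    "ASI10": (5, 5),
}
CLEAN_SAMPLES = 2   # the "clean" row
CLEAN_FLAGGED = 0   # false positives reported on clean samples

vulnerable = sum(total for total, _ in ROWS.values())
detected = sum(hit for _, hit in ROWS.values())

# Detection rate is measured over vulnerable samples only.
detection_rate = detected / vulnerable
# False-positive rate is measured over clean samples only.
false_positive_rate = CLEAN_FLAGGED / CLEAN_SAMPLES

print(vulnerable + CLEAN_SAMPLES, detected)       # 28 26
print(f"{detection_rate:.0%} {false_positive_rate:.0%}")  # 100% 0%
```

With only two clean samples, the false-positive rate is a coarse estimate.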

Project Ecosystem

Repository             Description
agentguard             Core scanner + CLI + MCP server
mcp-scanner            MCP server configuration scanner
agentguard-app         GitHub App for automated PR reviews
agentguard-vscode      VS Code extension
agentguard-benchmark   Benchmark suite (28 samples)

Roadmap

  • OWASP ASI Top 10 -- all 10 categories covered
  • MCP server mode -- scan from Claude Code/Cursor
  • SARIF output -- GitHub Code Scanning integration
  • PyPI publication -- dfx-agentguard
  • VS Code extension
  • GitHub App for PR reviews
  • Benchmark suite (28 samples, 100% detection)
  • Pre-commit hook (.pre-commit-hooks.yaml)
  • GitHub Action (action.yml)
  • Dockerfile for agentguard-app
  • PyPI Trusted Publishing (OIDC)
  • AST-based taint tracking (v0.5.0) -- traces source-to-sink data flow
  • Language support: Rust, Go, Java
  • Web dashboard (SaaS)
  • REST API (Scan-as-a-Service)

See the full ROADMAP.md.

Contributing

See CONTRIBUTING.md. Bug reports and feature requests welcome.

Security

See SECURITY.md. Report vulnerabilities privately -- do not open public issues.

License

MIT -- see LICENSE.


Built by Dockfix Labs. Built for the AI agent era.
