Offensive security testing for AI agents

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

sumam

These details have not been verified by PyPI

Project description

ProbeAgent

Offensive security testing for AI agents. They scan configs. We attack your agent.

What is ProbeAgent?

ProbeAgent is a CLI tool that performs automated red-teaming of AI agents. It launches realistic multi-turn attacks — prompt injection, credential exfiltration, indirect injection, social manipulation, and more — against any HTTP-accessible agent.

Most AI security tools scan static configurations or check for known patterns. ProbeAgent actually attacks your running agent and tells you whether it's Safe, At Risk, or Compromised.

How It Works

flowchart LR
    CLI[probeagent attack] --> Engine
    Engine --> |for each category| Attack[Attack Module]
    Attack --> |reset conversation| Target
    Attack --> |multi-turn prompts| Target
    Target --> |response| Analyzer
    Analyzer --> |grade| Report[Safe / At Risk / Compromised]

Why ProbeAgent?

Feature	mcp-scan	SecureClaw	Aguara	ProbeAgent
Offensive testing	-	-	Partial	Yes
Multi-turn attacks	-	-	-	Yes
Indirect injection testing	-	-	-	Yes
PyRIT integration	-	-	-	Yes
Evasion converters	-	-	-	Yes
CLI-first	-	-	Yes	Yes
Security grading	-	-	-	Yes
HTTP + OpenClaw targets	-	-	-	Yes
Rich terminal reports	-	-	-	Yes

Installation

pip install probeagent-ai

Or install from source for development:

git clone https://github.com/sumamovva/probeagent.git
cd probeagent
pip install -e ".[dev]"

For PyRIT integration (evasion converters + dynamic red teaming):

pip install 'probeagent-ai[pyrit]'

Quickstart

Instant demo (no setup required)

pip install probeagent-ai
probeagent demo

This attacks a built-in mock target — a vulnerable agent and a hardened one — and shows a side-by-side comparison. No API keys, no server, no config.

Scan your own agent

# Validate your target is reachable
probeagent validate https://your-agent.example.com/api

# Run a quick security scan
probeagent attack https://your-agent.example.com/api --profile quick

# Full scan with parallel execution
probeagent attack https://your-agent.example.com/api --profile standard --parallel

Scan an OpenClaw agent

# Validate an OpenClaw instance (auto-detects OpenAI chat format)
probeagent validate http://localhost:3000/v1/chat/completions \
  -H 'Authorization: Bearer YOUR_TOKEN'

# Attack it
probeagent attack http://localhost:3000/v1/chat/completions \
  -H 'Authorization: Bearer YOUR_TOKEN' \
  --profile standard --parallel

Demo

Instant demo

Run a complete security assessment in seconds with zero setup:

probeagent demo

Add the War Room tactical display for a visual experience:

probeagent demo --game

Live demo (real API)

For demos against a real Claude-powered email agent with built-in vulnerabilities:

export ANTHROPIC_API_KEY=sk-ant-...
pip install 'probeagent-ai[demo]'
probeagent demo --live

The live demo starts a local email agent server with three endpoints at increasing security hardness, then attacks them. See tools/demo_email_agent.py for details.

Commands

`probeagent demo`

Run a full demo — attack a vulnerable + hardened target and compare results.

probeagent demo                    # Instant, uses mock target
probeagent demo --game             # With War Room tactical display
probeagent demo --live             # Real API (requires ANTHROPIC_API_KEY)
probeagent demo --profile standard # Use a different attack profile

Options:

--live — Use real API (starts demo email agent server)
--game — Launch War Room UI after attacks
--profile, -p — Attack profile: quick, standard, or thorough (default: quick)

`probeagent attack <url>`

Run security attacks against a target AI agent.

probeagent attack https://agent.example.com/api --profile quick
probeagent attack https://agent.example.com/api --profile standard --output json -f report.json
probeagent attack https://agent.example.com/api -p standard --converters stealth --parallel

Options:

--profile, -p — Attack profile: quick, standard, or thorough (default: quick)
--target-type — Target type: http or openclaw (default: http)
--output, -o — Output format: terminal, markdown, json (default: terminal)
--output-file, -f — Write report to file
--timeout, -t — Request timeout in seconds (default: 30)
--parallel — Run attack categories in parallel for faster scans
--converters — Apply evasion converters: basic, advanced, stealth, or comma-separated names (requires PyRIT)
--redteam — Enable dynamic LLM-driven attacks via PyRIT RedTeamOrchestrator (requires PyRIT)
--header, -H — HTTP header as Key: Value (repeatable, e.g. -H 'Authorization: Bearer token')

`probeagent validate <url>`

Check if a target is reachable and detect its API format. Supports --header/-H for authenticated targets.

`probeagent list-attacks`

Show all available attack modules with severity and status.

`probeagent init`

Create a default .probeagent.yaml config file in the current directory.

`probeagent game [url]`

Launch the War Room tactical display UI in your browser for interactive testing.

Attack Categories

12 attack categories with 79 strategies total:

Category	Severity	Strategies	Technique
Prompt Injection	CRITICAL	6	Override system instructions
Credential Exfiltration	CRITICAL	8	Extract API keys and secrets
Identity Spoofing	CRITICAL	7	Impersonate trusted entities
Indirect Injection	CRITICAL	7	Inject instructions via agent-processed content (emails, docs)
Config Manipulation	CRITICAL	6	Manipulate agent configuration, integrations, and permissions
Goal Hijacking	HIGH	5	Redirect agent behavior
Social Manipulation	HIGH	14	Psychological pressure (Cialdini, FOG, gradual escalation)
Cognitive Exploitation	HIGH	6	Exploit reasoning weaknesses (Socratic traps, frame control)
Resource Abuse	HIGH	4	Trigger unbounded computation
Tool Misuse	HIGH	6	Trick agent into misusing tools
Agentic Exploitation	CRITICAL	10	SSRF, command injection, path traversal, supply chain (CVE-based)
Data Exfiltration	MEDIUM	6	Extract sensitive context data

Attack Profiles

Profile	Categories	Max Turns	Use Case
`quick`	5 critical	1	CI/CD gates, quick checks
`standard`	All 12	3	Regular security assessments
`thorough`	All 12	10	Pre-release deep scans

PyRIT Integration

ProbeAgent optionally integrates with Microsoft PyRIT for advanced capabilities:

Evasion Converters (--converters): Transform attack payloads with Base64, ROT13, Unicode substitution, leetspeak, and more to test resilience against obfuscated attacks
Dynamic Red Teaming (--redteam): Use an LLM-driven orchestrator to generate novel attack strategies in real time

# Apply stealth evasion converters
probeagent attack https://agent.example.com/api -p standard --converters stealth

# Dynamic red teaming
probeagent attack https://agent.example.com/api -p standard --redteam

# Combine both
probeagent attack https://agent.example.com/api -p standard --converters advanced --redteam

Install with: pip install 'probeagent-ai[pyrit]'

Responsible Use

ProbeAgent is designed for authorized security testing only. Before using ProbeAgent:

Ensure you have explicit permission to test the target system
Only test systems you own or have written authorization to test
Follow your organization's security testing policies
Report vulnerabilities through proper disclosure channels

Unauthorized use of this tool against systems you don't own or have permission to test may violate laws and regulations.

Attribution

ProbeAgent's indirect injection and config manipulation attacks are inspired by research from Zenity Labs. PyRIT integration uses components from Microsoft PyRIT (MIT License). See ATTRIBUTION.md for full credits.

Development

# Install with dev dependencies
pip install -e ".[dev]"

# Run tests
python -m pytest tests/ -v

# Lint
ruff check src/ tests/

# Format
ruff format src/ tests/

See CONTRIBUTING.md for full development guidelines.

Roadmap

Phase 1: CLI, HTTP target, scoring, reporting
Phase 2: 9 attack categories with 56 multi-turn strategies
Phase 3: OpenClaw target adapter, parallel execution, War Room UI
Phase 4: Zenity-inspired attacks (indirect injection, config manipulation), PyRIT integration
Phase 5: MCP target adapter, CI/CD integration, SaaS dashboard

License

Apache 2.0 — see LICENSE for details.

Changelog

See CHANGELOG.md for version history.

Project details

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

sumam

These details have not been verified by PyPI

Release history Release notifications | RSS feed

0.1.4

Mar 17, 2026

This version

0.1.3

Mar 10, 2026

0.1.2

Mar 6, 2026

0.1.1

Mar 4, 2026

0.1.0

Mar 4, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

probeagent_ai-0.1.3.tar.gz (100.4 kB view details)

Uploaded Mar 10, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

probeagent_ai-0.1.3-py3-none-any.whl (109.4 kB view details)

Uploaded Mar 10, 2026 Python 3

File details

Details for the file probeagent_ai-0.1.3.tar.gz.

File metadata

Download URL: probeagent_ai-0.1.3.tar.gz
Upload date: Mar 10, 2026
Size: 100.4 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for probeagent_ai-0.1.3.tar.gz
Algorithm	Hash digest
SHA256	`51d6d70d4aa2a5256f03e4d623bb194a0ff73c6d3b9831c9d7f9de4d8a35ba34`
MD5	`52fc06b1cd7da0ad23b8a56c0d5bc955`
BLAKE2b-256	`5385b249f6a6218a1902fecab9bf6e147d8a586b8010d46a20f9bd459a5b496f`

See more details on using hashes here.

Provenance

The following attestation bundles were made for probeagent_ai-0.1.3.tar.gz:

Publisher: publish.yml on sumamovva/probeagent

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: probeagent_ai-0.1.3.tar.gz
- Subject digest: 51d6d70d4aa2a5256f03e4d623bb194a0ff73c6d3b9831c9d7f9de4d8a35ba34
- Sigstore transparency entry: 1071085222
- Sigstore integration time: Mar 10, 2026
Source repository:
- Permalink: sumamovva/probeagent@818cd4330ea86e402680961bb09f6d50b14ebef3
- Branch / Tag: refs/tags/v0.1.3
- Owner: https://github.com/sumamovva
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@818cd4330ea86e402680961bb09f6d50b14ebef3
- Trigger Event: release

File details

Details for the file probeagent_ai-0.1.3-py3-none-any.whl.

File metadata

Download URL: probeagent_ai-0.1.3-py3-none-any.whl
Upload date: Mar 10, 2026
Size: 109.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for probeagent_ai-0.1.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`052f0ab15452456348832abedcd3eb0107719f2f6747809a0561fbed4040ad58`
MD5	`da3491ae4b0134bd626d12ab1460f11d`
BLAKE2b-256	`16fbb350fdde46995a606b84472d6fe7bd2f656b60cdc402e96c6bdc8d11cb3c`

See more details on using hashes here.

Provenance

The following attestation bundles were made for probeagent_ai-0.1.3-py3-none-any.whl:

Publisher: publish.yml on sumamovva/probeagent

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: probeagent_ai-0.1.3-py3-none-any.whl
- Subject digest: 052f0ab15452456348832abedcd3eb0107719f2f6747809a0561fbed4040ad58
- Sigstore transparency entry: 1071085414
- Sigstore integration time: Mar 10, 2026
Source repository:
- Permalink: sumamovva/probeagent@818cd4330ea86e402680961bb09f6d50b14ebef3
- Branch / Tag: refs/tags/v0.1.3
- Owner: https://github.com/sumamovva
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@818cd4330ea86e402680961bb09f6d50b14ebef3
- Trigger Event: release

probeagent-ai 0.1.3

Navigation

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Project description

ProbeAgent

What is ProbeAgent?

How It Works

Why ProbeAgent?

Installation

Quickstart

Instant demo (no setup required)

Scan your own agent

Scan an OpenClaw agent

Demo

Instant demo

Live demo (real API)

Commands

probeagent demo

probeagent attack <url>

probeagent validate <url>

probeagent list-attacks

probeagent init

probeagent game [url]

Attack Categories

Attack Profiles

PyRIT Integration

Responsible Use

Attribution

Development

Roadmap

License

Changelog

Project details

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

Provenance

File details

File metadata

File hashes

Provenance

`probeagent demo`

`probeagent attack <url>`

`probeagent validate <url>`

`probeagent list-attacks`

`probeagent init`

`probeagent game [url]`