AI Chatbot Penetration Testing Framework
Multi-Agent Security Testing for AI Systems
Multi-agent adversarial testing framework for AI chatbots and agentic systems. Combines domain-aware campaign planning, OWASP LLM Top 10 + ASI 2026 coverage, and first-class Model Context Protocol (MCP) attack surface testing — built for teams red-teaming production AI.
Why PenBot?
The AI red-team tooling landscape has matured — PyRIT, garak, Meta's
PurpleLlama, and several commercial scanners all ship competent pattern
libraries. PenBot's position is deliberately narrow:
| Capability | Pattern scanners (garak, PurpleLlama) | Campaign frameworks (PyRIT) | PenBot |
|---|---|---|---|
| Static attack corpus | ✅ | ✅ | ✅ (1,120+ patterns) |
| Multi-turn orchestration | ⚠️ limited | ✅ | ✅ |
| Domain-aware campaign planning | ❌ | ⚠️ manual | ✅ automatic |
| Guardrail fingerprinting & targeted bypass | ❌ | ❌ | ✅ |
| Finding chaining (compound exploits) | ❌ | ❌ | ✅ |
| Model Context Protocol (MCP) attack surface | ❌ | ❌ | ✅ 20 patterns, 8 vectors |
| OWASP ASI 2026 (ASI-03 Tool Misuse) | ❌ | ❌ | ✅ |
| Agentic-system action-safety probes | ⚠️ | ⚠️ | ✅ |
| Regression baselines (CI-friendly) | ⚠️ | ⚠️ | ✅ |
| Purple-team / defense simulation mode | ❌ | ❌ | ✅ |
Where PenBot is distinctive
- MCP protocol-surface testing — the only open framework (as of April 2026) with a dedicated agent for tool-description poisoning, resource URI traversal, list_changed bait-and-switch, cross-server pivots, and sampling API abuse. Matches the OWASP ASI-03 Tool Misuse category directly.
- Domain-aware coordinator — agents don't just run in sequence; the coordinator picks agents per campaign phase, applies refusal-rate boosts, and composes hybrid attacks when agent pairs have proven compound vectors.
- Two-pass generation — attacks are drafted, critiqued, and refined by a separate reasoning LLM before hitting the target, measurably improving first-pass success rates against guarded targets.
- Finding chaining — the detector layer cross-references findings across rounds, flagging compound exploits (e.g., information disclosure → privilege escalation → exfiltration) as a single higher-severity chain rather than three isolated issues.
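As a rough illustration of the finding-chaining idea, the sketch below links findings from different rounds into one chain when their categories follow a known escalation order. The `Finding` class, the `CHAIN_ORDER` tuple, and the "one level above the worst step" severity rule are all hypothetical and are not PenBot's actual detector API.

```python
from dataclasses import dataclass

# Hypothetical escalation order, taken from the example in the bullet above.
CHAIN_ORDER = ("information_disclosure", "privilege_escalation", "exfiltration")
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2, "critical": 3}
RANK_TO_SEVERITY = {rank: name for name, rank in SEVERITY_RANK.items()}


@dataclass(frozen=True)
class Finding:
    round_no: int
    category: str
    severity: str


def chain_findings(findings):
    """Return the longest prefix of CHAIN_ORDER that appears in round order.

    Each stage must be found in a strictly later round than the previous one.
    """
    chain = []
    last_round = -1
    for stage in CHAIN_ORDER:
        candidates = [f for f in findings
                      if f.category == stage and f.round_no > last_round]
        if not candidates:
            break
        step = min(candidates, key=lambda f: f.round_no)  # earliest match
        chain.append(step)
        last_round = step.round_no
    return chain


def chain_severity(chain):
    """Assumed rule: a chain of 2+ steps rates one level above its worst step."""
    if not chain:
        return None
    worst = max(SEVERITY_RANK[f.severity] for f in chain)
    if len(chain) >= 2:
        worst = min(worst + 1, SEVERITY_RANK["critical"])
    return RANK_TO_SEVERITY[worst]


findings = [
    Finding(3, "information_disclosure", "low"),
    Finding(7, "privilege_escalation", "medium"),
    Finding(12, "exfiltration", "high"),
]
chain = chain_findings(findings)
print([f.category for f in chain], chain_severity(chain))
```

Reporting the chain as a single item keeps the compound risk visible, where three separate low-to-high findings might each be triaged on their own.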
Precision & recall caveat
PenBot reports coverage at the category level ("full coverage of LLM01")
— it does not claim every instance will be caught in every target.
We publish per-category precision/recall against the internal benchmark
in each test report, and penbot benchmark scores detection against
intentionally vulnerable mock chatbots so you can calibrate thresholds
for your environment before trusting a finding count.
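To make the calibration step concrete, here is a minimal sketch of how per-category precision and recall could be computed from a labelled benchmark run. The function name, the `(case_id, category)` pair format, and the sample data are assumptions for illustration; the actual report format of `penbot benchmark` is not shown in this README.

```python
from collections import Counter


def precision_recall(expected, reported):
    """Compute per-category precision and recall.

    expected: set of (case_id, category) pairs that are truly vulnerable.
    reported: set of (case_id, category) pairs the detectors flagged.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for case_id, category in reported:
        if (case_id, category) in expected:
            tp[category] += 1
        else:
            fp[category] += 1
    for case_id, category in expected - reported:
        fn[category] += 1

    scores = {}
    for category in set(tp) | set(fp) | set(fn):
        flagged = tp[category] + fp[category]
        actual = tp[category] + fn[category]
        scores[category] = {
            "precision": tp[category] / flagged if flagged else None,
            "recall": tp[category] / actual if actual else None,
        }
    return scores


# Hypothetical benchmark: three known-vulnerable cases, three detector reports.
expected = {("c1", "LLM01"), ("c2", "LLM01"), ("c3", "LLM02")}
reported = {("c1", "LLM01"), ("c4", "LLM01"), ("c3", "LLM02")}
scores = precision_recall(expected, reported)
print(scores["LLM01"], scores["LLM02"])
```

A category with high recall but low precision suggests raising the detection threshold before trusting the raw finding count.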
Production data point
First production test against a live AI chatbot:
| Metric | Result |
|---|---|
| Vulnerabilities found | 15 |
| Test duration | 63 minutes (60 rounds) |
| Success rate | 25% |
| Domain identification | Round 1 |
| Key finding | CRITICAL stored XSS in admin panel via payload logging (fixed immediately) |
Quick Start
Option 1: Install from PyPI (Recommended)
# Core install — CLI + REST API testing
pip install penbot
# Full install — adds dashboard, Playwright browser automation, PDF/DOCX reports, OpenAI support
pip install "penbot[full]"
# ML install — adds embedding-based attack memory (sentence-transformers, FAISS)
pip install "penbot[ml]"
Option 2: Install from Source
git clone https://gitlab.com/yan-ban/penbot.git
cd penbot
pip install -e . # Core
pip install -e ".[full]" # Full (optional)
Option 3: Docker
docker pull registry.gitlab.com/yan-ban/penbot:latest
docker run -it -e ANTHROPIC_API_KEY=sk-ant-... registry.gitlab.com/yan-ban/penbot penbot --help
Run PenBot
# 1. First-run setup (creates .env, configures API keys, installs browsers)
penbot onboard
# 2. Configure target (interactive wizard)
penbot wizard
# 3. Run test
penbot test --config configs/clients/your-target.yaml
# Verify your environment anytime
penbot doctor
Quick smoke test:
penbot test --config configs/example.yaml --quick
Start dashboard:
penbot dashboard
# Home / overview: http://localhost:8000/dashboard
# Live Mission Control: http://localhost:8000/dashboard/live
# Session replay: http://localhost:8000/dashboard/session?id=<session_id>
# OWASP compliance: http://localhost:8000/dashboard/owasp
Features
Security Testing
- 14 specialized agents — Jailbreak, encoding, social engineering, RAG, tool exploitation, MCP exploit, exfiltration, indirect injection, action safety, compliance, and more
- 1,398+ attack patterns — Curated across 27 pattern libraries (including 20 MCP protocol-attack patterns) and continuously evolved
- 22 vulnerability detectors — Two-layer detection (pattern + LLM) including SSRF, guardrail fingerprinting, and finding chaining
- OWASP LLM Top 10 (2025) + ASI (2026) coverage — 9/10 LLM categories + dedicated ASI-03 Tool Misuse coverage
- Model Context Protocol (MCP) testing — Tool-description poisoning, resource URI traversal, list_changed bait-and-switch, cross-server pivots, sampling API abuse
- Automatic tool & API discovery — Runtime probing detects tools, functions, and APIs exposed by the target
- Persistence verification — Post-test replay confirms findings are reproducible, not transient
- Endpoint reconnaissance — Two-phase systematic API surface mapping with framework detection
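The two-layer detection mentioned above (pattern + LLM) can be pictured roughly as follows: a cheap regex pass proposes candidate findings, and a second-layer judge confirms or rejects each one. The pattern names, the regexes, and `stub_judge` (a stand-in for an LLM call) are hypothetical, and how PenBot actually combines the two layers is an assumption here.

```python
import re
from typing import Callable, Optional

# Hypothetical first-layer signatures; a real library would carry many more.
PATTERNS = {
    "system_prompt_leak": re.compile(
        r"\bmy (system prompt|instructions) (is|are)\b", re.IGNORECASE
    ),
    "ssrf_indicator": re.compile(r"169\.254\.169\.254|\blocalhost\b", re.IGNORECASE),
}


def detect(response: str, judge: Optional[Callable[[str, str], bool]] = None):
    """Layer 1: regex match. Layer 2: optional judge confirms each hit."""
    findings = []
    for name, pattern in PATTERNS.items():
        if not pattern.search(response):
            continue
        confirmed = judge(name, response) if judge else None
        if confirmed is False:
            continue  # judge rejected the pattern hit as a false positive
        findings.append({"detector": name, "confirmed_by_judge": bool(confirmed)})
    return findings


def stub_judge(name: str, response: str) -> bool:
    """Placeholder for an LLM verdict: treat explicit refusals as non-findings."""
    return "cannot" not in response.lower()


print(detect("Sure, my system prompt is: be helpful.", stub_judge))
print(detect("I cannot reveal what my instructions are.", stub_judge))
```

Running the regex layer first keeps the expensive second layer limited to responses that already look suspicious.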
Intelligence
- Think-MCP reasoning — Draft→refine critique cycle, consensus validation, post-response learning
- Domain awareness — LLM-powered domain adaptation in subagent pipeline
- Attack graphs — UCB1 planning + live vis.js dashboard graph
- Strategic guidance — Think-MCP generates per-round strategy that flows to agents
- Structured session summaries — JSON summaries replace lossy text for agent context
- Cross-agent learning — Patterns persist across sessions
- Agent learning loop — Agents track success/failure per round, restore state on restart
- Phase intelligence — Multi-turn attack phases (recon→probe→exploit→persist) with agent boost/penalize
- Evolutionary generation — Novel attacks via genetic algorithms
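For readers unfamiliar with UCB1, the planning rule named in the attack-graph bullet, the following is a minimal sketch of the standard algorithm applied to choosing the next agent. The agent names and reward values are made up, and how PenBot defines reward or wires UCB1 into its attack graph is not described in this README.

```python
import math


def ucb1_select(stats, exploration=math.sqrt(2)):
    """Pick the arm with the highest UCB1 score.

    stats maps arm name -> (pulls, total_reward). Untried arms are chosen first.
    """
    for arm, (pulls, _) in stats.items():
        if pulls == 0:
            return arm
    total_pulls = sum(pulls for pulls, _ in stats.values())

    def score(arm):
        pulls, reward = stats[arm]
        mean = reward / pulls
        bonus = exploration * math.sqrt(math.log(total_pulls) / pulls)
        return mean + bonus

    return max(stats, key=score)


# Hypothetical per-agent statistics: (rounds used, successful attacks).
stats = {
    "jailbreak": (10, 2.0),
    "encoding": (10, 5.0),
    "mcp_exploit": (2, 1.0),
}
print(ucb1_select(stats))
```

The exploration bonus shrinks as an arm is tried more often, so a rarely used agent with a decent success rate can outrank a well-explored one.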
ML-Enhanced Attack Memory (v2.1)
- Semantic retrieval — sentence-transformers + FAISS replaces filter+recency with cosine-similarity nearest-neighbour search
- Automatic migration — Existing JSONL attack history is indexed on first use, zero manual steps
- Evolutionary boost — EvolutionaryAgent selects parent attacks by semantic relevance to the current campaign, not just recency
- Evaluation framework — MRR, Precision@k, Recall@k comparison between old and new retrieval
- Embedding visualisation — Jupyter notebook with t-SNE projection, cluster analysis, similarity heatmaps
- Graceful degradation — Falls back to original AttackMemoryStore when ML deps are absent
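The retrieval and evaluation ideas in this list can be sketched with the standard library alone. The example below ranks stored attacks by cosine similarity to a query vector and scores the ranking with reciprocal rank (the per-query term of MRR). The three-dimensional vectors are toy stand-ins for real sentence embeddings, the function names are hypothetical, and a real deployment would use sentence-transformers and a FAISS index in place of the linear scan.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, memory, k=2):
    """memory: list of (attack_text, embedding). Returns the k most similar texts."""
    ranked = sorted(memory, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def reciprocal_rank(ranked, relevant):
    """1 / position of the first relevant item, or 0.0 if none was retrieved."""
    for position, text in enumerate(ranked, start=1):
        if text in relevant:
            return 1.0 / position
    return 0.0


# Toy memory with made-up embeddings.
memory = [
    ("role-play jailbreak", [0.9, 0.1, 0.0]),
    ("base64 encoding bypass", [0.1, 0.9, 0.0]),
    ("tool description poisoning", [0.0, 0.2, 0.9]),
]
query = [0.0, 0.3, 0.9]  # assumed to encode an MCP-focused campaign
ranked = top_k(query, memory, k=2)
print(ranked, reciprocal_rank(ranked, {"base64 encoding bypass"}))
```

Averaging `reciprocal_rank` over many queries gives MRR, which is one way the old filter+recency retrieval and the embedding-based retrieval could be compared.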
Monitoring
- Web dashboard — Home overview, live Mission Control, session replay, OWASP report
- Real-time streaming — WebSocket push of attacks, findings, and graph updates
- Attack chain replay — Step-by-step post-test analysis
- Interactive graph — Visualize attack paths
- Detailed reports — HTML with OWASP mapping
- Benchmark suite — Score PenBot against intentionally vulnerable mock chatbots
Flexibility
- REST API or browser automation (Playwright)
- YAML configuration — Easy target setup
- Docker deployment — Production-ready
- Checkpointing — Resume long-running tests
- JWT auth + API keys — Multi-tenant API access for teams
- Continuous testing — penbot watch re-runs on config/code changes, CI templates included
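One plausible way a watch loop can decide when to re-run is to fingerprint the watched files and compare on each poll. The sketch below is only that general technique: `ChangeWatcher`, the file name, and the YAML content are hypothetical, and it does not reflect how `penbot watch` is actually implemented.

```python
import hashlib
import tempfile
from pathlib import Path


def fingerprint(paths):
    """Hash file contents so that touching a file without editing it is ignored."""
    digest = hashlib.sha256()
    for path in sorted(Path(p) for p in paths):
        digest.update(str(path).encode())
        digest.update(path.read_bytes())
    return digest.hexdigest()


class ChangeWatcher:
    def __init__(self, paths):
        self.paths = list(paths)
        self.last = fingerprint(self.paths)

    def changed(self):
        """Return True once per content change, then remember the new state."""
        current = fingerprint(self.paths)
        if current == self.last:
            return False
        self.last = current
        return True


with tempfile.TemporaryDirectory() as tmp:
    config = Path(tmp) / "target.yaml"            # hypothetical config file
    config.write_text("target: staging\n")       # hypothetical content
    watcher = ChangeWatcher([config])
    first_poll = watcher.changed()    # nothing edited yet
    config.write_text("target: production\n")
    second_poll = watcher.changed()   # content differs, a re-run would trigger
    third_poll = watcher.changed()    # no further edits
print(first_poll, second_poll, third_poll)
```

Hashing content instead of comparing modification times avoids spurious re-runs when a file is saved without changes.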
Screenshots
Mission Control Dashboard
Real-time attack monitoring with interactive graph visualization, campaign metrics, and confirmed findings.
CLI Orchestration
Multi-agent coordination with dual-model architecture (Claude Sonnet 4.5 for analysis, Claude 3.7 Sonnet for attack generation).
Agent Voting & Consensus
Transparent decision-making: agents vote on attack strategies with scored reasoning.
Subagent Refinement Pipeline
Attacks refined through psychological enhancement and stealth layers before execution.
CLI Commands
penbot onboard # First-run setup (env, API keys, browsers)
penbot doctor # Check environment health
penbot wizard # Configure new target
penbot test # Run security test
penbot dashboard # Start Mission Control
penbot sessions # Manage past sessions
penbot agents # Browse 14 agents
penbot patterns # Search attack library
penbot report # Generate report
penbot benchmark # Score detection against mock chatbots
penbot watch # Continuous testing (re-run on config change)
See CLI Reference for full documentation.
Documentation
| Document | Description |
|---|---|
| Developer Guide | How PenBot works under the hood |
| Architecture | System design & diagrams |
| Methodology | Attack strategies |
| Configuration | YAML & environment setup |
| CLI Reference | Command-line usage |
| API Reference | REST & WebSocket |
| Agents | Agent system details |
| Detection | Vulnerability detectors |
| Advanced | RAG, tools, evolutionary |
| OWASP Coverage | Compliance mapping |
| Test Example | Real test walkthrough |
Responsible Use
⚠️ Authorized Testing Only
This tool is for authorized security testing only.
Permitted:
- Testing your own AI chatbots
- Security research with written permission
- Red team exercises (with contract)
- Pre-deployment validation
Prohibited:
- Testing without authorization
- Attacking production systems maliciously
- Extracting proprietary data
- Bypassing security for unauthorized access
Built-in safeguards:
- Authorization verification
- Blocklist for public AI services
- Rate limiting
- Comprehensive audit logging
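Two of these safeguards, the blocklist and rate limiting, follow well-known patterns that can be sketched briefly. The blocklist entry, the function and class names, and the token-bucket parameters below are hypothetical; PenBot's real blocklist contents and limiter are not documented in this README.

```python
import time
from urllib.parse import urlparse

# Hypothetical blocklist entry using a reserved example domain.
BLOCKED_HOSTS = {"public-ai.example"}


def is_blocked(url, blocked_hosts=BLOCKED_HOSTS):
    """True if the URL's host is a blocked domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == b or host.endswith("." + b) for b in blocked_hosts)


class TokenBucket:
    """Allow `rate` requests per second on average, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self):
        now = self.clock()
        # Refill in proportion to elapsed time, capped at the bucket size.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


print(is_blocked("https://api.public-ai.example/v1/chat"))
print(is_blocked("https://staging.internal.example/chat"))
```

The injectable `clock` argument is a design choice for testability: a fake clock lets the limiter be exercised without sleeping.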
Technology
- LangGraph — Multi-agent workflow orchestration
- Claude Sonnet 4.5 — Attack generation
- FastAPI — API + WebSocket server (requires penbot[full])
- Playwright — Browser automation (requires penbot[full])
- SQLite — Session persistence
Install Extras
| Extra | Command | What it adds |
|---|---|---|
| Core | pip install penbot | CLI, REST API testing, 13 security agents, 26 attack pattern libraries |
| Full | pip install "penbot[full]" | Dashboard, Playwright, PDF/DOCX reports, OpenAI provider, Tavily recon |
| Recon | pip install "penbot[recon]" | Tavily web search for target reconnaissance |
| Think | pip install "penbot[think]" | MCP-based enhanced reasoning |
| ML | pip install "penbot[ml]" | Embedding-based attack memory (sentence-transformers, FAISS) |
| ML-Viz | pip install "penbot[ml-viz]" | ML + scikit-learn & matplotlib for notebooks |
Project Status
| Aspect | Status |
|---|---|
| Development | Under Active Development |
| Tests | 1,330+ passing ✅ |
| Skipped | 11 (optional PDF/DOCX deps) |
| Docker | Multi-stage build |
License
MIT License — See LICENSE
References
Academic Papers
- Kumar, V., Liao, Z., Jones, J., & Sun, H. (2024). "AmpleGCG-Plus: A Strong Generative Model of Adversarial Suffixes to Jailbreak LLMs with Higher Success Rates in Fewer Attempts." arXiv:2410.22143
- Zhang, J., et al. (2025). "Verbalized Sampling: How to Mitigate Mode Collapse and Unlock LLM Diversity." arXiv:2510.01171
Acknowledgments
- Elder Plinius / L1B3RT4S — Jailbreak pattern research
- Manus AI — Context engineering principles
- LangChain — LangGraph framework
- Anthropic
- OWASP — LLM Top 10 framework
Built for a more secure AI future