Advanced AI-Powered Penetration Testing Framework with Multi-Agent Orchestration

Zen-AI-Pentest

🛡️ Professional AI-Powered Penetration Testing Framework

  graph TB
      subgraph "User Interface"
          CLI[CLI]
          API[REST API]
          WebUI[Web UI]
      end

      subgraph "Core Engine"
          Orchestrator[Agent Orchestrator]
          StateMachine[State Machine]
          RiskEngine[Risk Engine]
      end

      subgraph "AI Agents"
          Recon[Reconnaissance]
          Vuln[Vulnerability]
          Exploit[Exploit]
          Report[Report]
      end

      subgraph "Tools"
          Nmap[Nmap]
          SQLMap[SQLMap]
          Metasploit[Metasploit]
      end

      subgraph "External APIs"
          OpenAI[OpenAI]
          Anthropic[Anthropic]
          ThreatIntel[Threat Intelligence]
      end

      CLI --> API
      WebUI --> API
      API --> Orchestrator
      Orchestrator --> StateMachine
      StateMachine --> Recon
      StateMachine --> Vuln
      StateMachine --> Exploit
      Exploit --> OpenAI
      RiskEngine --> ThreatIntel

Zen-AI-Pentest is an autonomous, AI-powered penetration testing framework that combines large language models with established security tools. It is built for security professionals, bug bounty hunters, and enterprise security teams.


✨ Features

🤖 Autonomous AI Agent

  • ReAct Pattern: Reason → Act → Observe → Reflect
  • State Machine: IDLE → PLANNING → EXECUTING → OBSERVING → REFLECTING → COMPLETED
  • Memory System: Short-term, long-term, and context window management
  • Tool Orchestration: Automatic selection and execution of 20+ pentesting tools
  • Self-Correction: Retry logic and adaptive planning
  • Human-in-the-Loop: Optional pause for critical decisions

🎯 Risk Engine

  • False Positive Reduction: Multi-factor validation with Bayesian filtering
  • Business Impact: Financial, compliance, and reputation risk calculation
  • CVSS/EPSS Scoring: Industry-standard vulnerability assessment
  • Priority Ranking: Automated finding prioritization
  • LLM Voting: Multi-model consensus for accuracy

🔒 Exploit Validation

  • Sandboxed Execution: Docker-based isolated testing
  • Safety Controls: 4-level safety system (Read-Only to Full)
  • Evidence Collection: Screenshots, HTTP captures, PCAP
  • Chain of Custody: Complete audit trail
  • Remediation: Automatic fix recommendations

📊 Benchmarking

  • Competitor Comparison: vs PentestGPT, AutoPentest, Manual
  • Test Scenarios: HTB machines, OWASP WebGoat, DVWA
  • Metrics: Time-to-find, coverage, false positive rate
  • Visual Reports: Charts and statistical analysis
  • CI Integration: Automated regression testing
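As a rough illustration of two of the metrics named above, the sketch below computes a false positive rate and a time-to-first-finding from a list of findings. The `Finding` record and both function names are hypothetical. "False positive rate" is taken here to mean the share of reported findings that review did not confirm, which is an assumption about how the benchmark defines it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Finding:
    title: str
    found_at_s: float  # seconds since the scan started
    confirmed: bool    # True if later review confirmed the finding


def false_positive_rate(findings: List[Finding]) -> float:
    """Share of reported findings that were not confirmed (0.0 if none reported)."""
    if not findings:
        return 0.0
    return sum(not f.confirmed for f in findings) / len(findings)


def time_to_first_finding(findings: List[Finding]) -> Optional[float]:
    """Seconds until the earliest confirmed finding, or None if there is none."""
    times = [f.found_at_s for f in findings if f.confirmed]
    return min(times) if times else None
```

Only confirmed findings count toward time-to-first-finding, so an early false positive does not make a run look faster than it was.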

🔗 CI/CD Integration

  • GitHub Actions: Native action support
  • GitLab CI: Pipeline integration
  • Jenkins: Plugin and pipeline support
  • Output Formats: JSON, JUnit XML, SARIF
  • Notifications: Slack, JIRA, Email alerts
  • Exit Codes: Pipeline-friendly status codes

🛠️ 20+ Integrated Tools

Category         | Tools
Network          | Nmap, Masscan, Scapy, Tshark
Web              | BurpSuite, SQLMap, Gobuster, OWASP ZAP
Exploitation     | Metasploit Framework
Brute Force      | Hydra, Hashcat
Reconnaissance   | Amass, Nuclei, TheHarvester
Active Directory | BloodHound, CrackMapExec, Responder
Wireless         | Aircrack-ng Suite

☁️ Multi-Cloud & Virtualization

  • Local: VirtualBox VM Management
  • Cloud: AWS EC2, Azure VMs, Google Cloud Compute
  • Snapshots: Automated clean-state workflows
  • Guest Control: Execute tools inside isolated VMs

🚀 Modern API & Backend

  • FastAPI: High-performance REST API
  • PostgreSQL: Persistent data storage
  • WebSocket: Real-time scan updates
  • JWT Auth: Role-based access control (RBAC)
  • Background Tasks: Async scan execution

📊 Reporting & Notifications

  • PDF Reports: Professional findings reports
  • HTML Dashboard: Interactive web interface
  • Slack/Email: Instant notifications
  • JSON/XML: Integration with other tools

🐳 Easy Deployment

  • Docker Compose: One-command full stack deployment
  • CI/CD: GitHub Actions pipeline
  • Production Ready: Optimized for enterprise use

🚀 Quick Start

Option 1: Docker (Recommended)

# Clone repository
git clone https://github.com/SHAdd0WTAka/zen-ai-pentest.git
cd zen-ai-pentest

# Copy and configure environment
cp .env.example .env
# Edit .env with your settings

# Start full stack
cd docker
docker-compose -f docker-compose.full.yml up -d

# Access:
# Dashboard: http://localhost:3000
# API Docs:  http://localhost:8000/docs
# API:       http://localhost:8000

Option 2: Local Installation

# Install dependencies
pip install -r requirements.txt

# Initialize database
python database/models.py

# Start API server
python api/main.py

Option 3: VirtualBox VM Setup

# Automated Kali Linux setup
python scripts/setup_vms.py --kali

# Manual setup
# See docs/setup/VIRTUALBOX_SETUP.md

📖 Usage

Python API

from agents.react_agent import ReActAgent, ReActAgentConfig

# Configure agent
config = ReActAgentConfig(
    max_iterations=10,
    use_vm=True,
    vm_name="kali-pentest"
)

# Create agent
agent = ReActAgent(config)

# Run autonomous scan
result = agent.run(
    target="example.com",
    objective="Comprehensive security assessment"
)

# Generate report
print(agent.generate_report(result))

REST API

# Authentication (export the returned token as TOKEN for the requests below)
curl -X POST http://localhost:8000/auth/login \
  -H "Content-Type: application/json" \
  -d '{"username":"admin","password":"admin"}'

# Create scan
curl -X POST http://localhost:8000/scans \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name":"Network Scan",
    "target":"192.168.1.0/24",
    "scan_type":"network",
    "config":{"ports":"top-1000"}
  }'

# Execute tool
curl -X POST http://localhost:8000/tools/execute \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "tool_name":"nmap_scan",
    "target":"scanme.nmap.org",
    "parameters":{"ports":"22,80,443"}
  }'

# Generate report
curl -X POST http://localhost:8000/reports \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "scan_id":1,
    "format":"pdf",
    "template":"default"
  }'
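The same "create scan" call can be prepared from Python with only the standard library. This is a sketch under stated assumptions: the `/scans` path, the payload fields, and the base URL are taken from the curl example above, while `build_scan_request` and `API_BASE` are hypothetical names. The function only builds the request, so it can be inspected without a running server.

```python
import json
import urllib.request
from typing import Optional

API_BASE = "http://localhost:8000"  # base URL used in the curl examples


def build_scan_request(token: str, name: str, target: str,
                       scan_type: str = "network",
                       config: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) a POST /scans request with a JSON body."""
    payload = {
        "name": name,
        "target": target,
        "scan_type": scan_type,
        "config": config or {},
    }
    return urllib.request.Request(
        url=f"{API_BASE}/scans",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

With the API running, the request could be sent with `urllib.request.urlopen(req)`; the shape of the response is not documented here, so parsing it is left out.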

WebSocket (Real-Time)

const ws = new WebSocket('ws://localhost:8000/ws/scans/1');

ws.onmessage = (event) => {
  const data = JSON.parse(event.data);
  console.log('Scan update:', data);
};

🏗️ Architecture

┌────────────────────────────────────────────────────────────────────────┐
│               ZEN-AI-PENTEST v2.0 - System Architecture                │
├────────────────────────────────────────────────────────────────────────┤
│                                                                        │
│  ┌────────────────────────────────────────────────────────────────┐    │
│  │                    FRONTEND LAYER                              │    │
│  │  ┌──────────────┐  ┌──────────────┐  ┌──────────────────────┐  │    │
│  │  │   React      │  │  WebSocket   │  │   CLI Interface      │  │    │
│  │  │  Dashboard   │  │   Client     │  │   (Rich/Typer)       │  │    │
│  │  └──────────────┘  └──────────────┘  └──────────────────────┘  │    │
│  └────────────────────────────────────────────────────────────────┘    │
│                                │                                       │
│                                ▼                                       │
│  ┌────────────────────────────────────────────────────────────────┐    │
│  │                      API LAYER (FastAPI)                       │    │
│  │  ┌──────────────┐  ┌──────────────┐  ┌──────────────────────┐  │    │
│  │  │   Auth       │  │    Scans     │  │   Integrations       │  │    │
│  │  │   (JWT)      │  │   CRUD API   │  │   (GitHub/Slack)     │  │    │
│  │  └──────────────┘  └──────────────┘  └──────────────────────┘  │    │
│  └────────────────────────────────────────────────────────────────┘    │
│                                │                                       │
│                                ▼                                       │
│  ┌────────────────────────────────────────────────────────────────┐    │
│  │                    AUTONOMOUS LAYER                            │    │
│  │  ┌──────────────┐  ┌──────────────┐  ┌──────────────────────┐  │    │
│  │  │   ReAct      │  │   Memory     │  │   Exploit Validator  │  │    │
│  │  │   Loop       │  │   System     │  │   (Sandboxed)        │  │    │
│  │  └──────────────┘  └──────────────┘  └──────────────────────┘  │    │
│  └────────────────────────────────────────────────────────────────┘    │
│                                │                                       │
│                                ▼                                       │
│  ┌────────────────────────────────────────────────────────────────┐    │
│  │                    RISK ENGINE LAYER                           │    │
│  │  ┌──────────────┐  ┌──────────────┐  ┌──────────────────────┐  │    │
│  │  │   False      │  │   Business   │  │   CVSS/EPSS          │  │    │
│  │  │   Positive   │  │   Impact     │  │   Scoring            │  │    │
│  │  └──────────────┘  └──────────────┘  └──────────────────────┘  │    │
│  └────────────────────────────────────────────────────────────────┘    │
│                                │                                       │
│                                ▼                                       │
│  ┌────────────────────────────────────────────────────────────────┐    │
│  │                    TOOLS LAYER (20+)                           │    │
│  │  ┌─────────────────────────────────────────────────────────┐   │    │
│  │  │ Network: Nmap | Masscan | Scapy | Tshark                │   │    │
│  │  │ Web: BurpSuite | SQLMap | Gobuster | Nuclei | ZAP       │   │    │
│  │  │ Exploit: Metasploit | SearchSploit | ExploitDB          │   │    │
│  │  │ AD: BloodHound | CrackMapExec | Responder               │   │    │
│  │  └─────────────────────────────────────────────────────────┘   │    │
│  └────────────────────────────────────────────────────────────────┘    │
│                                │                                       │
│                                ▼                                       │
│  ┌────────────────────────────────────────────────────────────────┐    │
│  │                    DATA & REPORTING LAYER                      │    │
│  │  ┌──────────────┐  ┌──────────────┐  ┌──────────────────────┐  │    │
│  │  │  PostgreSQL  │  │ Benchmarks   │  │   Report Generator   │  │    │
│  │  │   (Main DB)  │  │ & Metrics    │  │   (PDF/HTML/JSON)    │  │    │
│  │  └──────────────┘  └──────────────┘  └──────────────────────┘  │    │
│  └────────────────────────────────────────────────────────────────┘    │
│                                                                        │
└────────────────────────────────────────────────────────────────────────┘

📁 Project Structure

zen-ai-pentest/
├── api/                        # FastAPI Backend
│   ├── main.py                # API Server
│   ├── schemas.py             # Pydantic Models
│   ├── auth.py                # JWT Authentication
│   └── websocket.py           # WebSocket Manager
├── agents/                     # AI Agents
│   ├── react_agent.py         # ReAct Agent
│   └── react_agent_vm.py      # VM-based Agent
├── database/                   # Database Layer
│   └── models.py              # SQLAlchemy Models
├── virtualization/             # VM Management
│   ├── vm_manager.py          # VirtualBox
│   └── cloud_vm_manager.py    # AWS/Azure/GCP
├── tools/                      # Pentesting Tools
│   ├── nmap_integration.py
│   ├── sqlmap_integration.py
│   ├── metasploit_integration.py
│   └── ... (20+ tools)
├── gui/                        # Web Interface
│   └── vm_manager_gui.py      # React Dashboard
├── reports/                    # Report Generation
│   └── generator.py           # PDF/HTML/JSON
├── notifications/              # Alerts
│   ├── slack.py
│   └── email.py
├── docker/                     # Deployment
│   ├── Dockerfile
│   └── docker-compose.full.yml
├── docs/                       # Documentation
│   ├── setup/
│   └── research/
├── scripts/                    # Setup Scripts
│   └── setup_vms.py
└── tests/                      # Test Suite

🔧 Configuration

Environment Variables

# Database
DATABASE_URL=postgresql://postgres:password@localhost:5432/zen_pentest

# Security
SECRET_KEY=your-secret-key-here
JWT_EXPIRATION=3600

# Notifications
SLACK_WEBHOOK_URL=https://hooks.slack.com/...
SMTP_HOST=smtp.gmail.com
SMTP_USER=user@gmail.com
SMTP_PASS=password

# Cloud Providers
AWS_ACCESS_KEY_ID=AKIA...
AWS_SECRET_ACCESS_KEY=...
AZURE_SUBSCRIPTION_ID=...
GCP_PROJECT_ID=...

See .env.example for all options.
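A minimal sketch of how an application might read these variables at startup is shown below. It is illustrative only: `Settings` and `load_settings` are hypothetical names, treating `DATABASE_URL` and `SECRET_KEY` as required is an assumption, and the 3600-second fallback simply mirrors the example value above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    database_url: str
    secret_key: str
    jwt_expiration: int  # seconds
    slack_webhook_url: Optional[str] = None


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from an environment-like mapping and fail fast if
    a variable assumed to be required is missing or empty."""
    missing = [key for key in ("DATABASE_URL", "SECRET_KEY") if not env.get(key)]
    if missing:
        raise RuntimeError("missing required settings: " + ", ".join(missing))
    return Settings(
        database_url=env["DATABASE_URL"],
        secret_key=env["SECRET_KEY"],
        jwt_expiration=int(env.get("JWT_EXPIRATION", "3600")),
        slack_webhook_url=env.get("SLACK_WEBHOOK_URL") or None,
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests without touching the real environment.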


🧪 Testing

# Run all tests
pytest

# With coverage
pytest --cov=. --cov-report=html

# Specific test file
pytest tests/test_react_agent.py -v

# Integration tests
pytest tests/integration/ -v

📚 Documentation


🤝 Contributing

  1. Fork the repository
  2. Create feature branch (git checkout -b feature/amazing-feature)
  3. Commit changes (git commit -m 'Add amazing feature')
  4. Push to branch (git push origin feature/amazing-feature)
  5. Open Pull Request

Please read CONTRIBUTING.md for details.


⚠️ Disclaimer

IMPORTANT: This tool is for authorized security testing only. Always obtain proper permission before testing any system you do not own. Unauthorized access to computer systems is illegal.

  • Use only on systems you have explicit permission to test
  • Respect privacy and data protection laws
  • The authors assume no liability for misuse or damage

📄 License

This project is licensed under the MIT License - see LICENSE file for details.


🙏 Acknowledgments

  • LangGraph - Agent framework
  • FastAPI - Web framework
  • Kali Linux - Penetration testing distribution
  • All open-source security tool creators

🎯 Advanced Features

Autonomous Mode

The autonomous agent uses the ReAct (Reasoning + Acting) pattern for fully automated penetration testing:

# Run autonomous scan
zen-ai-pentest --autonomous --target example.com --goal "Find all vulnerabilities"

# With custom scope
zen-ai-pentest --autonomous --target example.com --scope config/autonomous.json

Features:

  • State Machine: PLANNING → EXECUTING → OBSERVING → REFLECTING → COMPLETED
  • Memory Management: Short-term, long-term, and context window
  • Tool Orchestration: Automatic selection and execution of 20+ tools
  • Self-Correction: Retry logic and error recovery
  • Human-in-the-Loop: Optional pause for critical decisions
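
Python API:

import asyncio

from autonomous import AutonomousAgentLoop

async def main():
    agent = AutonomousAgentLoop(max_iterations=50)
    # run() is awaited, so it has to be called from inside a coroutine
    return await agent.run(
        goal="Find vulnerabilities and open ports",
        target="example.com",
        scope={"depth": "comprehensive"}
    )

result = asyncio.run(main())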

Risk Engine

Advanced false-positive reduction and risk prioritization:

# Scan with risk validation
zen-ai-pentest --target example.com --autonomous --validate-risks

Components:

  • FalsePositiveEngine: Multi-factor validation using Bayesian filtering and LLM voting
  • BusinessImpactCalculator: Financial, compliance, and reputation impact assessment
  • CVSS/EPSS Scoring: Industry-standard vulnerability scoring
  • Priority Ranking: Automated finding prioritization
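
Python API:

import asyncio

from risk_engine import FalsePositiveEngine, BusinessImpactCalculator

async def assess(finding, asset_context, finding_type, severity):
    # Validate the finding (validate_finding is awaited, so this runs in a coroutine)
    fp_engine = FalsePositiveEngine()
    validation = await fp_engine.validate_finding(finding)

    # Calculate business impact
    impact_calc = BusinessImpactCalculator(
        organization_size="large",
        annual_revenue=100000000,
        industry="finance"
    )
    impact = impact_calc.calculate_overall_impact(asset_context, finding_type, severity)
    return validation, impact

# finding, asset_context, finding_type, and severity come from an earlier scan
validation, impact = asyncio.run(assess(finding, asset_context, finding_type, severity))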
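
One simple way to turn CVSS and EPSS values into a priority ranking is to weight the CVSS base score (0 to 10) by the EPSS exploitation probability (0 to 1). The sketch below shows that idea only; the multiplication rule and the names `priority` and `rank` are assumptions, not the formula the Risk Engine uses.

```python
from typing import Dict, List


def priority(cvss: float, epss: float) -> float:
    """CVSS base score (0-10) weighted by EPSS probability (0-1)."""
    if not 0.0 <= cvss <= 10.0:
        raise ValueError("cvss must be between 0 and 10")
    if not 0.0 <= epss <= 1.0:
        raise ValueError("epss must be between 0 and 1")
    return cvss * epss


def rank(findings: List[Dict]) -> List[Dict]:
    """Return findings sorted from highest to lowest priority."""
    return sorted(findings, key=lambda f: priority(f["cvss"], f["epss"]), reverse=True)
```

Under this rule a finding with CVSS 7.5 and EPSS 0.9 (score 6.75) outranks one with CVSS 9.8 and EPSS 0.02 (score 0.196), which reflects the point of adding EPSS: severity alone does not say how likely exploitation is.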

CI/CD Integration

Seamless integration with DevSecOps pipelines:

GitHub Actions:

- name: Security Scan
  uses: zen-ai-pentest/action@v2
  with:
    target: ${{ vars.TARGET_URL }}
    fail-on: critical
    format: sarif

GitLab CI:

security-scan:
  image: zen-ai-pentest:latest
  script:
    - zen-ai-pentest --target $TARGET --ci-mode --fail-on high
  artifacts:
    reports:
      sast: gl-sast-report.json

Jenkins:

stage('Security Scan') {
    steps {
        sh 'zen-ai-pentest --target ${TARGET} --ci-mode --fail-on critical'
    }
}

Supported Output Formats:

  • JSON: Machine-readable findings
  • JUnit XML: Test result integration
  • SARIF: Static analysis results format
  • Markdown: Human-readable reports
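
For orientation, a SARIF 2.1.0 log is a JSON object with a `version` field and a list of `runs`, each naming its tool and carrying `results`. The sketch below builds a minimal document of that shape. The input finding fields (`id`, `severity`, `title`), the severity-to-level mapping, and the function name `to_sarif` are assumptions for illustration and may differ from what the tool emits.

```python
import json
from typing import Dict, List

# Assumed mapping from finding severity to SARIF result levels
# (SARIF defines the levels "none", "note", "warning", and "error").
SARIF_LEVELS = {
    "critical": "error",
    "high": "error",
    "medium": "warning",
    "low": "note",
}


def to_sarif(findings: List[Dict], tool_name: str = "zen-ai-pentest") -> Dict:
    """Convert simple finding dicts into a minimal SARIF 2.1.0 document."""
    results = [
        {
            "ruleId": finding["id"],
            "level": SARIF_LEVELS.get(finding["severity"], "none"),
            "message": {"text": finding["title"]},
        }
        for finding in findings
    ]
    return {
        "version": "2.1.0",
        "runs": [
            {
                "tool": {"driver": {"name": tool_name}},
                "results": results,
            }
        ],
    }


if __name__ == "__main__":
    example = [{"id": "SQLI-1", "severity": "high", "title": "SQL injection in /login"}]
    print(json.dumps(to_sarif(example), indent=2))
```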

Exit Codes:

  • 0: Scan passed (no findings above threshold)
  • 1: Findings detected (above threshold)
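
The exit-code rule can be sketched as a threshold check over finding severities. The severity ordering and the choice to treat findings at or above the `--fail-on` level as failures are assumptions; `SEVERITY_ORDER` and `exit_code` are hypothetical names.

```python
from typing import Iterable

# Assumed ordering, from least to most severe.
SEVERITY_ORDER = ["low", "medium", "high", "critical"]


def exit_code(severities: Iterable[str], fail_on: str = "high") -> int:
    """Return 1 if any finding is at or above the fail_on level, else 0."""
    threshold = SEVERITY_ORDER.index(fail_on)
    for severity in severities:
        if SEVERITY_ORDER.index(severity) >= threshold:
            return 1
    return 0
```

A CLI entry point would typically finish with `sys.exit(exit_code(found, fail_on))` so that the pipeline step fails when the threshold is reached.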

Benchmarking

Compare Zen AI against competitors:

# Run full benchmark suite
zen-ai-pentest --benchmark

# Quick benchmark
python -c "import asyncio; from benchmarks import run_quick_benchmark; asyncio.run(run_quick_benchmark())"

Benchmarks Include:

  • HackTheBox machines (Lame, Blue, Legacy)
  • OWASP WebGoat scenarios
  • DVWA test cases
  • OWASP Juice Shop challenges

Metrics:

Metric                | Description
Time to First Finding | Speed of initial vulnerability detection
Time to User          | Initial access achievement time
Time to Root          | Full compromise time
Findings Count        | Total vulnerabilities discovered
False Positive Rate   | Accuracy measurement
Cost per Scan         | API and compute costs

Competitor Comparison:

Tool        | HTB Easy | FP Rate | Cost
Zen AI      | ~45 min  | ~12%    | $0.50
PentestGPT  | ~80 min  | ~28%    | $1.20
AutoPentest | ~120 min | ~35%    | $2.00

Exploit Validation

Safe and controlled exploit testing:

# Validate exploit with safety controls
zen-ai-pentest --validate-exploits --target example.com --exploit-type sqli

Safety Levels:

  • READ_ONLY: Passive validation only
  • VALIDATE_ONLY: Validate without full execution
  • CONTROLLED: Controlled execution with limits (default)
  • FULL: Full exploitation (requires explicit approval)
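
The four levels form an ordered scale, which can be sketched with an `IntEnum` and a gate function. This is an illustration under assumptions: the numeric values, the comparison rule, and the names `SafetyLevel` and `is_allowed` are hypothetical; only the level names, the default, and the approval requirement for FULL come from the list above.

```python
from enum import IntEnum


class SafetyLevel(IntEnum):
    READ_ONLY = 0
    VALIDATE_ONLY = 1
    CONTROLLED = 2
    FULL = 3


def is_allowed(required: SafetyLevel,
               configured: SafetyLevel = SafetyLevel.CONTROLLED,
               approved: bool = False) -> bool:
    """Allow an action only if the configured level covers it.

    FULL additionally needs explicit approval, mirroring the
    'requires explicit approval' note in the list above.
    """
    if required > configured:
        return False
    if required == SafetyLevel.FULL and not approved:
        return False
    return True
```

With the default CONTROLLED level, `is_allowed(SafetyLevel.VALIDATE_ONLY)` passes while `is_allowed(SafetyLevel.FULL)` is refused, even before approval is considered.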

Features:

  • Docker-based sandboxing
  • Evidence collection (screenshots, HTTP captures)
  • Chain of custody tracking
  • Automatic remediation generation

Python API:

import asyncio

from autonomous import ExploitValidator, ExploitType, ScopeConfig

async def main():
    validator = ExploitValidator(
        safety_level="controlled",
        scope_config=ScopeConfig(allowed_hosts=["example.com"])
    )
    # validate() is awaited, so it has to be called from inside a coroutine
    return await validator.validate(
        exploit_code="' OR '1'='1",
        target="https://example.com/login",
        exploit_type=ExploitType.WEB_SQLI
    )

result = asyncio.run(main())

Notifications & Integrations

Slack Notifications:

from integrations import SlackNotifier

slack = SlackNotifier(webhook_url="...")
# Await this inside an async function (for example, one started with asyncio.run)
await slack.notify_scan_completed(results, target="example.com")

JIRA Integration:

from integrations import JiraIntegration

jira = JiraIntegration(server="...", username="...", api_token="...")
# Await this inside an async function (for example, one started with asyncio.run)
ticket = await jira.create_finding_ticket(finding, project_key="SEC")

Supported Integrations:

  • GitHub (Issues, Check Runs)
  • GitLab (Issues, CI/CD)
  • JIRA (Ticket creation)
  • Slack (Notifications)
  • Jenkins (Pipeline triggers)
  • Email (SMTP alerts)
  • Webhooks (Custom endpoints)

📁 Updated Project Structure

zen-ai-pentest/
├── autonomous/                 # Autonomous Agent System
│   ├── agent_loop.py          # ReAct Loop Engine
│   ├── exploit_validator.py   # Exploit Validation
│   ├── memory.py              # Memory Management
│   └── tool_executor.py       # Tool Execution
├── risk_engine/               # Risk Analysis
│   ├── false_positive_engine.py
│   ├── business_impact_calculator.py
│   ├── cvss.py
│   └── epss.py
├── benchmarks/                # Benchmark Framework
│   ├── run_benchmarks.py
│   └── comparison.py
├── integrations/              # CI/CD Integrations
│   ├── github.py
│   ├── gitlab.py
│   ├── jira.py
│   ├── slack.py
│   └── jenkins.py
├── config/                    # Configuration Files
│   ├── autonomous.json
│   ├── risk_engine.json
│   ├── benchmarks.json
│   └── integrations.json
├── api/                       # FastAPI Backend
├── agents/                    # AI Agents
├── database/                  # Database Layer
├── tools/                     # Pentesting Tools
└── ...

👥 Authors & Team

Core Development Team

  • SHAdd0WTAka (@SHAdd0WTAka): Project Founder & Lead Developer, Security Architect
  • Kimi AI: AI Development Partner, Architecture & Design

AI Contributors

  • Kimi AI (Moonshot AI) - Primary AI development partner
    • Led architecture design for autonomous agent loop
    • Implemented Risk Engine with false-positive reduction
    • Created CI/CD integration templates
    • Developed benchmarking framework
    • Co-authored documentation and roadmaps

Special Thanks

  • Grok (xAI) - Strategic analysis and competitive research
  • GitHub Copilot - Code assistance and suggestions
  • Security Community - Feedback, bug reports, and feature requests

Contributing

We welcome contributions! See CONTRIBUTORS.md and CONTRIBUTING.md for details.


📞 Support


Made with ❤️ for the security community
© 2026 Zen-AI-Pentest. All rights reserved.
