AI-powered penetration testing via MCP — 21 security tools, PTES methodology, multi-provider
numasec
AI pentester that actually finds vulnerabilities. Open source. Runs in your terminal.
96% vulnerability recall on OWASP Juice Shop · 10 specialized agents · 21 security tools · PTES methodology
Quickstart
pip install numasec
numasec
Or with Docker:
docker run -it francescosta/numasec
Or from source:
curl -fsSL https://numasec.dev/install | bash
Type /target https://yourapp.com and watch it work. The AI scans, finds vulnerabilities, chains attacks together, and writes the report. You watch, approve, and steer.
Works with Claude, GPT-4, Gemini, DeepSeek, Mistral, or any OpenAI-compatible model.
Why numasec
Most "AI security tools" wrap a single scanner and call it AI. numasec is different — it's a team of 10 specialized agents running 21 offensive security tools through an actual penetration testing methodology.
It doesn't just find vulnerabilities. It chains them: a leaked API key in JavaScript → SSRF → cloud metadata → account takeover. Then it writes a professional report with CVSS scores, CWE IDs, OWASP categories, and remediation guidance.
Benchmarked against real targets:
| Target | Vulnerabilities Found | Coverage |
|---|---|---|
| OWASP Juice Shop v17 | 25/26 ground-truth vulns | 96% recall |
| DVWA | 7/7 vulnerability categories | 100% |
| WebGoat | 20+ vulnerabilities across all modules | Full coverage |
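The recall figure in the table is the number of vulnerabilities found divided by the number of ground-truth vulnerabilities. A minimal sketch of that arithmetic follows; the function name is illustrative and not part of numasec.

```python
def recall(found: int, ground_truth: int) -> float:
    """Fraction of known (ground-truth) vulnerabilities that were detected."""
    if ground_truth <= 0:
        raise ValueError("ground_truth must be positive")
    return found / ground_truth


# Juice Shop row: 25 of 26 ground-truth vulnerabilities detected.
print(f"{recall(25, 26):.0%}")  # prints 96%
```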
What it finds
| Injection | Authentication & Access | Client & Server Side |
|---|---|---|
Every finding is auto-enriched with CWE ID, CVSS 3.1 score, OWASP Top 10 category, MITRE ATT&CK technique, and actionable remediation guidance.
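As a rough illustration of what auto-enrichment could look like, the sketch below attaches classification metadata to a raw finding from a lookup table. The CWE, OWASP Top 10 (2021), and ATT&CK identifiers are real, and the severity bands follow the CVSS 3.1 qualitative rating scale. The `Finding` class, the table layout, and the default scores are assumptions for this example, not numasec's internals.

```python
from dataclasses import dataclass

# Illustrative lookup table. The identifiers are real; the layout and the
# default CVSS scores are assumptions made for this sketch.
ENRICHMENT = {
    "sqli": {"cwe": "CWE-89", "owasp": "A03:2021 Injection",
             "attack": "T1190", "cvss": 9.8},
    "ssrf": {"cwe": "CWE-918", "owasp": "A10:2021 Server-Side Request Forgery",
             "attack": "T1190", "cvss": 8.6},
}


def cvss_severity(score: float) -> str:
    """Map a CVSS 3.1 base score to its qualitative severity rating."""
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


@dataclass
class Finding:
    kind: str
    url: str
    cwe: str = ""
    owasp: str = ""
    attack: str = ""
    cvss: float = 0.0
    severity: str = "None"


def enrich(finding: Finding) -> Finding:
    """Fill in classification fields when the finding kind is known."""
    meta = ENRICHMENT.get(finding.kind)
    if meta is None:
        return finding  # unknown kinds pass through unchanged
    finding.cwe = meta["cwe"]
    finding.owasp = meta["owasp"]
    finding.attack = meta["attack"]
    finding.cvss = meta["cvss"]
    finding.severity = cvss_severity(finding.cvss)
    return finding


print(enrich(Finding("sqli", "https://example.test/login")))
```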
Multi-Agent Architecture
numasec isn't a single bot — it's a coordinated team of specialized agents, each with distinct roles and permissions:
Primary Agents
| Agent | Role | What it does |
|---|---|---|
| 🔴 pentest | Full PTES methodology | Recon → Discovery → Vuln Assessment → Exploitation → Reporting |
| 🔵 recon | Intelligence gathering | Port scanning, fingerprinting, subdomain enum, service probing — no exploitation |
| 🟠 hunt | OWASP Top 10 hunter | Systematic, aggressive testing across all 10 OWASP categories |
| 🟡 review | Secure code review | Static analysis of source code, diffs, commits, PRs |
| 🟢 report | Report & findings | Finding management, severity validation, report generation |
Subagents
| Agent | Role |
|---|---|
| scanner | Executes automated vulnerability scans (passive → semi-active → active) |
| analyst | Validates results, eliminates false positives, correlates attack chains |
| reporter | Generates SARIF / Markdown / HTML / JSON reports |
| explore | CVE research, exploit documentation, knowledge base queries |
Each agent has tailored permissions — the recon agent can't run exploits, the review agent can't launch scanners. The analyst agent filters false positives using strict evidence criteria before any finding enters the report.
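One way to picture per-agent permissions is a capability table that is checked before any tool runs. The sketch below is hypothetical: only the recon and review restrictions come from the description above, and the capability names and the other rows are assumptions.

```python
from enum import Flag, auto


class Capability(Flag):
    RECON = auto()      # port scans, fingerprinting, enumeration
    SCAN = auto()       # automated vulnerability scanners
    EXPLOIT = auto()    # active exploitation
    READ_CODE = auto()  # static review of source and diffs
    REPORT = auto()     # finding management and report generation


# Hypothetical permission table; only the recon and review restrictions
# are stated in the README, the rest is assumed for illustration.
PERMISSIONS = {
    "pentest": Capability.RECON | Capability.SCAN | Capability.EXPLOIT | Capability.REPORT,
    "recon": Capability.RECON,
    "hunt": Capability.RECON | Capability.SCAN | Capability.EXPLOIT,
    "review": Capability.READ_CODE,
    "report": Capability.REPORT,
}


def is_allowed(agent: str, capability: Capability) -> bool:
    """Return True if the named agent holds the requested capability."""
    return bool(PERMISSIONS.get(agent, Capability(0)) & capability)


print(is_allowed("recon", Capability.EXPLOIT))    # False
print(is_allowed("pentest", Capability.EXPLOIT))  # True
```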
Security Tooling
21 purpose-built security tools and 38 async scanners under the hood — covering reconnaissance, injection testing, authentication attacks, access control, file upload bypass, race conditions, request smuggling, out-of-band detection, and more. The AI selects and orchestrates them automatically based on what it discovers about your target.
A built-in knowledge base of 34 templates covers detection patterns, exploitation techniques, payloads, and remediation, so the AI looks up attack methodology instead of hallucinating it. It is extensible with your own templates and plugins.
Reports
Four output formats, all auto-generated:
| Format | Use case |
|---|---|
| SARIF | Drop into GitHub Code Scanning, GitLab SAST, or any SARIF viewer |
| HTML | Self-contained report to share with your team |
| Markdown | Paste into tickets, docs, or wikis |
| JSON | Feed into your pipeline or dashboard |
Every report includes an executive summary with risk score (0-100), severity breakdown, OWASP coverage matrix, attack chain documentation, and per-finding remediation.
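For readers unfamiliar with SARIF, the sketch below shows roughly how a list of findings maps onto a minimal SARIF 2.1.0 document. The top-level keys (`version`, `runs`, `tool.driver`, `results`) follow the SARIF specification. The shape of the input findings and the severity-to-level mapping are assumptions, and numasec's actual output will contain more detail.

```python
import json


def to_sarif(findings: list[dict]) -> dict:
    """Build a minimal SARIF 2.1.0 document from simple finding dicts."""
    # Assumed mapping from severity labels to SARIF result levels.
    level_for = {"critical": "error", "high": "error",
                 "medium": "warning", "low": "note"}
    rules: dict[str, dict] = {}
    results = []
    for f in findings:
        # One rule per CWE; repeated findings reuse the same rule entry.
        rules.setdefault(f["cwe"], {
            "id": f["cwe"],
            "shortDescription": {"text": f["title"]},
        })
        results.append({
            "ruleId": f["cwe"],
            "level": level_for.get(f["severity"], "warning"),
            "message": {"text": f["title"]},
            "locations": [{
                "physicalLocation": {"artifactLocation": {"uri": f["url"]}},
            }],
        })
    return {
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "version": "2.1.0",
        "runs": [{
            "tool": {"driver": {"name": "numasec", "rules": list(rules.values())}},
            "results": results,
        }],
    }


example = [{
    "title": "SQL injection in login form",
    "cwe": "CWE-89",
    "severity": "critical",
    "url": "https://example.test/login",
}]
print(json.dumps(to_sarif(example), indent=2))
```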
OWASP Top 10 Coverage
The TUI header tracks real-time testing coverage across all 10 OWASP categories as the pentest progresses. Each category is automatically mapped to the relevant tools — so you always know what's been tested and what's left.
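The category-to-tool mapping can be thought of as a lookup that turns the list of tools already run into a coverage matrix. The sketch below assumes the OWASP Top 10 2021 category IDs and uses made-up tool names; the real mapping lives inside numasec.

```python
# OWASP Top 10 category IDs A01..A10; which edition numasec tracks is an assumption.
OWASP_CATEGORIES = [f"A{n:02d}" for n in range(1, 11)]

# Hypothetical tool names mapped to the categories they exercise.
TOOL_CATEGORIES = {
    "access_control_test": {"A01"},
    "injection_test": {"A03"},
    "auth_test": {"A07"},
    "ssrf_test": {"A10"},
}


def coverage(tools_run: list[str]) -> dict[str, bool]:
    """Return, per category, whether at least one mapped tool has run."""
    tested: set[str] = set()
    for tool in tools_run:
        tested |= TOOL_CATEGORIES.get(tool, set())
    return {category: category in tested for category in OWASP_CATEGORIES}


matrix = coverage(["injection_test", "auth_test"])
print(f"{sum(matrix.values())}/{len(matrix)} categories tested")  # 2/10 categories tested
```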
Installation
pip (recommended)
pip install numasec
numasec
Downloads the TUI binary automatically on first run. No Bun, Node, or other runtime needed.
Docker
docker run -it francescosta/numasec
Full TUI + all 21 security tools. Multi-arch (amd64, arm64).
From source
curl -fsSL https://numasec.dev/install | bash
Or manually:
git clone https://github.com/FrancescoStabile/numasec.git
cd numasec
pip install -e ".[all]" # Python backend
cd agent && bun install && bun run build # TUI
Usage
numasec # Start interactive TUI
Slash Commands
| Command | Description |
|---|---|
| /target <url> | Set target and begin reconnaissance |
| /findings | List all discovered vulnerabilities |
| /report <format> | Generate report (markdown, html, sarif, json) |
| /coverage | Show OWASP Top 10 coverage matrix |
| /creds | List discovered credentials |
| /evidence <id> | Show evidence for a specific finding |
| /review | Security review of code changes |
| /init | Analyze app and create security profile |
Agent Modes
Switch between agents for different tasks:
- pentest — full methodology, default
- recon — reconnaissance only, no exploitation
- hunt — aggressive OWASP Top 10 testing
- review — secure code review (no network scanning)
- report — finding management and deliverables
LLM Providers
| Provider | Models |
|---|---|
| Anthropic | Claude Opus, Sonnet, Haiku |
| OpenAI | GPT-4o, GPT-4, o1 |
| Google | Gemini Pro, Flash |
| AWS Bedrock | Claude, Llama |
| Azure OpenAI | GPT-4, GPT-4o |
| Mistral | Large, Medium |
| DeepSeek | V2, Coder |
| OpenRouter | Any model via aggregation |
| GitHub Copilot | Copilot models |
| Google Vertex | Gemini via Vertex |
| GitLab | GitLab models |
Development
pip install -e ".[all]"
# Tests (1273 unit + 3 benchmark suites)
pytest tests/ -v
pytest tests/ -m "not slow and not benchmark" # fast run
# Lint & type check
ruff check numasec/
ruff format numasec/
mypy numasec/
# TypeScript TUI
cd agent && bun install
cd packages/numasec && bun run typecheck
bun test # run from packages/numasec, where the previous command leaves the shell
Benchmarks
# Juice Shop (96% recall)
JUICE_SHOP_URL=http://localhost:3000 pytest tests/benchmarks/test_juice_shop.py -v
# DVWA (100% coverage)
DVWA_TARGET=http://localhost:8080 pytest tests/benchmarks/test_dvwa.py -v
# WebGoat
WEBGOAT_TARGET=http://localhost:8081/WebGoat pytest tests/benchmarks/test_webgoat.py -v
Extend with plugins
Drop a Python file with a register(registry) function into ~/.numasec/plugins/ or a YAML scanner template into ~/.numasec/templates/.
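A plugin file might look roughly like the sketch below. Only the `register(registry)` entry point comes from the description above. The `add_scanner` method, the scanner signature, and the file name are hypothetical, so check the registry object numasec actually passes in before relying on them. `FakeRegistry` exists only to make the example runnable on its own.

```python
"""Hypothetical plugin file, e.g. ~/.numasec/plugins/security_headers.py."""

RECOMMENDED_HEADERS = (
    "Content-Security-Policy",
    "Strict-Transport-Security",
    "X-Content-Type-Options",
)


def missing_security_headers(headers: dict[str, str]) -> list[str]:
    """Return the recommended response headers absent from `headers`."""
    present = {name.lower() for name in headers}
    return [name for name in RECOMMENDED_HEADERS if name.lower() not in present]


def register(registry) -> None:
    # ASSUMPTION: `add_scanner` is a made-up method name; the real registry
    # may expose a different interface.
    registry.add_scanner("missing_security_headers", missing_security_headers)


class FakeRegistry:
    """Stand-in used only to exercise register() in this sketch."""

    def __init__(self) -> None:
        self.scanners = {}

    def add_scanner(self, name, func) -> None:
        self.scanners[name] = func


registry = FakeRegistry()
register(registry)
scanner = registry.scanners["missing_security_headers"]
print(scanner({"X-Content-Type-Options": "nosniff"}))
# ['Content-Security-Policy', 'Strict-Transport-Security']
```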
How it works
┌─────────────────────────────────────────────────────────────┐
│                        Terminal TUI                         │
│   (TypeScript/Bun • SolidJS reactive UI • 5 agent modes)    │
└────────────────────────────┬────────────────────────────────┘
                             │
┌────────────────────────────▼────────────────────────────────┐
│                       Security Engine                       │
│  ┌─────────────┐  ┌───────────────┐  ┌───────────────────┐  │
│  │ 21 Security │  │ 34 Knowledge  │  │   Session Store   │  │
│  │    Tools    │  │ Base Templates│  │                   │  │
│  └──────┬──────┘  └───────────────┘  └───────────────────┘  │
│         │                                                   │
│  ┌──────▼────────────────────────────────────────────────┐  │
│  │                       38 Skills                       │  │
│  │      Injection · Auth · Access · Recon · Fuzzing      │  │
│  │     Client-side · Server-side · Out-of-band · ...     │  │
│  └───────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────┘
The TUI drives the AI conversation. The AI calls security tools. Each tool orchestrates one or more async scanners. Findings are auto-enriched (CWE → CVSS → OWASP → MITRE ATT&CK), deduplicated, and grouped into attack chains. Reports are generated from the session store.
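The deduplication step can be sketched as collapsing findings that share a type, URL, and parameter, keeping the highest-scoring one. The key fields and the tie-breaking rule here are assumptions; numasec's own criteria may differ.

```python
def deduplicate(findings: list[dict]) -> list[dict]:
    """Collapse findings with the same type, URL and parameter, keeping the highest CVSS."""
    best: dict[tuple, dict] = {}
    for finding in findings:
        # Trailing slashes are ignored so /login and /login/ count as one endpoint.
        key = (finding["type"], finding["url"].rstrip("/"), finding.get("parameter"))
        current = best.get(key)
        if current is None or finding["cvss"] > current["cvss"]:
            best[key] = finding
    return list(best.values())


raw = [
    {"type": "sqli", "url": "https://example.test/login", "parameter": "user", "cvss": 7.5},
    {"type": "sqli", "url": "https://example.test/login/", "parameter": "user", "cvss": 9.8},
    {"type": "xss", "url": "https://example.test/search", "parameter": "q", "cvss": 6.1},
]
for finding in deduplicate(raw):
    print(finding["type"], finding["cvss"])
# sqli 9.8
# xss 6.1
```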
No hallucinated methodology. The knowledge base provides real detection patterns, exploitation techniques, and payloads. The deterministic planner (based on the CHECKMATE paper) selects tests based on detected technologies — no LLM involved in test selection.
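To illustrate what deterministic, technology-driven test selection means in practice, the sketch below maps detected technologies to tests through a fixed rule table. This is not the CHECKMATE algorithm or numasec's planner; the rules and test names are invented for the example. The point is only that the same inputs always yield the same plan, with no model call involved.

```python
# Tests that always run, regardless of detected technologies (assumed).
BASELINE_TESTS = ["security_headers", "xss"]

# Hypothetical rule table: detected technology -> additional tests.
RULES = {
    "php": ["sqli", "lfi", "file_upload"],
    "graphql": ["graphql_introspection"],
    "jwt": ["jwt_alg_none", "jwt_weak_secret"],
    "nodejs": ["nosql_injection", "prototype_pollution"],
}


def plan(technologies: set[str]) -> list[str]:
    """Return an ordered, duplicate-free test plan for the detected technologies."""
    tests = list(BASELINE_TESTS)
    # Sorting makes the output independent of set iteration order.
    for tech in sorted(t.lower() for t in technologies):
        for test in RULES.get(tech, []):
            if test not in tests:
                tests.append(test)
    return tests


print(plan({"PHP", "JWT"}))
# ['security_headers', 'xss', 'jwt_alg_none', 'jwt_weak_secret', 'sqli', 'lfi', 'file_upload']
```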
Built by Francesco Stabile.