Open-source, local-first AI pentesting agent platform with self-learning capabilities
```
███████╗███████╗██████╗ █████╗ ██████╗ ██╗ ██╗
██╔════╝██╔════╝██╔══██╗██╔══██╗██╔══██╗██║ ██║
███████╗█████╗ ██████╔╝███████║██████╔╝███████║
╚════██║██╔══╝ ██╔══██╗██╔══██║██╔═══╝ ██╔══██║
███████║███████╗██║ ██║██║ ██║██║ ██║ ██║
╚══════╝╚══════╝╚═╝ ╚═╝╚═╝ ╚═╝╚═╝ ╚═╝ ╚═╝
```
**The Claude Code of penetration testing.**
---
Seraph is an AI pentest agent that runs in your terminal. Point it at a target, and it plans, scans, exploits, and escalates, asking for your input between phases and streaming every tool call and finding in real time.
It learns from every engagement. Retrieval over the knowledge base (Qdrant + Neo4j + MITRE ATT&CK) keeps improving through LoRA fine-tuning of the embedding model on your retrieval feedback, so the tenth machine in a class goes faster than the first.
```
seraph> 10.10.10.3
[*] Starting engagement against 10.10.10.3
[recon / recon]
▸ nmap -sV -sC -oX - 10.10.10.3
✓ nmap (14.2s)
[INFO ] SSH on port 22/tcp (OpenSSH 7.4)
[INFO ] HTTP on port 80/tcp (Apache 2.4.6)
[MEDIUM ] Samba 3.0.20 on port 445/tcp
seraph> exploit the SMB service, it looks like CVE-2007-2447
[exploit / exploit]
▸ metasploit exploit/multi/samba/usermap_script RHOST=10.10.10.3
✓ metasploit (8.7s)
[CRITICAL] Remote code execution — root shell obtained
[+] Flags: d9e493... (root)
```
---
## Install
**Requirements:** Python 3.12+, Docker, an [Anthropic API key](https://console.anthropic.com/)
```bash
pip install seraph-suite
```
Or with uv (recommended):
```bash
uv tool install seraph-suite
```
Then run the one-time setup:
```bash
seraph setup
```
Setup will:
- Create `.env` and prompt for your API key
- Pull and start the Docker services (Qdrant, Neo4j, Redis)
- Download and ingest the MITRE ATT&CK knowledge base
---
## Usage
### Interactive REPL
```bash
seraph
```
Type a target IP or hostname to start. Type anything mid-engagement to steer the agent.
```
seraph> 10.10.11.42
seraph> focus on the web service, port 80
seraph> findings
seraph> status
seraph> clear
seraph> quit
```
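The REPL accepts targets, built-in commands, and free-form steering text on the same prompt. As a rough illustration of how one input line could be told apart, here is a minimal sketch. The function name `classify_input`, the category labels, and the rule that a hostname needs at least one dot are all assumptions made for this example, not Seraph's actual parser.

```python
import ipaddress
import re

# Built-in commands taken from the REPL example above.
BUILTINS = {"findings", "status", "clear", "quit"}

# Hypothetical rule: a hostname is two or more dot-separated labels.
# Requiring a dot keeps single words such as "continue" out of this bucket.
_LABEL = r"[a-z0-9](?:[a-z0-9-]*[a-z0-9])?"
HOSTNAME_RE = re.compile(rf"{_LABEL}(?:\.{_LABEL})+", re.IGNORECASE)


def classify_input(line: str) -> str:
    """Return 'empty', 'command', 'target', or 'instruction' for one REPL line."""
    text = line.strip()
    if not text:
        return "empty"
    if text.lower() in BUILTINS:
        return "command"
    try:
        ipaddress.ip_address(text)  # accepts IPv4 and IPv6 literals
        return "target"
    except ValueError:
        pass
    if HOSTNAME_RE.fullmatch(text):
        return "target"
    # Anything else is treated as free-form steering for the agent.
    return "instruction"


examples = ["10.10.11.42", "focus on the web service, port 80", "findings"]
labels = [classify_input(e) for e in examples]
```

Checking built-ins before hostnames matters in this sketch, because a word like `status` would otherwise be a plausible single-label hostname.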
### Quick-start against a target
```bash
seraph -t 10.10.10.3
```
### HTB benchmarking
```bash
# Single machine
seraph bench --machine Lame --timeout 3600
# All Easy machines with report
seraph bench --difficulty Easy --all --report --output reports/easy.md
```
### Knowledge base ingestion
```bash
# NVD CVE feed
seraph ingest nvd --year 2024
# MITRE ATT&CK (auto-downloads the STIX bundle)
seraph ingest mitre --download
# ExploitDB (clone the mirror first)
git clone https://gitlab.com/exploit-database/exploitdb ./data/exploitdb
seraph ingest exploitdb
# Your own CTF writeups (Markdown)
seraph ingest writeups ./data/writeups/
# Check ingestion stats
seraph ingest stats
```
### Sandbox isolation
Run all tool invocations inside isolated Docker containers (Manus-style):
```bash
SANDBOX_ENABLED=true seraph -t 10.10.10.3
# Pre-build the agent image
make sandbox-build
```
---
## How it works
```
You
│ type target / instruction
▼
Orchestrator ──── Claude Opus (planning)
│
├── Recon Agent → nmap, gobuster, curl
├── Exploit Agent → metasploit, sqlmap, hydra
├── Privesc Agent → linpeas, sudo checks, SUID
├── CTF Agent → flag hunting, stego, web challenges
└── Memorist → logs which KB docs helped
│
▼
Knowledge Base
├── Qdrant (BM25 + dense hybrid search, RRF fusion)
├── Neo4j (MITRE ATT&CK graph, CVE → technique links)
└── SQLite (sessions, feedback, ingestion state)
│
▼
Self-learning loop
└── feedback → hard negatives → triplets → LoRA fine-tune
```
**Retrieval pipeline** — every KB query runs:
1. BM25 sparse search (exact CVE IDs, tool names)
2. Dense semantic search (nomic-embed-text-v1.5, local)
3. RRF fusion
4. Neo4j graph traversal (expands CVE → linked techniques)
5. Cross-encoder reranking (bge-reranker-v2-m3, local)
All embeddings are computed locally — no API calls for embeddings.
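The fusion step (step 3) can be illustrated with standard Reciprocal Rank Fusion. This is a minimal sketch of the general technique, not Seraph's code: the function name `rrf_fuse`, the constant `k = 60` (a common default in the literature), and the example document IDs are assumptions.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked lists of document IDs with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    where rank starts at 1 for the best hit.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the order is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists from the sparse and dense searches.
bm25_hits = ["CVE-2007-2447", "CVE-2017-7494", "T1210"]
dense_hits = ["T1210", "CVE-2007-2447", "T1021.002"]

fused = rrf_fuse([bm25_hits, dense_hits])
fused_ids = [doc_id for doc_id, _ in fused]
```

RRF uses only rank positions, so BM25 scores and dense similarity scores never need to be put on a common scale. Documents that appear in both lists rise to the top.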
---
## Configuration
All settings come from `.env`. Copy `.env.example` to get started.
| Variable | Default | Description |
|---|---|---|
| `ANTHROPIC_API_KEY` | — | **Required.** Your Anthropic key |
| `QDRANT_URL` | `http://localhost:6333` | Qdrant vector store |
| `NEO4J_URI` | `bolt://localhost:7687` | Neo4j graph store |
| `NEO4J_PASSWORD` | `seraph_secret` | Neo4j password |
| `REDIS_URL` | `redis://localhost:6379/0` | Celery broker |
| `SANDBOX_ENABLED` | `false` | Docker tool isolation |
| `DENSE_EMBEDDING_MODEL` | `nomic-ai/nomic-embed-text-v1.5` | Local embedding model |
| `RERANKER_MODEL` | `BAAI/bge-reranker-v2-m3` | Local reranker model |
| `LOG_LEVEL` | `INFO` | Log verbosity |
Services can be managed with:
```bash
make up # start Qdrant + Neo4j + Redis
make down # stop all services
make dev # start with dev overrides
```
---
## Agents
| Agent | What it does | Tools |
|---|---|---|
| **Orchestrator** | Plans phases, dispatches sub-agents | — |
| **Recon** | Port scanning, service fingerprinting | nmap, gobuster, curl |
| **Exploit** | CVE matching, initial access | metasploit, sqlmap, hydra |
| **Privesc** | Privilege escalation | linpeas, custom checks |
| **CTF** | Flag hunting, stego, web challenges | gobuster, curl |
| **Memorist** | Logs KB feedback for self-learning | — |
When more than 20 tools are available, agents use RAG-based tool selection instead of passing all tools to the LLM.
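The threshold behavior can be sketched as follows. This is a simplified stand-in, not Seraph's implementation: it ranks tools by token overlap (Jaccard similarity) between the task and each tool description, whereas a real RAG selector would presumably use embeddings. The names `select_tools` and `top_k` and the demo catalog are hypothetical.

```python
import re

TOOL_THRESHOLD = 20  # from the text above: retrieval kicks in above 20 tools


def _tokens(text: str) -> set[str]:
    """Lowercased alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def select_tools(task: str, tools: dict[str, str], top_k: int = 5) -> list[str]:
    """Return the tool names to expose to the LLM for this task.

    `tools` maps tool name to a short description.
    """
    if len(tools) <= TOOL_THRESHOLD:
        return list(tools)  # small catalog: pass everything through

    task_tokens = _tokens(task)

    def score(name: str) -> float:
        doc_tokens = _tokens(name + " " + tools[name])
        union = task_tokens | doc_tokens
        return len(task_tokens & doc_tokens) / len(union) if union else 0.0

    # Best match first; ties broken by name so the result is deterministic.
    ranked = sorted(tools, key=lambda name: (-score(name), name))
    return ranked[:top_k]


# Hypothetical catalog: three real tools plus filler to exceed the threshold.
catalog = {
    "nmap": "port scanning and service fingerprinting",
    "sqlmap": "sql injection detection and exploitation",
    "hydra": "password brute force for network logins",
}
catalog.update({f"tool{i}": "placeholder utility" for i in range(20)})

picked = select_tools("test the login form for sql injection", catalog, top_k=1)
```

Keeping the pass-through branch for small catalogs avoids a retrieval step when the full tool list is short enough to send to the LLM directly.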
---
## Self-learning
Every engagement makes Seraph better:
1. Memorist logs which retrieved documents the LLM cited vs ignored
2. Hard negatives mined from keyword-similar but semantically wrong retrievals
3. Triplets `(query, positive, negative)` accumulated in SQLite
4. LoRA adapter trained on `nomic-embed-text-v1.5` when enough triplets accumulate
5. Projection layer applied at query time — no need to re-embed the entire corpus
Retrieval quality improves measurably after ~50 engagements on similar machine classes.
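Steps 1 to 3 can be sketched as a small triplet-mining function. This is an illustration under stated assumptions, not Seraph's implementation: "keyword-similar but semantically wrong" is approximated here as "ranked highly by BM25 but not cited by the LLM", and the `Retrieval` record, `mine_triplets`, and `max_negatives` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Retrieval:
    doc_id: str
    text: str
    bm25_rank: int  # 1 = strongest keyword match
    cited: bool     # True if the LLM cited this document


def mine_triplets(
    query: str, retrievals: list[Retrieval], max_negatives: int = 2
) -> list[tuple[str, str, str]]:
    """Build (query, positive, negative) triplets from one query's feedback."""
    positives = [r for r in retrievals if r.cited]
    # Hard negatives: ignored documents that still matched the keywords well.
    ignored = sorted((r for r in retrievals if not r.cited), key=lambda r: r.bm25_rank)
    negatives = ignored[:max_negatives]
    return [(query, p.text, n.text) for p in positives for n in negatives]


feedback = [
    Retrieval("a", "Samba usermap_script command injection (CVE-2007-2447)", 2, True),
    Retrieval("b", "Samba is_known_pipename RCE (CVE-2017-7494)", 1, False),
    Retrieval("c", "Apache mod_status information leak", 5, False),
    Retrieval("d", "OpenSSH user enumeration", 3, False),
]

triplets = mine_triplets("Samba 3.0.20 exploit", feedback)
```

The resulting triplets are the usual input for contrastive or triplet-loss training of an embedding model, which is where the LoRA adapter in step 4 would come in.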
---
## Testing
```bash
# Unit tests (no services needed)
make test-unit
# All tests + coverage report
make test
# Integration tests (requires services running)
make up && make test-integration
# Sandbox tests (requires Docker + agent image)
make sandbox-build && make sandbox-test
```
Coverage is enforced at 80%+.
---
## Dashboard
A FastAPI + React 18 dashboard is available:
```bash
make api-dev # API at http://localhost:8000/docs
make dashboard-dev # UI at http://localhost:5173
```
---
## Contributing
Issues and PRs are welcome. Please open an issue before a large PR to align on direction.
1. Fork and create a branch
2. Follow the code conventions: type hints everywhere, Pydantic v2, async I/O, structlog
3. Write tests first — 80% coverage minimum
4. `make lint && make test-unit` before pushing
---
## License
MIT — see [LICENSE](LICENSE).
---
Built by Maciej · Powered by Anthropic Claude