Open-source, local-first AI pentesting agent platform with self-learning capabilities

  ███████╗███████╗██████╗  █████╗ ██████╗ ██╗  ██╗
  ██╔════╝██╔════╝██╔══██╗██╔══██╗██╔══██╗██║  ██║
  ███████╗█████╗  ██████╔╝███████║██████╔╝███████║
  ╚════██║██╔══╝  ██╔══██╗██╔══██║██╔═══╝ ██╔══██║
  ███████║███████╗██║  ██║██║  ██║██║     ██║  ██║
  ╚══════╝╚══════╝╚═╝  ╚═╝╚═╝  ╚═╝╚═╝     ╚═╝  ╚═╝

The Claude Code of penetration testing.


Seraph is an AI pentest agent that runs in your terminal. Point it at a target, and it plans, scans, exploits, and escalates, asking for your input between phases and streaming every tool call and finding in real time.

It learns from every engagement. Retrieval over the knowledge base (Qdrant + Neo4j + MITRE ATT&CK) improves over time via LoRA fine-tuning of the embedding model on your retrieval feedback, so the tenth machine in a class should go faster than the first.

  seraph> 10.10.10.3
  [*] Starting engagement against 10.10.10.3

    [recon / recon]
    ▸ nmap -sV -sC -oX - 10.10.10.3
    ✓ nmap (14.2s)

    [INFO    ]  SSH on port 22/tcp (OpenSSH 7.4)
    [INFO    ]  HTTP on port 80/tcp (Apache 2.4.6)
    [MEDIUM  ]  Samba 3.0.20 on port 445/tcp

  seraph> exploit the SMB service, it looks like CVE-2007-2447

    [exploit / exploit]
    ▸ metasploit exploit/multi/samba/usermap_script RHOST=10.10.10.3
    ✓ metasploit (8.7s)

    [CRITICAL]  Remote code execution — root shell obtained

  [+] Flags: d9e493...  (root)

Install

Requirements: Python 3.12+, Docker, an Anthropic API key

pip install seraph-suite

Or with uv (recommended):

uv tool install seraph-suite

Then run the one-time setup:

seraph setup

Setup will:

  • Create .env and prompt for your API key
  • Pull and start the Docker services (Qdrant, Neo4j, Redis)
  • Download and ingest the MITRE ATT&CK knowledge base

Usage

Interactive REPL

seraph

Type a target IP or hostname to start. Type anything mid-engagement to steer the agent.

  seraph> 10.10.11.42
  seraph> focus on the web service, port 80
  seraph> findings
  seraph> status
  seraph> clear
  seraph> quit

Quick-start against a target

seraph -t 10.10.10.3

HTB benchmarking

# Single machine
seraph bench --machine Lame --timeout 3600

# All Easy machines with report
seraph bench --difficulty Easy --all --report --output reports/easy.md

Knowledge base ingestion

# NVD CVE feed
seraph ingest nvd --year 2024

# MITRE ATT&CK (auto-downloads the STIX bundle)
seraph ingest mitre --download

# ExploitDB (clone the mirror first)
git clone https://gitlab.com/exploit-database/exploitdb ./data/exploitdb
seraph ingest exploitdb

# Your own CTF writeups (Markdown)
seraph ingest writeups ./data/writeups/

# Check ingestion stats
seraph ingest stats

Sandbox isolation

Run all tool invocations inside isolated Docker containers (Manus-style):

SANDBOX_ENABLED=true seraph -t 10.10.10.3

# Pre-build the agent image
make sandbox-build
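
As a rough illustration of what per-tool isolation can look like, the sketch below builds a `docker run` command line for a single tool call. The image name `seraph-agent:latest`, the helper `build_sandbox_argv`, and the choice of flags are assumptions made for this example; they are not taken from Seraph's code.

```python
import shlex

# Hypothetical image tag; the tag produced by `make sandbox-build` may differ.
SANDBOX_IMAGE = "seraph-agent:latest"


def build_sandbox_argv(tool: str, args: list[str], image: str = SANDBOX_IMAGE) -> list[str]:
    """Return an argv list that runs `tool` inside a throwaway container."""
    return [
        "docker", "run",
        "--rm",               # remove the container once the tool exits
        "--network", "host",  # assumption: the tool must reach the target network
        image,
        tool, *args,
    ]


argv = build_sandbox_argv("nmap", ["-sV", "-sC", "-oX", "-", "10.10.10.3"])
print(shlex.join(argv))
```

Building the command as a list (rather than a shell string) means it can be handed to `subprocess.run` without shell quoting concerns.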

How it works

 You
  │   type target / instruction
  ▼
 Orchestrator  ──── Claude Opus (planning)
  │
  ├── Recon Agent    → nmap, gobuster, curl
  ├── Exploit Agent  → metasploit, sqlmap, hydra
  ├── Privesc Agent  → linpeas, sudo checks, SUID
  ├── CTF Agent      → flag hunting, stego, web challenges
  └── Memorist       → logs which KB docs helped
         │
         ▼
  Knowledge Base
  ├── Qdrant   (BM25 + dense hybrid search, RRF fusion)
  ├── Neo4j    (MITRE ATT&CK graph, CVE → technique links)
  └── SQLite   (sessions, feedback, ingestion state)
         │
         ▼
  Self-learning loop
  └── feedback → hard negatives → triplets → LoRA fine-tune

Retrieval pipeline — every KB query runs:

  1. BM25 sparse search (exact CVE IDs, tool names)
  2. Dense semantic search (nomic-embed-text-v1.5, local)
  3. RRF fusion
  4. Neo4j graph traversal (expands CVE → linked techniques)
  5. Cross-encoder reranking (bge-reranker-v2-m3, local)

All embeddings are computed locally; no embedding API calls are made.
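
Step 3 of the pipeline can be sketched in a few lines. Reciprocal rank fusion gives each document a score of 1 / (k + rank) per ranked list and sums the scores across lists. The function name `rrf_fuse`, the constant k = 60 (a commonly used value), and the document IDs are assumptions for illustration; Seraph's actual fusion code may differ.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked lists of document IDs with reciprocal rank fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical document IDs from the sparse and dense searches.
bm25_hits = ["CVE-2007-2447", "samba-usermap", "smb-enum"]
dense_hits = ["samba-usermap", "linux-privesc", "CVE-2007-2447"]

fused = rrf_fuse([bm25_hits, dense_hits])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.5f}")
```

Because the fusion uses only ranks, the BM25 and dense scores never need to be put on a common scale.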


Configuration

All settings come from .env. Copy .env.example to get started.

Variable               Default                         Description
ANTHROPIC_API_KEY      (none)                          Required. Your Anthropic key
QDRANT_URL             http://localhost:6333           Qdrant vector store
NEO4J_URI              bolt://localhost:7687           Neo4j graph store
NEO4J_PASSWORD         seraph_secret                   Neo4j password
REDIS_URL              redis://localhost:6379/0        Celery broker
SANDBOX_ENABLED        false                           Docker tool isolation
DENSE_EMBEDDING_MODEL  nomic-ai/nomic-embed-text-v1.5  Local embedding model
RERANKER_MODEL         BAAI/bge-reranker-v2-m3         Local reranker model
LOG_LEVEL              INFO                            Log verbosity

Services can be managed with:

make up      # start Qdrant + Neo4j + Redis
make down    # stop all services
make dev     # start with dev overrides

Agents

Agent         What it does                           Tools
Orchestrator  Plans phases, dispatches sub-agents
Recon         Port scanning, service fingerprinting  nmap, gobuster, curl
Exploit       CVE matching, initial access           metasploit, sqlmap, hydra
Privesc       Privilege escalation                   linpeas, custom checks
CTF           Flag hunting, stego, web challenges    gobuster, curl
Memorist      Logs KB feedback for self-learning
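
As an illustration of the Recon row, the sketch below turns nmap XML output (the format produced by `-oX -`) into a list of open services using only the standard library. The sample document is a trimmed, hand-written stand-in for real nmap output, and `open_services` is a hypothetical helper, not a function from Seraph.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the shape of nmap's XML report (heavily trimmed).
SAMPLE_XML = """
<nmaprun>
  <host>
    <address addr="10.10.10.3" addrtype="ipv4"/>
    <ports>
      <port protocol="tcp" portid="22">
        <state state="open"/>
        <service name="ssh" product="OpenSSH" version="7.4"/>
      </port>
      <port protocol="tcp" portid="445">
        <state state="open"/>
        <service name="netbios-ssn" product="Samba smbd" version="3.0.20"/>
      </port>
      <port protocol="tcp" portid="8080">
        <state state="closed"/>
      </port>
    </ports>
  </host>
</nmaprun>
"""


def open_services(xml_text: str) -> list[dict[str, str]]:
    """Return one dict per open port found in an nmap XML report."""
    root = ET.fromstring(xml_text.strip())
    services = []
    for port in root.iter("port"):
        state = port.find("state")
        if state is None or state.get("state") != "open":
            continue  # skip closed and filtered ports
        service = port.find("service")
        info = service.attrib if service is not None else {}
        services.append({
            "port": f'{port.get("portid")}/{port.get("protocol")}',
            "name": info.get("name", ""),
            "product": info.get("product", ""),
            "version": info.get("version", ""),
        })
    return services


for svc in open_services(SAMPLE_XML):
    print(svc["port"], svc["product"], svc["version"])
```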

When more than 20 tools are available, agents use RAG-based tool selection instead of passing all tools to the LLM.
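
The threshold rule above can be sketched as follows. The `Tool` class and `select_tools` function are hypothetical, and plain token overlap stands in for the embedding-based retrieval a real RAG selector would use; only the 20-tool threshold comes from this README.

```python
from dataclasses import dataclass

MAX_DIRECT_TOOLS = 20  # threshold stated in the text above


@dataclass(frozen=True)
class Tool:
    name: str
    description: str


def _tokens(text: str) -> set[str]:
    """Lowercased whitespace tokens; a crude stand-in for an embedding."""
    return set(text.lower().split())


def select_tools(task: str, tools: list[Tool], top_k: int = 5) -> list[Tool]:
    """Pass every tool through when the set is small, else keep the top_k best matches."""
    if len(tools) <= MAX_DIRECT_TOOLS:
        return list(tools)
    task_tokens = _tokens(task)
    # Rank by shared tokens; sorted() is stable, so ties keep their input order.
    ranked = sorted(
        tools,
        key=lambda tool: len(task_tokens & _tokens(tool.description)),
        reverse=True,
    )
    return ranked[:top_k]


catalog = [Tool(f"filler-{i}", f"placeholder utility number {i}") for i in range(24)]
catalog.append(Tool("nmap", "scan open ports and fingerprint services"))

chosen = select_tools("scan open ports on the target", catalog, top_k=1)
print([tool.name for tool in chosen])
```

Keeping the prompt to a handful of relevant tools reduces token usage and gives the LLM fewer irrelevant options to choose from.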


Self-learning

Every engagement makes Seraph better:

  1. Memorist logs which retrieved documents the LLM cited vs ignored
  2. Hard negatives mined from keyword-similar but semantically wrong retrievals
  3. Triplets (query, positive, negative) accumulated in SQLite
  4. LoRA adapter trained on nomic-embed-text-v1.5 when enough triplets accumulate
  5. Projection layer applied at query time — no need to re-embed the entire corpus
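
Steps 1 to 3 can be pictured with a small sketch that turns one query's feedback into triplets and stores them in SQLite. The function names and table schema are assumptions for illustration. The sketch also treats every retrieved-but-ignored document as a negative, which is simpler than the hard-negative mining described in step 2.

```python
import sqlite3


def build_triplets(query: str, retrieved: list[str], cited: set[str]) -> list[tuple[str, str, str]]:
    """Pair each cited document with each ignored one for the same query."""
    positives = [doc for doc in retrieved if doc in cited]
    negatives = [doc for doc in retrieved if doc not in cited]
    return [(query, pos, neg) for pos in positives for neg in negatives]


def store_triplets(conn: sqlite3.Connection, triplets: list[tuple[str, str, str]]) -> int:
    """Append triplets to a table and return the total number stored so far."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS triplets (query TEXT, positive TEXT, negative TEXT)"
    )
    conn.executemany("INSERT INTO triplets VALUES (?, ?, ?)", triplets)
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM triplets").fetchone()[0]


# Hypothetical feedback for one query: three documents retrieved, one cited.
triplets = build_triplets(
    query="samba 3.0.20 remote code execution",
    retrieved=["CVE-2007-2447", "smb-signing-notes", "samba-4-hardening"],
    cited={"CVE-2007-2447"},
)

conn = sqlite3.connect(":memory:")  # in-memory database for the example
total = store_triplets(conn, triplets)
print(total, "triplets stored")
```

A training job could then poll the row count and start fine-tuning once it passes a chosen threshold.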

Retrieval quality improves measurably after ~50 engagements on similar machine classes.
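
Step 5 of the list above is worth a toy example, since it explains why the corpus does not need re-embedding. If the learned adaptation is applied only to the query vector, the stored document vectors stay as they are while the ranking still changes. The 2-dimensional vectors and the matrix `W` below are hand-picked for illustration and are not trained weights; how Seraph actually applies its adapter is not specified here.

```python
import math


def project(matrix: list[list[float]], vector: list[float]) -> list[float]:
    """Apply a linear map to a vector (matrix-vector product)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def rank(query_vec: list[float], corpus: dict[str, list[float]]) -> list[str]:
    """Return document IDs ordered by similarity to the query, best first."""
    return sorted(corpus, key=lambda doc_id: cosine(query_vec, corpus[doc_id]), reverse=True)


# Document vectors are embedded once and never touched again.
corpus = {"doc_a": [1.0, 0.0], "doc_b": [0.0, 1.0]}
query = [0.6, 0.8]

# Hand-picked stand-in for a learned query-side projection.
W = [[2.0, 0.0], [0.0, 0.5]]

before = rank(query, corpus)
after = rank(project(W, query), corpus)
print(before, after)
```

Only the query passes through `W`, so updating the adapter costs one small matrix product per query rather than a full re-index.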


Testing

# Unit tests (no services needed)
make test-unit

# All tests + coverage report
make test

# Integration tests (requires services running)
make up && make test-integration

# Sandbox tests (requires Docker + agent image)
make sandbox-build && make sandbox-test

Coverage is enforced at 80%+.


Dashboard

A FastAPI + React 18 dashboard is available:

make api-dev        # API at http://localhost:8000/docs
make dashboard-dev  # UI  at http://localhost:5173

Contributing

Issues and PRs are welcome. Please open an issue before a large PR to align on direction.

  1. Fork and create a branch
  2. Follow the code conventions: type hints everywhere, Pydantic v2, async I/O, structlog
  3. Write tests first; 80% coverage is the minimum
  4. Run make lint && make test-unit before pushing

License

MIT — see LICENSE.


Built by Maciej · Powered by Anthropic Claude

