
Open-source, local-first AI pentesting agent platform with self-learning capabilities


  ███████╗███████╗██████╗  █████╗ ██████╗ ██╗  ██╗
  ██╔════╝██╔════╝██╔══██╗██╔══██╗██╔══██╗██║  ██║
  ███████╗█████╗  ██████╔╝███████║██████╔╝███████║
  ╚════██║██╔══╝  ██╔══██╗██╔══██║██╔═══╝ ██╔══██║
  ███████║███████╗██║  ██║██║  ██║██║     ██║  ██║
  ╚══════╝╚══════╝╚═╝  ╚═╝╚═╝  ╚═╝╚═╝     ╚═╝  ╚═╝

The Claude Code of penetration testing.



Seraph is an AI pentest agent that runs in your terminal. Point it at a target, and it plans, scans, exploits, and escalates, asking for your input between phases and streaming every tool call and finding in real time.

It learns from every engagement. Retrieval over the knowledge base (Qdrant + Neo4j + MITRE ATT&CK) keeps improving through LoRA fine-tuning of the embedding model on your retrieval feedback, so the tenth machine in a class goes faster than the first.

  seraph> 10.10.10.3
  [*] Starting engagement against 10.10.10.3

    [recon / recon]
    ▸ nmap -sV -sC -oX - 10.10.10.3
    ✓ nmap (14.2s)

    [INFO    ]  SSH on port 22/tcp (OpenSSH 7.4)
    [INFO    ]  HTTP on port 80/tcp (Apache 2.4.6)
    [MEDIUM  ]  Samba 3.0.20 on port 445/tcp

  seraph> exploit the SMB service, it looks like CVE-2007-2447

    [exploit / exploit]
    ▸ metasploit exploit/multi/samba/usermap_script RHOST=10.10.10.3
    ✓ metasploit (8.7s)

    [CRITICAL]  Remote code execution — root shell obtained

  [+] Flags: d9e493...  (root)

Install

Requirements: Python 3.12+, Docker, and an Anthropic API key

pip install seraph-suite

Or with uv (recommended):

uv tool install seraph-suite

Then run the one-time setup:

seraph setup

Setup will:

  • Create .env and prompt for your API key
  • Pull and start the Docker services (Qdrant, Neo4j, Redis)
  • Download and ingest the MITRE ATT&CK knowledge base

Usage

Interactive REPL

seraph

Type a target IP or hostname to start. Type anything mid-engagement to steer the agent.

  seraph> 10.10.11.42
  seraph> focus on the web service, port 80
  seraph> findings
  seraph> status
  seraph> clear
  seraph> quit

Quick-start against a target

seraph -t 10.10.10.3

HTB benchmarking

# Single machine
seraph bench --machine Lame --timeout 3600

# All Easy machines with report
seraph bench --difficulty Easy --all --report --output reports/easy.md

Knowledge base ingestion

# NVD CVE feed
seraph ingest nvd --year 2024

# MITRE ATT&CK (auto-downloads the STIX bundle)
seraph ingest mitre --download

# ExploitDB (clone the mirror first)
git clone https://gitlab.com/exploit-database/exploitdb ./data/exploitdb
seraph ingest exploitdb

# Your own CTF writeups (Markdown)
seraph ingest writeups ./data/writeups/

# Check ingestion stats
seraph ingest stats

Sandbox isolation

Run all tool invocations inside isolated Docker containers (Manus-style):

SANDBOX_ENABLED=true seraph -t 10.10.10.3

# Pre-build the agent image
make sandbox-build
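
To illustrate what per-invocation isolation can look like, here is a minimal sketch that wraps a tool command in a throwaway `docker run` call. It is not Seraph's sandbox implementation: the image name `seraph-agent:latest`, the function name, and the chosen limits are assumptions made for the example. Only standard `docker run` flags are used (`--rm`, `--network`, `--memory`).

```python
import shlex


def sandboxed_argv(
    tool_cmd: list[str],
    image: str = "seraph-agent:latest",  # hypothetical image name
    network: str = "bridge",
    memory: str = "1g",
) -> list[str]:
    """Build a `docker run` argv that executes one tool call in a fresh container."""
    return [
        "docker", "run",
        "--rm",                 # delete the container once the tool exits
        "--network", network,   # choose which Docker network the tool can reach
        "--memory", memory,     # cap the memory available to the tool
        image,
        *tool_cmd,
    ]


argv = sandboxed_argv(["nmap", "-sV", "-sC", "-oX", "-", "10.10.10.3"])
# Printed rather than executed, so the sketch runs without Docker installed.
print(shlex.join(argv))
```

Passing the resulting list to `subprocess.run` would start the container. Building the command as a list instead of a shell string keeps target strings from being interpreted by a shell.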

How it works

 You
  │   type target / instruction
  ▼
 Orchestrator  ──── Claude Opus (planning)
  │
  ├── Recon Agent    → nmap, gobuster, curl
  ├── Exploit Agent  → metasploit, sqlmap, hydra
  ├── Privesc Agent  → linpeas, sudo checks, SUID
  ├── CTF Agent      → flag hunting, stego, web challenges
  └── Memorist       → logs which KB docs helped
         │
         ▼
  Knowledge Base
  ├── Qdrant   (BM25 + dense hybrid search, RRF fusion)
  ├── Neo4j    (MITRE ATT&CK graph, CVE → technique links)
  └── SQLite   (sessions, feedback, ingestion state)
         │
         ▼
  Self-learning loop
  └── feedback → hard negatives → triplets → LoRA fine-tune
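
The dispatch flow in the diagram can be sketched as a small async loop. This is an illustration only: the agent functions are placeholders that return canned strings, the plan is a fixed list standing in for the Claude planning step, and `run_engagement` and `proceed` are hypothetical names, not Seraph's API.

```python
import asyncio
from collections.abc import Awaitable, Callable

# An agent takes a target and returns a list of findings (plain strings here).
AgentFn = Callable[[str], Awaitable[list[str]]]


async def recon_agent(target: str) -> list[str]:
    # Placeholder for running nmap/gobuster/curl and parsing the output.
    return [f"recon: placeholder finding for {target}"]


async def exploit_agent(target: str) -> list[str]:
    # Placeholder for matching CVEs and attempting initial access.
    return [f"exploit: placeholder finding for {target}"]


async def run_engagement(
    target: str,
    agents: dict[str, AgentFn],
    plan: list[str],
    proceed: Callable[[str], bool],
) -> list[str]:
    """Run each planned phase in order, asking `proceed` before every phase."""
    findings: list[str] = []
    for phase in plan:
        if not proceed(phase):  # the operator can stop between phases
            break
        findings.extend(await agents[phase](target))
    return findings


agents = {"recon": recon_agent, "exploit": exploit_agent}
result = asyncio.run(
    run_engagement("10.10.10.3", agents, ["recon", "exploit"], lambda phase: True)
)
print(result)
```

Keeping the plan as data and the agents as interchangeable callables lets the operator check sit in one place, between phases.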

Retrieval pipeline — every KB query runs:

  1. BM25 sparse search (exact CVE IDs, tool names)
  2. Dense semantic search (nomic-embed-text-v1.5, local)
  3. RRF fusion
  4. Neo4j graph traversal (expands CVE → linked techniques)
  5. Cross-encoder reranking (bge-reranker-v2-m3, local)

All embeddings are computed locally — no API calls for embeddings.
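
Step 3 of the pipeline above, RRF fusion, can be sketched in a few lines. This shows the general Reciprocal Rank Fusion technique, not Seraph's code. The constant `k = 60` is the commonly used default and is assumed here, and the document IDs are made up for the example.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked lists of document IDs with Reciprocal Rank Fusion.

    A document scores sum(1 / (k + rank)) over every list it appears in,
    with ranks starting at 1.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID so the order is stable.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from the sparse (BM25) and dense retrievers.
bm25_hits = ["CVE-2007-2447", "samba-usermap-writeup", "T1210"]
dense_hits = ["samba-usermap-writeup", "smb-enum-notes", "CVE-2007-2447"]

print(rrf_fuse([bm25_hits, dense_hits]))
# ['samba-usermap-writeup', 'CVE-2007-2447', 'smb-enum-notes', 'T1210']
```

Because RRF uses ranks only, BM25 scores and cosine similarities never need to be put on a common scale before fusing.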


Configuration

All settings come from .env. Copy .env.example to get started.

  Variable               Default                          Description
  ANTHROPIC_API_KEY      (none)                           Required. Your Anthropic key
  QDRANT_URL             http://localhost:6333            Qdrant vector store
  NEO4J_URI              bolt://localhost:7687            Neo4j graph store
  NEO4J_PASSWORD         seraph_secret                    Neo4j password
  REDIS_URL              redis://localhost:6379/0         Celery broker
  SANDBOX_ENABLED        false                            Docker tool isolation
  DENSE_EMBEDDING_MODEL  nomic-ai/nomic-embed-text-v1.5   Local embedding model
  RERANKER_MODEL         BAAI/bge-reranker-v2-m3          Local reranker model
  LOG_LEVEL              INFO                             Log verbosity

Services can be managed with:

make up      # start Qdrant + Neo4j + Redis
make down    # stop all services
make dev     # start with dev overrides

Agents

  Agent         What it does                            Tools
  Orchestrator  Plans phases, dispatches sub-agents     -
  Recon         Port scanning, service fingerprinting   nmap, gobuster, curl
  Exploit       CVE matching, initial access            metasploit, sqlmap, hydra
  Privesc       Privilege escalation                    linpeas, custom checks
  CTF           Flag hunting, stego, web challenges     gobuster, curl
  Memorist      Logs KB feedback for self-learning      -

When more than 20 tools are available, agents use RAG-based tool selection instead of passing all tools to the LLM.
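
The threshold rule above can be sketched as follows. This is an assumption-laden illustration: word overlap between the task and each tool description stands in for the embedding similarity a real RAG selector would use, and `select_tools`, `max_tools`, and `top_k` are hypothetical names.

```python
def select_tools(
    task: str,
    tools: dict[str, str],  # tool name -> short description
    max_tools: int = 20,
    top_k: int = 5,
) -> list[str]:
    """Return every tool when the set is small, else the top_k most relevant."""
    if len(tools) <= max_tools:
        return sorted(tools)

    task_tokens = set(task.lower().split())

    def relevance(name: str) -> int:
        # Stand-in for vector similarity: count of shared lowercase words.
        return len(task_tokens & set(tools[name].lower().split()))

    ranked = sorted(tools, key=lambda name: (-relevance(name), name))
    return ranked[:top_k]


tools = {
    "nmap": "port scan and service fingerprinting",
    "sqlmap": "sql injection testing of web forms",
}
# Pad the registry past the threshold with unrelated placeholder tools.
tools.update({f"filler_{i}": "unrelated helper" for i in range(21)})

print(select_tools("scan open port and fingerprint service", tools, top_k=2))
```

Capping the tool list keeps the prompt short and cuts the chance that the model picks an irrelevant tool.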


Self-learning

Every engagement makes Seraph better:

  1. Memorist logs which retrieved documents the LLM cited vs ignored
  2. Hard negatives mined from keyword-similar but semantically wrong retrievals
  3. Triplets (query, positive, negative) accumulated in SQLite
  4. LoRA adapter trained on nomic-embed-text-v1.5 when enough triplets accumulate
  5. Projection layer applied at query time — no need to re-embed the entire corpus
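
Steps 1 to 3 can be sketched as a small mining function. The record layout, the `min_bm25` threshold, and the function name are assumptions made for illustration. The source only states that cited documents act as positives and that keyword-similar but unhelpful retrievals become hard negatives.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Retrieval:
    """One retrieved document for a logged query (hypothetical schema)."""
    doc_id: str
    bm25_score: float  # keyword-match score from the sparse retriever
    cited: bool        # True if the LLM cited this document


def mine_triplets(
    query: str, retrievals: list[Retrieval], min_bm25: float = 5.0
) -> list[tuple[str, str, str]]:
    """Pair every cited document with every ignored, keyword-similar one."""
    positives = [r for r in retrievals if r.cited]
    # Hard negatives: strong keyword match, yet the LLM did not use them.
    hard_negatives = [
        r for r in retrievals if not r.cited and r.bm25_score >= min_bm25
    ]
    return [(query, p.doc_id, n.doc_id) for p in positives for n in hard_negatives]


log = [
    Retrieval("samba-usermap-writeup", bm25_score=9.1, cited=True),
    Retrieval("samba-4x-hardening-guide", bm25_score=8.4, cited=False),
    Retrieval("apache-mod-status-notes", bm25_score=1.2, cited=False),
]
print(mine_triplets("Samba 3.0.20 remote code execution", log))
```

The low-scoring ignored document is skipped on purpose: easy negatives give the fine-tuning step little to learn from.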

Retrieval quality improves measurably after ~50 engagements on similar machine classes.
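
One way to picture why a query-time projection avoids re-embedding the corpus: if the learned adaptation is a low-rank update applied to the query vector only, the stored document vectors never change. The sketch below assumes that form, q' = q + B(Aq), with tiny hand-written matrices. Whether Seraph's adapter reduces to exactly this is not stated in this README.

```python
Vector = list[float]
Matrix = list[list[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply matrix m (rows x cols) by vector v (cols)."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def project_query(q: Vector, down: Matrix, up: Matrix) -> Vector:
    """Apply a low-rank residual update: q' = q + up @ (down @ q).

    down has shape (r, d) and up has shape (d, r), with rank r much
    smaller than the embedding dimension d.
    """
    delta = matvec(up, matvec(down, q))
    return [x + dx for x, dx in zip(q, delta)]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


# Toy example with d = 3 and r = 1; real embeddings have hundreds of dimensions.
down = [[0.0, 1.0, 0.0]]
up = [[0.5], [0.0], [0.0]]

query = [1.0, 2.0, 0.0]
stored_doc = [1.0, 0.0, 0.0]  # embedded once at ingestion, never recomputed

adapted = project_query(query, down, up)
print(dot(query, stored_doc), dot(adapted, stored_doc))
```

Only the query side is transformed, so a newly trained adapter takes effect on the next search without touching the vector store.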


Testing

# Unit tests (no services needed)
make test-unit

# All tests + coverage report
make test

# Integration tests (requires services running)
make up && make test-integration

# Sandbox tests (requires Docker + agent image)
make sandbox-build && make sandbox-test

Coverage is enforced at 80%+.


Dashboard

A FastAPI + React 18 dashboard is available:

make api-dev        # API at http://localhost:8000/docs
make dashboard-dev  # UI  at http://localhost:5173

Contributing

Issues and PRs are welcome. Please open an issue before a large PR to align on direction.

  1. Fork the repository and create a branch
  2. Follow the code conventions: type hints everywhere, Pydantic v2, async I/O, structlog
  3. Write tests first; 80% coverage is the minimum
  4. Run make lint && make test-unit before pushing

License

MIT — see LICENSE.


Built by Maciej · Powered by Anthropic Claude
