LLM-driven attack surface discovery — find every asset from just a company name

SurfaceMap

LLM-driven attack surface discovery. Find every external asset from just a company name.

SurfaceMap combines passive OSINT techniques, DNS enumeration, HTTP probing, port scanning, cloud bucket enumeration, and LLM intelligence to build a complete map of an organization's attack surface.


Quick Start

# Install
pip install "surfacemap[all]"

# Set your LLM API key
export GEMINI_API_KEY="your-key-here"

# Discover everything about a company
surfacemap discover "Acme Corp" --domain acme.com --tree --json

# Or just scan a domain
surfacemap discover example.com --mindmap

Installation

# Core (CLI + discovery)
pip install surfacemap

# With API server (quotes keep shells such as zsh from expanding the brackets)
pip install "surfacemap[api]"

# With LLM intelligence
pip install "surfacemap[llm]"

# With Slack notifications
pip install "surfacemap[notifications]"

# Everything
pip install "surfacemap[all]"

Install from Source

git clone https://github.com/BreachLine/surfacemap.git
cd surfacemap
pip install -e ".[all]"

External Tools (Optional)

SurfaceMap works without these, but they enhance discovery:

| Tool | Purpose | Install |
| --- | --- | --- |
| dig | DNS record enumeration | Included with most OS |
| nmap | Port scanning | brew install nmap / apt install nmap |
| subfinder | Passive subdomain enum | go install github.com/projectdiscovery/subfinder/v2/cmd/subfinder@latest |
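
Since these tools are optional, it can help to check which of them are on your PATH before starting a scan. The following is a minimal sketch using only the Python standard library. The function name `check_external_tools` is made up for illustration and is not part of SurfaceMap.

```python
import shutil

# Tool names taken from the table above.
OPTIONAL_TOOLS = ("dig", "nmap", "subfinder")


def check_external_tools(tools=OPTIONAL_TOOLS):
    """Return a mapping of tool name -> True if an executable is found on PATH."""
    return {name: shutil.which(name) is not None for name in tools}


if __name__ == "__main__":
    for name, found in check_external_tools().items():
        print(f"{name}: {'found' if found else 'missing'}")
```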

CLI Usage

# Full discovery with tree output
surfacemap discover "Google" --domain google.com --tree

# Export to JSON and CSV
surfacemap discover example.com --json --csv --output ./results

# Generate interactive HTML mindmap
surfacemap discover "Acme Corp" -d acme.com --mindmap

# Passive recon only (no active probing)
surfacemap discover example.com --passive-only --tree

# Enable enrichment (requires VirusTotal/Shodan/GitHub keys)
surfacemap discover example.com --enrich --json

# Skip LLM analysis phase
surfacemap discover example.com --no-analysis --tree

Discover Options

| Flag | Short | Description |
| --- | --- | --- |
| --domain | -d | Primary domain (if target is a company name) |
| --output | -o | Output directory for results |
| --tree | -t | Display results as a rich tree in terminal |
| --mindmap | -m | Generate interactive D3.js HTML mindmap |
| --json | -j | Export results to JSON |
| --csv | | Export results to CSV |
| --enrich | -e | Enable enrichment modules (VirusTotal, Shodan, GitHub) |
| --passive-only | | Skip active probing (passive recon only) |
| --no-analysis | | Skip LLM analysis phase (risk scoring, attack paths) |
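
When scans are launched from a script rather than typed by hand, the flags above can be assembled programmatically. This sketch builds an argument list for `surfacemap discover` using only the long flags from the table. The helper `build_discover_argv` and its keyword names are assumptions for illustration, not part of the SurfaceMap package.

```python
import shlex


def build_discover_argv(target, domain=None, output=None, tree=False,
                        mindmap=False, json_out=False, csv_out=False,
                        enrich=False, passive_only=False, no_analysis=False):
    """Assemble an argv list for `surfacemap discover` from keyword options."""
    argv = ["surfacemap", "discover", target]
    if domain:
        argv += ["--domain", domain]
    if output:
        argv += ["--output", output]
    # Boolean switches, in the order they appear in the options table.
    switches = [
        ("--tree", tree),
        ("--mindmap", mindmap),
        ("--json", json_out),
        ("--csv", csv_out),
        ("--enrich", enrich),
        ("--passive-only", passive_only),
        ("--no-analysis", no_analysis),
    ]
    argv += [flag for flag, enabled in switches if enabled]
    return argv


if __name__ == "__main__":
    argv = build_discover_argv("Acme Corp", domain="acme.com", tree=True, json_out=True)
    # shlex.join quotes the company name so the line can be pasted into a shell.
    print(shlex.join(argv))
```

The returned list can be passed directly to `subprocess.run(argv, check=True)`, which avoids shell quoting problems with company names that contain spaces.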

Other Commands

# Show version and check for updates
surfacemap version

# Update to latest version
surfacemap update

# View all configuration settings
surfacemap config

# Change a configuration setting
surfacemap set-config SURFACEMAP_LLM_MODEL gemini-2.0-flash

# Set an API key
surfacemap set-key GEMINI_API_KEY your-key-here

# Show configured API keys
surfacemap show-keys

API Server

# Install with API support
pip install "surfacemap[api]"

# Start the API server
uvicorn surfacemap.api.server:app --host 0.0.0.0 --port 8000

# Start a scan
curl -X POST "http://localhost:8000/discover?target=example.com"

# Get scan results (replace {scan_id} with the ID of your scan)
curl "http://localhost:8000/scans/{scan_id}"

# Health check
curl "http://localhost:8000/health"

API Endpoints

| Method | Endpoint | Description |
| --- | --- | --- |
| POST | /discover | Start a new discovery scan |
| GET | /scans/{id} | Get scan results by ID |
| GET | /scans | List recent scans |
| GET | /health | Health check |
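
The same endpoints can be called from Python. Below is a minimal standard-library sketch that mirrors the curl examples. It assumes the server is running locally on port 8000 and that responses are JSON. The response fields (for example, which key holds the scan ID) are not documented here, so the sketch does not rely on them. The helper names are illustrative only.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8000"  # assumes the uvicorn command shown above


def discover_request(target, base_url=BASE_URL):
    """Build the POST /discover request; the target goes in the query string."""
    query = urllib.parse.urlencode({"target": target})
    return urllib.request.Request(f"{base_url}/discover?{query}", method="POST")


def scan_request(scan_id, base_url=BASE_URL):
    """Build the GET /scans/{id} request for a single scan."""
    safe_id = urllib.parse.quote(str(scan_id), safe="")
    return urllib.request.Request(f"{base_url}/scans/{safe_id}", method="GET")


def send(request, timeout=30):
    """Send a request and decode the body as JSON (assumed response format)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `send(discover_request("example.com"))` starts a scan and returns the decoded response.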

Discovery Pipeline

SurfaceMap runs discovery in 4 phases:

| Phase | Description | Modules |
| --- | --- | --- |
| Phase 0: LLM Brainstorm | AI identifies subsidiaries, domains, infrastructure, tech stack | LLM Brain |
| Phase 1: Passive Recon | DNS, subdomains, WHOIS, cert transparency, OSINT sources | DNS Records, Subdomain Enum (subfinder + crt.sh + brute force + LLM), WHOIS/RDAP, ASN Discovery, Reverse DNS, Zone Transfer, Email Security (SPF/DKIM/DMARC), Cert Transparency, Wayback Machine, URLScan, HackerTarget, RapidDNS, CommonCrawl, AnubisDB, CertSpotter, SubdomainCenter, AlienVault OTX |
| Phase 2: Active Probing | HTTP probing, port scanning, vulnerability checks | HTTP Probe + Tech/CDN/WAF Detection, Port Scan (nmap), SSL/TLS Analysis, Sensitive Path Fuzzing (60+ paths), JS Analysis, CORS Check, Cookie Security, Cloud Bucket Enum (S3/Azure/GCS), Subdomain Takeover (30 providers), Shodan InternetDB, Reverse IP, IP Geolocation |
| Phase 3: LLM Analysis | Risk scoring, attack paths, executive summary | False Positive Filtering, Risk Scoring (A-F grade), Attack Path Analysis, Executive Summary, Google Dork Generation |
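
To illustrate how the `--passive-only` and `--no-analysis` flags relate to these phases, here is a small sketch that works out which phases would run. It assumes `--passive-only` skips only Phase 2 and `--no-analysis` skips only Phase 3, as their descriptions suggest. The `Phase` enum and `phases_to_run` function are hypothetical and may not match the real engine's internals.

```python
from enum import IntEnum


class Phase(IntEnum):
    """The four discovery phases, numbered as in the table above."""
    LLM_BRAINSTORM = 0
    PASSIVE_RECON = 1
    ACTIVE_PROBING = 2
    LLM_ANALYSIS = 3


def phases_to_run(passive_only=False, no_analysis=False):
    """Return the phases that remain after applying the two skip flags."""
    phases = list(Phase)
    if passive_only:
        phases.remove(Phase.ACTIVE_PROBING)  # assumed effect of --passive-only
    if no_analysis:
        phases.remove(Phase.LLM_ANALYSIS)    # assumed effect of --no-analysis
    return phases


if __name__ == "__main__":
    for phase in phases_to_run(passive_only=True):
        print(f"Phase {phase.value}: {phase.name}")
```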

Asset Types

| Type | Description |
| --- | --- |
| domain | Root domains |
| subdomain | Discovered subdomains |
| ip | IP addresses |
| ip_range | IP CIDR ranges |
| port | Open ports |
| service | Running services with version info |
| asn | Autonomous system numbers |
| cloud_bucket | S3, Azure Blob, GCS buckets |
| email_server | MX record mail servers |
| email | Email addresses |
| nameserver | NS record nameservers |
| cdn | Content delivery networks |
| waf | Web application firewalls |
| certificate | TLS/SSL certificates |
| dns_issue | DNS misconfigurations (SPF/DKIM/DMARC) |
| github_repo | GitHub repositories |
| social_media | Social media profiles |
| url | Discovered URLs |
| technology | Detected technologies |
| subsidiary | Subsidiaries and acquisitions |
| whois_record | WHOIS registration data |
| sensitive_file | Exposed sensitive files |
| api_endpoint | Discovered API endpoints |
| secret_leak | Leaked secrets in JS/HTML |
| cors_misconfiguration | CORS misconfigurations |
| cookie_issue | Cookie security issues |
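
A common first step after a scan is to summarize the results by asset type. The sketch below counts assets per type and rejects any type not listed in the table. The JSON export schema is not documented in this README, so the `type` key on each asset record is an assumption. Adjust it to match the actual export.

```python
from collections import Counter

# Asset type identifiers copied from the table above.
ASSET_TYPES = frozenset({
    "domain", "subdomain", "ip", "ip_range", "port", "service", "asn",
    "cloud_bucket", "email_server", "email", "nameserver", "cdn", "waf",
    "certificate", "dns_issue", "github_repo", "social_media", "url",
    "technology", "subsidiary", "whois_record", "sensitive_file",
    "api_endpoint", "secret_leak", "cors_misconfiguration", "cookie_issue",
})


def count_by_type(assets):
    """Count assets per type; raise ValueError on a type not in the table."""
    counts = Counter()
    for asset in assets:
        asset_type = asset["type"]  # assumed field name in the export
        if asset_type not in ASSET_TYPES:
            raise ValueError(f"unknown asset type: {asset_type!r}")
        counts[asset_type] += 1
    return counts


if __name__ == "__main__":
    sample = [
        {"type": "subdomain", "value": "www.example.com"},
        {"type": "subdomain", "value": "mail.example.com"},
        {"type": "ip", "value": "192.0.2.10"},
    ]
    for asset_type, total in sorted(count_by_type(sample).items()):
        print(f"{asset_type}: {total}")
```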

Configuration

All settings are configured via environment variables or a .env file. Use surfacemap config to view all current settings.

API Keys

| Variable | Description |
| --- | --- |
| GEMINI_API_KEY | Google Gemini API key (recommended) |
| ANTHROPIC_API_KEY | Anthropic Claude API key |
| OPENAI_API_KEY | OpenAI API key |
| VIRUSTOTAL_API_KEY | VirusTotal enrichment (optional) |
| SHODAN_API_KEY | Shodan enrichment (optional) |
| GITHUB_TOKEN | GitHub dorking (optional) |
| HUNTER_API_KEY | Hunter.io email harvesting (optional) |

LLM Settings

| Variable | Default | Description |
| --- | --- | --- |
| SURFACEMAP_LLM_PROVIDER | gemini | LLM provider (gemini, anthropic, or openai) |
| SURFACEMAP_LLM_MODEL | gemini-2.5-flash | LLM model name |
| SURFACEMAP_LLM_MAX_TOKENS | 16384 | Max tokens per LLM response |
| SURFACEMAP_LLM_TEMPERATURE | 0.3 | LLM temperature |
| SURFACEMAP_LLM_TIMEOUT | 120 | LLM request timeout (seconds) |
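
To make the environment-variable pattern concrete, here is a sketch of reading the LLM settings with the defaults listed above. It illustrates the general approach and is not SurfaceMap's actual `config.py`. The `LLMSettings` class and `load_llm_settings` function are hypothetical names.

```python
import os
from dataclasses import dataclass

SUPPORTED_PROVIDERS = {"gemini", "anthropic", "openai"}


@dataclass(frozen=True)
class LLMSettings:
    provider: str
    model: str
    max_tokens: int
    temperature: float
    timeout: int


def load_llm_settings(env=None):
    """Read LLM settings from a mapping (os.environ by default) with defaults."""
    env = os.environ if env is None else env
    provider = env.get("SURFACEMAP_LLM_PROVIDER", "gemini")
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"unsupported LLM provider: {provider!r}")
    return LLMSettings(
        provider=provider,
        model=env.get("SURFACEMAP_LLM_MODEL", "gemini-2.5-flash"),
        max_tokens=int(env.get("SURFACEMAP_LLM_MAX_TOKENS", "16384")),
        temperature=float(env.get("SURFACEMAP_LLM_TEMPERATURE", "0.3")),
        timeout=int(env.get("SURFACEMAP_LLM_TIMEOUT", "120")),
    )


if __name__ == "__main__":
    print(load_llm_settings())
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests without touching the real environment.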

Timeouts

| Variable | Default | Description |
| --- | --- | --- |
| SURFACEMAP_HTTP_TIMEOUT | 15 | HTTP probe timeout (seconds) |
| SURFACEMAP_DNS_TIMEOUT | 10 | DNS lookup timeout (seconds) |
| SURFACEMAP_OSINT_TIMEOUT | 60 | OSINT API timeout (seconds) |
| SURFACEMAP_SCAN_TIMEOUT | 300 | nmap scan timeout (seconds) |
| SURFACEMAP_SSL_TIMEOUT | 10 | SSL/TLS analysis timeout (seconds) |

Concurrency & Limits

| Variable | Default | Description |
| --- | --- | --- |
| SURFACEMAP_MAX_PROBES | 50 | Concurrent HTTP probes |
| SURFACEMAP_MAX_DNS | 200 | Concurrent DNS lookups |
| SURFACEMAP_MAX_SUBDOMAINS | 500 | Maximum subdomains to enumerate |
| SURFACEMAP_MAX_EXTRA_DOMAINS | 20 | Maximum subsidiary domains to scan |
| SURFACEMAP_NMAP_ARGS | -sV -T4 --top-ports 100 | nmap arguments |
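
A concurrency limit such as SURFACEMAP_MAX_PROBES is typically enforced with a semaphore. The sketch below shows that general technique with `asyncio`. It is not taken from SurfaceMap's source, and `run_bounded` and `fake_probe` are illustrative names. The probe is a stand-in that sleeps instead of making network requests.

```python
import asyncio
import os


async def run_bounded(items, worker, limit):
    """Run worker(item) for every item with at most `limit` running at once."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(item):
        async with semaphore:
            return await worker(item)

    # gather preserves input order in its result list.
    return await asyncio.gather(*(guarded(item) for item in items))


async def fake_probe(host):
    """Stand-in for an HTTP probe; sleeps instead of touching the network."""
    await asyncio.sleep(0.01)
    return f"probed {host}"


if __name__ == "__main__":
    limit = int(os.environ.get("SURFACEMAP_MAX_PROBES", "50"))
    hosts = [f"host{i}.example.com" for i in range(200)]
    results = asyncio.run(run_bounded(hosts, fake_probe, limit))
    print(len(results))
```

Lowering the limit reduces load on the target and on the scanning host at the cost of a longer run.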

Output & Notifications

| Variable | Default | Description |
| --- | --- | --- |
| SURFACEMAP_OUTPUT_DIR | ./output | Default output directory |
| SURFACEMAP_DB_PATH | ./surfacemap.db | SQLite database path |
| SURFACEMAP_SLACK_WEBHOOK | | Slack webhook URL |
| SURFACEMAP_SLACK_TOKEN | | Slack Bot Token |
| SURFACEMAP_SLACK_CHANNEL | #security | Slack channel |

Output Formats

  • Terminal Tree — Rich tree display with color-coded statuses
  • JSON — Full scan data with metadata
  • CSV — Flat export for spreadsheet analysis
  • HTML Mindmap — Interactive D3.js force-directed graph with dark theme, zoom, drag, and tooltips
  • Mermaid — Mermaid.js mindmap diagram for embedding in docs
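
For readers unfamiliar with the Mermaid mindmap format, the sketch below shows roughly what such an export involves: a `mindmap` header, a root node, and children nested by indentation. It is an independent illustration, not SurfaceMap's `mindmap.py`, and the input shape (a list of `(asset_type, value)` pairs) is an assumption.

```python
from collections import defaultdict


def to_mermaid_mindmap(root, assets):
    """Render (asset_type, value) pairs as a Mermaid mindmap, grouped by type."""
    grouped = defaultdict(list)
    for asset_type, value in assets:
        grouped[asset_type].append(value)

    # Mermaid mindmaps express hierarchy through indentation.
    lines = ["mindmap", f"  root(({root}))"]
    for asset_type in sorted(grouped):
        lines.append(f"    {asset_type}")
        for value in sorted(grouped[asset_type]):
            lines.append(f"      {value}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_mermaid_mindmap("acme.com", [
        ("subdomain", "www.acme.com"),
        ("subdomain", "api.acme.com"),
        ("ip", "192.0.2.10"),
    ]))
```

Values containing brackets or parentheses may be read by Mermaid as node shapes, so a real exporter would likely need to escape or strip them.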

Architecture

surfacemap/
  core/
    config.py        — Environment-based configuration
    models.py        — Asset, ScanResult, enums
    llm.py           — Multi-provider LLM integration (Gemini/Claude/OpenAI)
  discovery/
    base.py          — DiscoveryModule ABC
    engine.py        — 4-phase orchestration engine
    dns.py           — DNS, subdomain, takeover, cloud modules
    http.py          — HTTP probe, port scan modules
    web.py           — Wayback, URLScan, RapidDNS, CommonCrawl, AnubisDB, CertSpotter, Shodan InternetDB
    osint.py         — WHOIS, ASN, reverse DNS, SSL analysis, email security
    active.py        — Sensitive paths, JS analysis, CORS, cookie security
    enrichment.py    — VirusTotal, Shodan, GitHub dorks, email harvesting
  analysis/
    risk.py          — Risk scoring engine
    narrative.py     — Attack path and executive summary generation
  cli/
    main.py          — Typer CLI application
  output/
    mindmap.py       — D3.js HTML and Mermaid export
  api/
    server.py        — FastAPI REST API
  notifications/
    slack.py         — Slack Block Kit notifications
  storage/
    db.py            — SQLite persistence with aiosqlite

License

MIT License. Copyright (c) 2026 Yash Korat.


Built by BreachLine Labs
