Advanced phishing actor attribution using Bayesian inference and graph analysis

HUNTERTRACE

Advanced phishing actor attribution using multi-signal Bayesian inference and infrastructure graph analysis

Current release: 1.2.3

Overview

HUNTERTRACE is an open-source phishing attribution engine that estimates the geographic origin of phishing actors through multi-signal Bayesian inference, combining 8+ orthogonal signals to bypass VPN and proxy obfuscation. Evaluated on 53 labeled emails, it achieves 52.8% country-level and 56.6% region-level accuracy, compared with roughly 31% for IP geolocation alone and roughly 52% for timezone alone. Larger-scale validation is ongoing.

Traditional email forensics relies on IP geolocation alone (~31% accuracy). HUNTERTRACE fuses 8+ orthogonal signals through Bayesian inference:

| Signal | Source | VPN-Resistant |
| --- | --- | --- |
| Webmail IP leaks | X-Originating-IP, X-Sender-IP headers | Yes |
| Timezone offset | Date header / Received chain | Yes |
| Language fingerprint | Content-Type charset, Subject encoding | Yes |
| Infrastructure reuse | Graph centrality across campaigns | Yes |
| Hop chain forgery | Received header consistency | Partial |
| VPN exit node mapping | ASN + hosting provider classification | N/A |
| SPF/DKIM/DMARC/ARC | Authentication results (incl. ARC chain validation) | Partial |
| Webmail provider | Header fingerprinting (Gmail/Yahoo/Outlook) | Yes |
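
To make the idea of Bayesian fusion concrete, the following is a minimal sketch of one common way to combine independent signals: a naive-Bayes update over candidate regions. It is not HUNTERTRACE's actual implementation. The function name `fuse_signals`, the region names, the likelihood values, and the conditional-independence assumption are all illustrative.

```python
def fuse_signals(prior, likelihoods, floor=1e-3):
    """Combine a prior over regions with per-signal likelihoods (naive Bayes).

    prior:       {region: P(region)}
    likelihoods: list of {region: P(observed signal | region)}
    Assumes the signals are conditionally independent given the region.
    """
    posterior = dict(prior)
    for signal in likelihoods:
        for region in posterior:
            # A region missing from a signal's table gets a small floor so one
            # silent signal cannot eliminate a hypothesis outright.
            posterior[region] *= signal.get(region, floor)
    total = sum(posterior.values())
    if total == 0:
        raise ValueError("all hypotheses have zero probability")
    return {region: p / total for region, p in posterior.items()}


# Hypothetical numbers, for illustration only.
prior = {"Region A": 0.5, "Region B": 0.5}
timezone_signal = {"Region A": 0.8, "Region B": 0.2}
charset_signal = {"Region A": 0.6, "Region B": 0.4}

posterior = fuse_signals(prior, [timezone_signal, charset_signal])
best = max(posterior, key=posterior.get)
print(best, round(posterior[best], 3))  # Region A 0.857
```

Two weak signals that agree push the posterior further than either does alone, which is the motivation for fusing signals rather than trusting any single one.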

Architecture

┌─────────────────────────────────────────────────────────────┐
│                    HUNTERTRACE PIPELINE                     │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  Stage 1: Header Extraction (RFC 2822 parsing)              │
│      ↓                                                      │
│  Webmail IP Leak Detection (X-Originating-IP extraction)    │
│      ↓                                                      │
│  Stage 2: IP Classification (VPN/Tor/Proxy/Residential)     │
│      ↓                                                      │
│  Stage 3A: Enrichment (WHOIS, ASN, hosting provider)        │
│      ↓                                                      │
│  VPN Backtrack Analysis (12 bypass techniques)              │
│      ↓                                                      │
│  Real IP Extraction (strips proxy layers)                   │
│      ↓                                                      │
│  Stage 3B: Threat Intelligence                              │
│  Stage 3C: Correlation Analysis                             │
│      ↓                                                      │
│  Stage 4: Geolocation (city-level, IPv4 + IPv6)             │
│      ↓                                                      │
│  Stage 5: Attribution Analysis (evidence packaging)         │
│      ↓                                                      │
│  Bayesian Multi-Signal Fusion (ACI confidence scoring)      │
│      ↓                                                      │
│  Sender Classification (hop forgery + timezone analysis)    │
│      ↓                                                      │
│  Output: JSON report + text summary + attack graph HTML     │
│                                                             │
└─────────────────────────────────────────────────────────────┘
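
The first steps of the pipeline (header extraction, webmail IP leak detection, and the timezone and hop-chain checks) can be approximated with the Python standard library alone. The sketch below is an illustration under assumptions, not HUNTERTRACE's code: the helper names and the sample message are hypothetical, only IPv4 leaks are matched, and hop timestamps are assumed to carry a timezone.

```python
import re
from email import message_from_string
from email.utils import parsedate_to_datetime

# Hypothetical message using documentation-reserved addresses.
RAW = """\
Received: from mail.example.org (mail.example.org [203.0.113.7])
    by mx.example.net; Tue, 02 Jan 2024 10:15:30 +0000
Received: from [198.51.100.23] by web.example.org;
    Tue, 02 Jan 2024 15:45:12 +0530
X-Originating-IP: [198.51.100.23]
Date: Tue, 02 Jan 2024 15:45:10 +0530
Subject: Account notice

body
"""


def leaked_ip(msg):
    """Return the IPv4 address in X-Originating-IP / X-Sender-IP, or None."""
    for name in ("X-Originating-IP", "X-Sender-IP"):
        match = re.search(r"\d{1,3}(?:\.\d{1,3}){3}", msg.get(name, ""))
        if match:
            return match.group(0)
    return None


def date_utc_offset_hours(msg):
    """UTC offset, in hours, declared in the Date header (None if unknown)."""
    value = msg.get("Date")
    if not value:
        return None
    offset = parsedate_to_datetime(value).utcoffset()
    return None if offset is None else offset.total_seconds() / 3600


def hop_times(msg):
    """Timestamps of the Received chain, oldest hop first."""
    times = []
    # Each relay prepends its header, so the last Received is the oldest hop.
    for header in reversed(msg.get_all("Received", [])):
        stamp = header.rsplit(";", 1)[-1].strip()
        times.append(parsedate_to_datetime(stamp))
    return times


def chain_is_monotonic(times):
    """True when no hop claims to be earlier than the hop before it."""
    return all(a <= b for a, b in zip(times, times[1:]))


msg = message_from_string(RAW)
print(leaked_ip(msg), date_utc_offset_hours(msg), chain_is_monotonic(hop_times(msg)))
# 198.51.100.23 5.5 True
```

A Received chain whose timestamps run backwards is one simple consistency check that can hint at forged hops. Real headers are messier than this sample, so a production parser needs error handling for malformed dates.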

Quick Start

Installation

pip install huntertrace

Python API

from huntertrace import HunterTrace

# Run the full 7-stage pipeline
pipeline = HunterTrace(verbose=True)
result = pipeline.run("phishing.eml")

# Generate text report
report = result.generate_report()
print(report.generate_text_report())

# Access Bayesian attribution
bayes = result.bayesian_attribution
if bayes:
    print(f"Region: {bayes.primary_region}")
    print(f"Confidence: {bayes.aci_adjusted_prob:.1%}")
    print(f"Tier: {bayes.tier} {bayes.tier_label}")

Command Line

# Single email analysis
huntertrace analyze phishing.eml --verbose

# Batch processing
huntertrace batch emails/ -o results/

# Campaign correlation (cross-email actor linking)
huntertrace campaign emails/ -o campaign_report/

Performance

Evaluated on a labeled corpus of 53 phishing emails with known ground-truth origins:

| Method | Top-1 Accuracy | Notes |
| --- | --- | --- |
| IP Geolocation Only | ~31% | Country level; industry baseline |
| Timezone Only | ~52% | Country level; VPN-resistant, coarse |
| HUNTERTRACE (Bayesian) | 52.8% | Country level; multi-signal fusion |
| HUNTERTRACE (+ Graph) | 56.6% | Region level |

95% Confidence Interval (country-level accuracy): 39.7% – 65.6% (n=53)
Webmail IP Leak Rate: 37.7% of analyzed emails
Coverage: 100% (the pipeline returned a prediction for every email)

⚠️ Note: Performance numbers are based on an initial corpus of 53 labeled emails. Larger-scale validation is in progress. Region-level accuracy (56.6%) is more reliable than country-level given current corpus size.
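
The width of the confidence interval follows directly from the small sample. The README does not say how the interval was computed. As a sketch, a Wilson score interval on 28 correct predictions out of 53 reproduces the reported bounds. The count of 28 is an assumption inferred from 52.8% of 53, and the function name is illustrative.

```python
from math import sqrt


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


low, high = wilson_interval(28, 53)  # 28/53 = 52.8% (inferred count)
print(f"{low:.1%} - {high:.1%}")     # 39.7% - 65.6%
```

Because the half-width shrinks roughly with the square root of the sample size, a corpus four times larger would be expected to about halve the interval.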

✨ Key Features

  • 🎯 Multi-Signal Attribution (8+ signals)
  • 🔓 VPN Bypass (webmail leaks, timezone)
  • 🕸️ Graph Analysis (infrastructure reuse)
  • 📊 Bayesian Fusion (probabilistic)
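
As an illustration of the graph-analysis feature, infrastructure reuse can be modeled as a bipartite graph linking emails to the infrastructure they touch. The sketch below is not HUNTERTRACE's implementation. The indicator strings, email ids, and function names are hypothetical, and degree centrality stands in for whichever centrality measure the tool actually uses.

```python
from collections import defaultdict

# Hypothetical observations: email id -> infrastructure indicators seen in it.
observations = {
    "email-1": {"asn:AS64500", "relay:mail.example.org"},
    "email-2": {"asn:AS64500", "relay:smtp.example.net"},
    "email-3": {"relay:smtp.example.net"},
    "email-4": {"asn:AS64511"},
}


def degree_centrality(observations):
    """Fraction of emails that touch each infrastructure node."""
    counts = defaultdict(int)
    for indicators in observations.values():
        for node in indicators:
            counts[node] += 1
    total = len(observations)
    return {node: count / total for node, count in counts.items()}


def linked_clusters(observations):
    """Group emails connected, directly or transitively, by shared nodes."""
    by_node = defaultdict(set)
    for email, indicators in observations.items():
        for node in indicators:
            by_node[node].add(email)
    seen, clusters = set(), []
    for start in observations:
        if start in seen:
            continue
        cluster, stack = set(), [start]
        while stack:  # depth-first walk over email -> node -> email edges
            email = stack.pop()
            if email in cluster:
                continue
            cluster.add(email)
            for node in observations[email]:
                stack.extend(by_node[node] - cluster)
        seen |= cluster
        clusters.append(sorted(cluster))
    return clusters


print(degree_centrality(observations)["asn:AS64500"])  # 0.5
print(linked_clusters(observations))
# [['email-1', 'email-2', 'email-3'], ['email-4']]
```

Here email-1 and email-3 share no indicator directly but end up in the same cluster through email-2, which is the kind of cross-email linking that campaign correlation relies on.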

🚀 Quick Start (from source)

git clone https://github.com/akshaydotweb/HunterTrace.git
cd HunterTrace
pip install -r requirements.txt

# Analyze email
python hunterTrace.py analyze phishing.eml

📖 Documentation

🔬 Evaluation

Dataset: 53 labeled phishing emails
Methodology: Ground-truth origins assigned by manual OSINT labeling

  • Top-1 Country Accuracy: 52.8%
  • Top-1 Region Accuracy: 56.6%
  • 95% Confidence Interval (country-level): 39.7% – 65.6%
  • Webmail Leak Rate: 37.7%
  • Macro F1: 0.37

See evaluation/ for full results.
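
Macro F1 averages the per-class F1 scores with equal weight, so countries with few examples count as much as common ones. That is one reason it can sit well below top-1 accuracy. The sketch below shows the standard computation on made-up labels. It is not taken from the evaluation/ directory, and the label values are hypothetical.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels, for illustration only.
y_true = ["A", "A", "A", "B"]
y_pred = ["A", "A", "B", "B"]
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy, round(macro_f1(y_true, y_pred), 3))  # 0.75 0.733
```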

🎓 Citation

@software{huntertrace2026,
  author = {[Your Name]},
  title = {HUNTERTRACE: Multi-Signal Phishing Attribution},
  year = {2026},
  url = {https://github.com/akshaydotweb/HunterTrace}
}

📄 License

MIT License - See LICENSE


Black Hat Arsenal 2026 Submission
