Lightweight email triage and phishing-analysis toolkit. Extracts headers, attachments, and links, applies heuristic checks, and produces structured insights.

PhishSage

PhishSage is a lightweight phishing-analysis toolkit that parses raw emails, inspects headers, analyzes links and domains with multi-layer heuristics, and outputs structured JSON findings for fast, automated investigation.

1. Core Functionality

PhishSage is intentionally minimal and concentrates on these essential capabilities:

Header analysis
- Extracts normalized sender-related headers (From, Reply-To, Return-Path, Message-ID)
- Parses SPF, DKIM, and DMARC results from Authentication-Results
- Performs alignment checks across From, Reply-To, and Return-Path
- Validates Message-ID domain consistency
- Checks timestamp sanity: Date header vs first Received hop
- Looks up WHOIS domain age and flags newly registered domains
- Validates MX records for the From domain
- Queries Spamhaus DBL for sender-related domains
- Aggregates all findings into structured JSON with merged alerts
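
The alignment and timestamp checks listed above can be illustrated with the standard library alone. This is a minimal sketch, not PhishSage's actual implementation: the function names, the exact-domain comparison, the 30-minute default, and the reading of "first Received hop" as the bottom-most Received header are all assumptions made for the example.

```python
from datetime import timedelta
from email import message_from_string
from email.utils import parseaddr, parsedate_to_datetime


def addr_domain(value):
    """Return the lowercased domain of an address header, or None."""
    _, addr = parseaddr(value or "")
    if "@" not in addr:
        return None
    return addr.rsplit("@", 1)[1].lower()


def alignment_alerts(msg):
    """Compare Reply-To and Return-Path domains against the From domain.

    Assumption: exact string comparison; a real check might compare
    registrable domains instead.
    """
    from_dom = addr_domain(msg.get("From"))
    alerts = []
    for header in ("Reply-To", "Return-Path"):
        dom = addr_domain(msg.get(header))
        if dom and from_dom and dom != from_dom:
            alerts.append(f"{header} domain {dom} differs from From domain {from_dom}")
    return alerts


def date_drift_exceeded(msg, max_drift=timedelta(minutes=30)):
    """Return True/False, or None when the headers cannot be compared."""
    received = msg.get_all("Received") or []
    if not received or not msg.get("Date"):
        return None
    # Each relay prepends its Received header, so the last one is the earliest hop.
    _, sep, stamp = received[-1].rpartition(";")
    if not sep:
        return None
    try:
        hop_time = parsedate_to_datetime(stamp.strip())
        sent_time = parsedate_to_datetime(msg["Date"])
        return abs(hop_time - sent_time) > max_drift
    except (TypeError, ValueError):
        # Unparseable dates, or one naive and one timezone-aware value.
        return None


RAW = """\
From: "Support" <support@example.com>
Reply-To: help@example.org
Return-Path: <bounce@example.com>
Date: Mon, 05 Jan 2026 10:00:00 +0000
Received: from mx.example.net by mail.example.test; Mon, 05 Jan 2026 12:30:00 +0000
Received: from client.example.com by mx.example.net; Mon, 05 Jan 2026 10:01:00 +0000
Subject: test

body
"""

msg = message_from_string(RAW)
print(alignment_alerts(msg))
print(date_drift_exceeded(msg))
```

In the sample message only Reply-To uses a different domain, so a single alignment alert is reported, and the one-minute gap between Date and the earliest hop stays within the assumed limit.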

Attachment processing
- Lists attachments with MIME type and size
- Extracts attachments safely (avoids overwriting existing files)
- Computes hashes (MD5, SHA1, SHA256)
- Optionally looks up attachments on VirusTotal by SHA256
- Scan attachments with YARA rules (single files, multiple files, or directories; recursive and filtered for valid .yar/.yara files)
- Verbose mode shows matched strings with offsets and hex data
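
As an illustration of the listing, extraction, and hashing steps above, the following sketch uses only the Python standard library. The helper names and the `_1`, `_2` suffix scheme for avoiding overwrites are hypothetical; PhishSage's own naming strategy is not documented here.

```python
import hashlib
from email import message_from_bytes
from pathlib import Path


def hash_bytes(data):
    """Compute the three digests named in the feature list."""
    return {name: hashlib.new(name, data).hexdigest() for name in ("md5", "sha1", "sha256")}


def unique_path(directory, filename):
    """Return a path inside directory that does not clobber an existing file."""
    # Keep only the final path component so "../x" cannot escape the directory.
    safe_name = Path(filename).name or "attachment"
    stem, suffix = Path(safe_name).stem, Path(safe_name).suffix
    candidate = Path(directory) / safe_name
    counter = 1
    while candidate.exists():
        candidate = Path(directory) / f"{stem}_{counter}{suffix}"
        counter += 1
    return candidate


def extract_attachments(raw_email, directory):
    """Write every named MIME part to directory and describe it."""
    msg = message_from_bytes(raw_email)
    Path(directory).mkdir(parents=True, exist_ok=True)
    results = []
    for part in msg.walk():
        filename = part.get_filename()
        if not filename:
            continue  # skip body parts and multipart containers
        data = part.get_payload(decode=True) or b""
        target = unique_path(directory, filename)
        target.write_bytes(data)
        results.append(
            {
                "filename": target.name,
                "mime": part.get_content_type(),
                "size": len(data),
                **hash_bytes(data),
            }
        )
    return results
```

Calling `extract_attachments(Path("mail.eml").read_bytes(), "out")` twice on the same message would write the second copy under a suffixed name rather than replacing the first.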

Link / URL analysis
- Extracts URLs from email bodies or headers
- Detects URLs using raw IP addresses instead of domains
- Flags suspicious or uncommon top-level domains (TLDs)
- Identifies excessive or nested subdomains, ignoring trivial ones (e.g., "www")
- Recognizes shortened URLs (bit.ly, tinyurl.com, etc.)
- Calculates Shannon entropy for domain and subdomain to spot obfuscation
- Performs SSL/TLS certificate inspection (issuer, validity, domain match, expiration)
- Looks up domain age via WHOIS and flags newly registered or expiring domains
- VirusTotal URL lookup for threat intelligence
- Optional redirect-chain tracing to uncover hidden destinations
- Checks for numeric-only registrable domains
- Detects URLs hosted on free or cheap hosting platforms
- Flags URLs with excessively deep paths
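
Several of the URL heuristics above are simple enough to sketch directly. The example below is an assumption-laden illustration rather than PhishSage's code: the function names are invented, the trivial-subdomain set is a placeholder, and the registrable domain is naively taken to be the last two labels (a real tool would more likely consult the Public Suffix List).

```python
import ipaddress
import math
from collections import Counter
from urllib.parse import urlsplit

TRIVIAL_SUBDOMAINS = {"www"}  # placeholder list for the example


def shannon_entropy(text):
    """Shannon entropy in bits per character; 0.0 for empty input."""
    if not text:
        return 0.0
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in Counter(text).values())


def is_ip_host(host):
    """True when the host is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def url_features(url, registrable_labels=2):
    """Collect a few heuristic features for one URL."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    uses_ip = is_ip_host(host)
    labels = [] if uses_ip else host.split(".")
    # Everything left of the (naively chosen) registrable domain is a subdomain.
    subdomains = [label for label in labels[:-registrable_labels] if label not in TRIVIAL_SUBDOMAINS]
    return {
        "host": host,
        "uses_ip": uses_ip,
        "subdomain_count": len(subdomains),
        "host_entropy": shannon_entropy(host),
        "path_depth": len([seg for seg in parts.path.split("/") if seg]),
    }


print(url_features("https://www.login.secure.example.com/a/b/c"))
print(url_features("http://192.168.10.5/login"))
```

Higher entropy values suggest more random-looking host names; where to draw the line between normal and suspicious is a tuning decision that this sketch leaves open.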

2. Environment Setup

# 1. (Optional) Create and activate a virtual environment
python3 -m venv venv

# Linux / macOS
source venv/bin/activate

# Windows (PowerShell)
venv\Scripts\Activate.ps1

# 2. Install PhishSage from PyPI
pip install phishsage

# 3. (Optional) Set VirusTotal API key

# Linux / macOS
export VIRUSTOTAL_API_KEY="your_virustotal_api_key"

# Windows (PowerShell); setx takes effect in new sessions only
setx VIRUSTOTAL_API_KEY "your_virustotal_api_key"
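
Because `setx` only affects sessions started afterwards, it can be useful to confirm that the key is visible to the current Python process. The helper below is a small hypothetical check, not part of PhishSage:

```python
import os


def get_virustotal_key(environ=os.environ):
    """Return the configured key, or None when it is missing or blank."""
    key = environ.get("VIRUSTOTAL_API_KEY", "").strip()
    return key or None


if get_virustotal_key() is None:
    print("VIRUSTOTAL_API_KEY is not set in this session")
```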

3. CLI Usage

PhishSage provides a command-line interface with three modes: headers, attachment, and links. Each mode accepts a --json flag for raw JSON output, as the per-mode help below shows.

Main Help

phishsage -h

Output:

usage: phishsage [-h] {headers,attachment,links} ...

PhishSage

positional arguments:
  {headers,attachment,links}
    headers             Analyze email headers for anomalies or indicators
    attachment          Analyze or extract attachments
    links               Analyze links in email content

options:
  -h, --help            show this help message and exit

Header Analysis

phishsage headers -h

Options:

usage: phishsage headers [-h] -f FILE [--heuristics] [--json]

options:
  -h, --help       show this help message and exit
  -f, --file FILE  Email file to analyze (.eml)
  --heuristics     Run heuristic header analysis for anomalies
  --json           Output results in raw JSON format

Attachment Processing

phishsage attachment -h

Options:

usage: phishsage attachment [-h] -f FILE [--list] [--extract DIR] [--hash] [--scan] [--yara PATH [PATH ...]] [--yara-verbose] [--json]

options:
  -h, --help              show this help message and exit
  -f, --file FILE         Email file to analyze (.eml)
  --list                  List attachments only
  --extract DIR           Extract attachments to specified directory
  --hash                  Compute hashes (MD5, SHA1, SHA256) for each attachment
  --scan                  Check attachments against VirusTotal by SHA256
  --yara PATH [PATH ...]  Scan attachments with YARA rules. Paths can be files or directories; directories are scanned recursively for .yar/.yara files.
  --json                  Output results in raw JSON format

Link / URL Analysis

phishsage links -h

Options:

usage: phishsage links [-h] -f FILE [--extract] [--scan] [--check-redirects | --heuristics] [--include-redirects] [--json]

options:
  -h, --help           show this help message and exit
  -f, --file FILE      Email file to analyze (.eml)
  --extract            Extract all URLs found in the email body or headers
  --scan               Submit extracted links to VirusTotal for analysis
  --check-redirects    Follow and display final redirect destinations for each URL
  --heuristics         Run phishing heuristics on extracted URLs
  --include-redirects  Include redirect chain when running heuristics (ignored if --heuristics not used)
  --json               Output results in raw JSON format
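
For automated triage, the CLI can be driven from a script. The wrapper below is a sketch that uses only the modes and flags shown in the help output above. It assumes that `--json` prints a single JSON document to stdout and that the command exits non-zero on failure; neither behaviour is documented here, and the output schema is not specified, so the result is returned as parsed but otherwise untouched.

```python
import json
import subprocess

MODES = {"headers", "attachment", "links"}


def build_command(mode, eml_path, *flags):
    """Assemble a phishsage invocation that requests JSON output."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    return ["phishsage", mode, "-f", str(eml_path), *flags, "--json"]


def run_phishsage(mode, eml_path, *flags, timeout=120):
    """Run the CLI and parse its stdout as JSON."""
    proc = subprocess.run(
        build_command(mode, eml_path, *flags),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return json.loads(proc.stdout)


# Example (requires phishsage on PATH and a real .eml file):
# findings = run_phishsage("headers", "suspicious.eml", "--heuristics")
```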

4. Configuration

PhishSage stores configuration values in the project config (config.toml) or environment variables. The main items you may safely adjust are:

- VIRUSTOTAL_API_KEY: API key for VirusTotal scans.
- MAX_REDIRECTS: Maximum number of redirects to follow when checking redirect chains.
- THRESHOLD_YOUNG, THRESHOLD_EXPIRING: Domain age/expiry thresholds (in days). Domains younger than THRESHOLD_YOUNG or expiring within THRESHOLD_EXPIRING days are flagged as potentially suspicious.
- FREE_HOSTING_PROVIDERS, SUSPICIOUS_TLDS, SHORTENERS: Heuristic lists used in URL/link analysis.
- SUBDOMAIN_THRESHOLD, TRIVIAL_SUBDOMAINS: Used for subdomain heuristics to identify excessive or meaningful subdomains.
- FREE_EMAIL_DOMAINS: Free email providers that may indicate disposable or less-trusted addresses.
- DATE_RECEIVED_DRIFT_MINUTES: Maximum allowed difference between the Date header and the first Received hop in email headers.

Note: Only modify thresholds or heuristic lists if you understand the potential impact on false positives and overall detection accuracy.
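
To make the effect of THRESHOLD_YOUNG and THRESHOLD_EXPIRING concrete, here is a minimal sketch of how such thresholds could be applied to WHOIS dates. The default values, the function name, and the flag labels are hypothetical; PhishSage's actual defaults and output fields are not stated in this document.

```python
from datetime import date

# Hypothetical values chosen for the example only.
THRESHOLD_YOUNG = 30
THRESHOLD_EXPIRING = 30


def domain_age_flags(created, expires, today,
                     young_days=THRESHOLD_YOUNG,
                     expiring_days=THRESHOLD_EXPIRING):
    """Flag domains registered recently or about to expire."""
    flags = []
    if created is not None and (today - created).days < young_days:
        flags.append("newly_registered")
    if expires is not None and 0 <= (expires - today).days <= expiring_days:
        flags.append("expiring_soon")
    return flags


today = date(2026, 1, 4)
print(domain_age_flags(date(2025, 12, 20), date(2026, 12, 20), today))
print(domain_age_flags(date(2020, 1, 1), date(2026, 1, 20), today))
```

Lowering either threshold reduces the number of flagged domains, which trades false positives for missed detections.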

5. Scope & Limitations

- Focused functionality: PhishSage is not a full mail forensic suite. It prioritizes heuristics, quick triage, and enrichment over deep forensic analysis.
- Network-dependent checks: WHOIS, VirusTotal, MX, Spamhaus DBL, and SSL/TLS inspections rely on external services; results may vary or fail because of connectivity issues or API limits.
- Attachment processing: Currently limited to listing, extraction, hashing, YARA rule scanning, and optional VirusTotal scans. Full heuristic attachment analysis will be introduced in a future release.
- Output formats: JSON output is available for all modes via the --json flag.
- Intended use: Designed for investigative support and enrichment. Not intended for automated blocking or enforcement in production email systems.
- Evolving coverage: The checks in each section are currently limited; additional heuristics and enhanced analyses will be added in future releases.

6. Contributing

Contributions to PhishSage are welcome! You can help improve the project by:

- Adding or refining heuristic checks for headers, attachments, and links.
- Expanding the lists in config.toml.
- Improving parsing, normalization, or output handling.
- Reporting bugs or suggesting enhancements.

Before submitting changes, please ensure they are well-tested and maintain the code’s clarity, security, and reliability. Contributions that enhance detection coverage, reduce false positives, or improve usability are particularly appreciated.

Release history

- 2.1.0 (Apr 3, 2026)
- 2.0.0 (Apr 1, 2026)
- 1.3.0 (Feb 15, 2026)
- 1.2.1 (Feb 14, 2026)
- 1.2.0 (Feb 14, 2026, yanked)
- 1.1.0 (Jan 6, 2026)
- 1.0.0.post3 (Jan 4, 2026, this version)
- 0.0.0a1 (Nov 15, 2025, pre-release)

Download files

Source Distribution

phishsage-1.0.0.post3.tar.gz (32.7 kB), uploaded Jan 4, 2026

Built Distribution

phishsage-1.0.0.post3-py3-none-any.whl (33.5 kB), uploaded Jan 4, 2026, Python 3

File details

Details for the file phishsage-1.0.0.post3.tar.gz.

File metadata

Download URL: phishsage-1.0.0.post3.tar.gz
Upload date: Jan 4, 2026
Size: 32.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for phishsage-1.0.0.post3.tar.gz
Algorithm	Hash digest
SHA256	`bf30c3e943e4d4863c4acc4db7c95cf2e1b22776b57a9e1876b76a4693820bf2`
MD5	`a770567aef37224e0c7b3f7f8a379807`
BLAKE2b-256	`3c1842ab3f3ce583f64a5424c15e0f12ba26c32f566ec8bf11003fd168fba785`

File details

Details for the file phishsage-1.0.0.post3-py3-none-any.whl.

File metadata

Download URL: phishsage-1.0.0.post3-py3-none-any.whl
Upload date: Jan 4, 2026
Size: 33.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for phishsage-1.0.0.post3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`0cf85f9188f057c921b543b405ed1e25b18ccb1bdfedf41db7cd2dfc1b3da4c6`
MD5	`3327816556a845931bb24252773ea233`
BLAKE2b-256	`32708e617b7dca16eb7d5b8e4d4c5f2ebf1f3aa35f294620031e4e3406e28740`
