Phishing Email Forensic Analyzer: generates cyber-infographic PNGs and interactive HTML reports
PEFA: Phishing Email Forensic Analyzer
A Python CLI tool that converts .eml files into cyber-infographic PNGs and interactive HTML reports. PEFA performs automated forensic analysis of phishing indicators and produces a composite threat score (0–100) backed by multiple detection engines and optional threat intelligence APIs.
Sample reports (17 total)
| Report | PNG | HTML |
|---|---|---|
| ATTENTION DEAR | PNG | HTML |
| Congratulations Dear | PNG | HTML |
| Dear Friend | PNG | HTML |
| Dear Winner | PNG | HTML |
| File | PNG | HTML |
| Greetings to you | PNG | HTML |
| HAPPY NEW YEAR! | PNG | HTML |
| Konto-Überprüefig (Swiss German) | PNG | HTML |
| Online Bank Of Africa | PNG | HTML |
| Please I Need Your Urgent Attention | PNG | HTML |
| INSTRUCTION TO CREDIT YOUR ACCOUNT ($25M) | PNG | HTML |
| THIS IS YOUR ATM VISA CARD | PNG | HTML |
| Text or Call +1 225 463 0148 | PNG | HTML |
| URGENT RESPONSE | PNG | HTML |
| Votre colis est prêt pour la livraison | PNG | HTML |
| Your Funds Update! | PNG | HTML |
| original_msg | PNG | HTML |
Features
- Threat Scoring: Weighted 0–100 composite score across 7 categories with 5 severity levels (Clean / Low / Medium / High / Critical)
- Link Analysis: HREF mismatches, brand lookalikes, homoglyph domains, IP-based URLs, URL shorteners, suspicious TLDs, JavaScript/data URIs
- Sender Spoofing Detection: Display name spoofing, Return-Path/Reply-To mismatches, domain impersonation, homoglyph characters
- Urgency Language Scanning: 24 social-engineering pressure patterns, generic greeting detection, keyword density scoring
- Attachment Threat Assessment: 40+ dangerous extensions, macro-enabled documents, double extensions, MIME mismatches, file hashing (MD5/SHA256)
- Authentication Checks: SPF, DKIM, and DMARC validation from headers (with optional MXToolbox deep validation)
- Delivery Path Tracing: Full email hop trace with IP geolocation per relay
- Domain Age Lookup: WHOIS-based registration date and age risk assessment
- Language Quality Analysis: Mixed-script detection, entropy analysis, zero-width characters, irregular spacing
- IOC Extraction: Consolidated Indicators of Compromise (IPs, domains, URLs, emails, hashes) with optional enrichment
- AI Assessment: Optional Google Gemini analysis with verdict, confidence score, attack classification, and recommended actions
- Interactive HTML Reports: Collapsible sections, scroll-spy navigation, copy-to-clipboard, animated threat gauge, tooltips
- Batch Processing: Analyze entire directories of .eml files with a single command
- Web UI: Browser-based upload interface with live analysis (no Playwright needed client-side)
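To make the "HREF mismatch" indicator from the list above concrete, here is a minimal sketch using only the Python standard library. It is not PEFA's LinkAnalyzer; the class and function names (AnchorCollector, find_href_mismatches) are hypothetical, and the rule shown (flag a link when its visible text looks like a domain but names a different host than the href) is an assumed simplification.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class AnchorCollector(HTMLParser):
    """Collect (href, visible text) pairs for every <a> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def _host(value):
    """Return the lowercase hostname of a URL-like string, or None."""
    if "://" not in value:
        value = "http://" + value
    try:
        return urlparse(value).hostname
    except ValueError:
        return None


def find_href_mismatches(html):
    """Return links whose visible text names a different host than the href."""
    collector = AnchorCollector()
    collector.feed(html)
    mismatches = []
    for href, text in collector.links:
        # Only compare when the visible text itself looks like a URL or domain.
        if " " in text or "." not in text:
            continue
        shown, actual = _host(text), _host(href)
        if shown and actual and shown != actual:
            mismatches.append({"href": href, "text": text})
    return mismatches
```

A link that displays www.paypal.com but points at a raw IP address would be reported, while "Click here" style links are ignored because their text does not claim a destination.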
Installation
pip install pefa
playwright install chromium
Or install from source:
pip install .
playwright install chromium
Requires Python 3.10+.
Quick Start
# Analyze a single email and produce a PNG infographic
pefa input.eml
# Also generate an interactive HTML report
pefa input.eml --html
# Include Gemini AI assessment
pefa input.eml --gemini
# Skip all external API calls (fully offline)
pefa input.eml --no-api
# Batch process a directory
pefa ./emails/ -o ./reports/
# Launch the web UI
pefa --web --port 8080
Or run as a module:
python3 -m pefa input.eml
Usage Examples
Analyze a single email
pefa suspicious-email.eml
This produces suspicious-email.png in the current directory: a full-page infographic with threat score, sender analysis, link flags, authentication results, and the rendered email body.
Generate both PNG and interactive HTML
pefa suspicious-email.eml --html
Outputs two files: suspicious-email.png and suspicious-email.html. The HTML report includes collapsible sections, scroll-spy navigation, an animated threat gauge, copy-to-clipboard for IOCs, and print/download buttons.
Save output to a specific location
pefa suspicious-email.eml -o ./reports/case-42.png
pefa suspicious-email.eml -o ./reports/case-42.png --html
The -o flag sets the output path. When combined with --html, the HTML file is placed alongside the PNG.
Batch process a folder of emails
pefa ./inbox/ -o ./reports/
Analyzes every .eml file in ./inbox/ and writes reports to ./reports/. A single Playwright browser instance is reused across all files for faster processing. If -o is omitted, reports go to ./inbox/reports/.
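The output layout described above can be sketched as a small planning function. This is an illustration only: plan_batch_outputs is a hypothetical helper, and it assumes batch mode names each report after the input file's stem, as single-file mode does.

```python
from pathlib import Path


def plan_batch_outputs(input_dir, output_dir=None, html=False):
    """Map each .eml file in input_dir to the report files a batch run would write."""
    input_dir = Path(input_dir)
    # Per the text above, reports default to <input>/reports/ when -o is omitted.
    target = Path(output_dir) if output_dir else input_dir / "reports"
    plan = {}
    for eml in sorted(input_dir.glob("*.eml")):
        # Assumption: report names reuse the input file's stem.
        outputs = [target / f"{eml.stem}.png"]
        if html:
            outputs.append(target / f"{eml.stem}.html")
        plan[eml] = outputs
    return plan
```

Calling plan_batch_outputs("./inbox/", html=True) lists the PNG and HTML paths expected under ./inbox/reports/ without running any analysis.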
Include AI-powered assessment
export GEMINI_API_KEY="your-key-here"
pefa suspicious-email.eml --gemini
Adds a Gemini AI section to the report with a verdict (phishing/legitimate/suspicious), confidence score, attack type classification, and recommended actions. The AI assessment can also influence the overall threat score (+25 or +50 points).
To use a different Gemini model:
pefa suspicious-email.eml --gemini --gemini-model gemini-2.5-pro
Run fully offline (no API calls)
pefa suspicious-email.eml --no-api
Skips all external lookups (IP geolocation, WHOIS, urlscan, VirusTotal, AbuseIPDB, AlienVault, MXToolbox). The analysis still runs SPF/DKIM/DMARC checks from headers, link analysis, urgency detection, and attachment scanning, all locally.
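A header-only SPF/DKIM/DMARC check needs no network access, which is why it still works offline. The sketch below shows one way to read those verdicts from Authentication-Results headers with the standard library. The function name auth_results and the regular expression are assumptions for illustration, not PEFA's actual parser.

```python
import re
from email import message_from_string

# Matches result tokens such as "spf=pass", "dkim=fail", or "dmarc=none".
_RESULT = re.compile(r"\b(spf|dkim|dmarc)\s*=\s*([a-z]+)", re.IGNORECASE)


def auth_results(raw_email):
    """Return the SPF, DKIM, and DMARC verdicts found in Authentication-Results."""
    message = message_from_string(raw_email)
    results = {"spf": None, "dkim": None, "dmarc": None}
    for header in message.get_all("Authentication-Results", []):
        for method, verdict in _RESULT.findall(header):
            key = method.lower()
            # Keep the first verdict seen for each method.
            if results[key] is None:
                results[key] = verdict.lower()
    return results
```

A value of None means the header did not report that method at all, which is worth treating differently from an explicit "fail".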
Customize image dimensions
pefa suspicious-email.eml --width 1400 --scale 2
--width sets the viewport width in pixels (default: 1000). --scale sets the device scale factor (default: 1.5); higher values produce sharper images at larger file sizes.
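As a rough guide to how the two options interact, a browser screenshot is normally viewport width times device scale factor pixels wide. The helper below is a hypothetical illustration of that arithmetic and assumes PEFA's PNG follows the same relationship; the image height depends on the report content and is not predicted here.

```python
def expected_pixel_width(width=1000, scale=1.5):
    """Approximate PNG width in device pixels for a viewport width and scale factor."""
    return round(width * scale)
```

Under that assumption the defaults give a 1500-pixel-wide image, and --width 1400 --scale 2 gives 2800 pixels.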
Launch the web UI
pefa --web
Opens a browser-based drag-and-drop interface at http://localhost:8080. Upload .eml files and view interactive HTML reports directly; no Playwright is needed on the client side.
pefa --web --port 9090
pefa --web --no-api
pefa --web --gemini
The web UI respects --no-api and --gemini flags.
Combine multiple flags
# Full analysis with AI, HTML output, and high-res image
pefa suspicious-email.eml --html --gemini --width 1200 --scale 2
# Batch process offline with HTML reports
pefa ./inbox/ -o ./reports/ --html --no-api
# Run the sample emails included in the repo
pefa samples/ -o examples/ --html
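When PEFA is driven from another Python script, the flag combinations above can be assembled programmatically. The sketch below only builds the argument list from flags documented in this README; build_pefa_command and format_command are hypothetical helper names, and nothing here calls into PEFA itself.

```python
import shlex


def build_pefa_command(source, output=None, html=False, gemini=False,
                       gemini_model=None, no_api=False, width=None, scale=None):
    """Build an argv list for the pefa CLI using only flags documented in this README."""
    argv = ["pefa", str(source)]
    if output is not None:
        argv += ["-o", str(output)]
    if html:
        argv.append("--html")
    if gemini:
        argv.append("--gemini")
    if gemini_model:
        argv += ["--gemini-model", gemini_model]
    if no_api:
        argv.append("--no-api")
    if width is not None:
        argv += ["--width", str(width)]
    if scale is not None:
        argv += ["--scale", str(scale)]
    return argv


def format_command(argv):
    """Render an argv list as a shell-safe string for logging."""
    return shlex.join(argv)
```

The resulting list can be passed to subprocess.run(argv, check=True) once pefa is installed and on the PATH.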
Use with threat intel API keys
Set any combination of API keys to enrich reports with external intelligence:
export GEMINI_API_KEY="..." # AI assessment
export URLSCAN_API_KEY="..." # Domain reputation
export VT_API_KEY="..." # VirusTotal IOC reputation
export ABUSEIPDB_API_KEY="..." # IP abuse reports
export OTX_API_KEY="..." # AlienVault OTX threat intel
export MXTOOLBOX_API_KEY="..." # Deep email auth validation
pefa suspicious-email.eml --html --gemini
Each integration activates independently, so you don't need all keys. Missing keys are silently skipped.
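The activation rule can be sketched as a lookup over environment variables. This is an illustration under stated assumptions, not PEFA's code: active_integrations is a hypothetical name, empty values are treated as missing, Gemini additionally requires the --gemini flag, and --no-api is assumed to disable every keyed integration.

```python
import os

# Environment variable for each optional integration, as listed in this README.
INTEGRATION_ENV_VARS = {
    "Google Gemini": "GEMINI_API_KEY",
    "urlscan.io": "URLSCAN_API_KEY",
    "VirusTotal": "VT_API_KEY",
    "AbuseIPDB": "ABUSEIPDB_API_KEY",
    "AlienVault OTX": "OTX_API_KEY",
    "MXToolbox": "MXTOOLBOX_API_KEY",
}


def active_integrations(environ=None, gemini_flag=False, no_api=False):
    """Return the names of keyed integrations that would run for this invocation."""
    environ = os.environ if environ is None else environ
    if no_api:
        # Assumption: --no-api switches off every keyed integration.
        return []
    active = []
    for service, variable in INTEGRATION_ENV_VARS.items():
        if not environ.get(variable):
            continue  # missing or empty key: skipped silently
        if service == "Google Gemini" and not gemini_flag:
            continue  # Gemini also needs the explicit --gemini flag
        active.append(service)
    return active
```

Passing a plain dictionary as environ makes it easy to check a configuration without touching the real environment.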
CLI Reference
usage: pefa [-h] [--web] [--port PORT] [-o OUTPUT] [--width WIDTH]
[--scale SCALE] [--html] [--gemini]
[--gemini-model MODEL] [--no-api]
[input]
positional arguments:
input .eml file or directory of .eml files
options:
-o, --output Output path for generated reports
--web Start browser-based web UI
--port Web server port (default: 8080)
--width Viewport width in pixels (default: 1000)
--scale Device scale factor (default: 1.5)
--html Emit interactive HTML report alongside PNG
--gemini Include Gemini AI assessment
--gemini-model Gemini model to use (default: gemini-2.5-flash)
--no-api Skip all external API lookups
Threat Scoring
PEFA calculates a composite threat score from 0 to 100 using weighted categories:
| Category | Max Points | What It Measures |
|---|---|---|
| Authentication | 20 | SPF, DKIM, DMARC failures |
| Sender | 20 | Spoofing, homoglyphs, header mismatches |
| Links | 25 | HREF mismatches, brand lookalikes, IP URLs, shorteners |
| Urgency | 15 | Pressure language patterns, generic greetings |
| Attachments | 10 | Dangerous extensions, macros, double extensions |
| Language | 5 | Mixed scripts, entropy anomalies, quality issues |
| Domain Age | 10 | Newly registered or young domains |
Passing all authentication checks and having an established domain (3+ years) applies negative scoring. Gemini AI verdicts can add up to +50 additional points.
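One plausible way to combine these categories is sketched below. The caps come from the table above; everything else is an assumption made for illustration, including the hypothetical name composite_score, the per-category capping, and the final clamp to 0–100 (the listed caps sum to 105, so some form of clamping seems likely). PEFA's scoring.py may differ.

```python
# Maximum points per category, copied from the table above.
CATEGORY_CAPS = {
    "authentication": 20,
    "sender": 20,
    "links": 25,
    "urgency": 15,
    "attachments": 10,
    "language": 5,
    "domain_age": 10,
}


def composite_score(category_points, ai_bonus=0):
    """Sum capped category points plus an optional AI bonus, clamped to 0-100."""
    total = 0
    for category, points in category_points.items():
        # Negative points model the score reductions for passing checks.
        total += min(points, CATEGORY_CAPS[category])
    total += ai_bonus
    # Assumption: the final score is clamped to the documented 0-100 range.
    return max(0, min(100, total))
```

For example, 40 raw link points are capped at 25 before being added to 10 urgency points, giving 35.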
Threat Levels:
| Level | Score |
|---|---|
| Critical | 70–100 |
| High | 45–69 |
| Medium | 25–44 |
| Low | 10–24 |
| Clean | 0–9 |
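The level thresholds in the table translate directly into a lookup. The function name threat_level is hypothetical; the boundaries are the ones listed above.

```python
# (minimum score, level) pairs from the table, highest threshold first.
THREAT_LEVELS = [
    (70, "Critical"),
    (45, "High"),
    (25, "Medium"),
    (10, "Low"),
    (0, "Clean"),
]


def threat_level(score):
    """Return the severity label for a composite score between 0 and 100."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for minimum, label in THREAT_LEVELS:
        if score >= minimum:
            return label
```

Ordering the thresholds from highest to lowest lets the first match win, so each boundary value (10, 25, 45, 70) falls into the higher level.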
API Integrations
All API integrations are optional. PEFA works fully offline with --no-api. Each integration checks for its own environment variable and silently skips if unavailable. No API key is required to run a basic analysis: PEFA performs link analysis, urgency detection, sender spoofing checks, attachment scanning, authentication header parsing, and threat scoring entirely locally.
Overview
| Service | Environment Variable | Free? | What It Adds to Reports |
|---|---|---|---|
| Google Gemini | GEMINI_API_KEY | Free tier available | AI verdict, attack classification, recommended actions |
| urlscan.io | URLSCAN_API_KEY | Free tier available | URL/domain reputation verdicts |
| MXToolbox | MXTOOLBOX_API_KEY | Paid | Deep SPF/DKIM/DMARC validation against live DNS |
| VirusTotal | VT_API_KEY | Free tier available | IOC reputation (IPs, domains, URLs, file hashes) |
| AbuseIPDB | ABUSEIPDB_API_KEY | Free tier available | IP abuse confidence scores and report counts |
| AlienVault OTX | OTX_API_KEY | Free | Threat intelligence pulse counts and reputation |
| ip-api.com | (none) | Free | IP geolocation for delivery path hops |
| WHOIS | (none) | Free | Domain registration age and registrar info |
Getting API Keys
Google Gemini
Sign up at Google AI Studio to get a free API key. The free tier provides generous rate limits suitable for individual use.
export GEMINI_API_KEY="your-key-here"
Gemini provides an AI-powered phishing assessment that includes a verdict (phishing / suspicious / legitimate), confidence percentage, executive summary, technical analysis, attack type classification, and recommended actions. It can also boost the threat score by up to +50 points.
# Use default model (gemini-2.5-flash)
pefa email.eml --gemini
# Use a more capable model
pefa email.eml --gemini --gemini-model gemini-2.5-pro
Note: The --gemini flag is required to activate AI analysis even if GEMINI_API_KEY is set. This keeps AI calls explicit.
urlscan.io
Sign up at urlscan.io for a free account. Navigate to your profile to find your API key.
export URLSCAN_API_KEY="your-key-here"
When suspicious links are detected, PEFA queries urlscan.io for domain reputation data including overall verdict (malicious/suspicious/benign), page metadata, and redirect statistics. Results link directly to the urlscan.io result page for manual investigation.
MXToolbox
Sign up at MXToolbox for an API subscription.
export MXTOOLBOX_API_KEY="your-key-here"
Performs live DNS-based validation of SPF, DKIM, and DMARC records for the sender's domain. This goes beyond parsing email headers: it checks the actual DNS configuration. If MXToolbox results contradict the header claims (e.g., headers say DKIM pass but DNS shows a failure), PEFA flags the discrepancy as a warning.
VirusTotal
Sign up at VirusTotal for a free community account. Your API key is available on your profile page.
export VT_API_KEY="your-key-here"
Enriches extracted IOCs with multi-vendor detection results:
- IPs (up to 5): malicious/suspicious/harmless detection counts and reputation score
- Domains (up to 5): same detection breakdown plus reputation
- URLs (up to 3): vendor detection counts
- File hashes (up to 5): detection counts and meaningful filenames
Free tier: 4 requests/minute, 500 requests/day, 15.5K requests/month.
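File-hash lookups depend on the MD5/SHA256 digests computed for each attachment. A minimal sketch of that hashing step with hashlib follows; file_hashes is a hypothetical helper, not PEFA's attachment analyzer, and it reads in chunks so large attachments do not need to fit in memory.

```python
import hashlib


def file_hashes(path, chunk_size=65536):
    """Return the MD5 and SHA256 hex digests of a file, reading it in chunks."""
    md5 = hashlib.md5()
    sha256 = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
            sha256.update(chunk)
    return {"md5": md5.hexdigest(), "sha256": sha256.hexdigest()}
```

The SHA256 value is the one to prefer when searching reputation services; MD5 is kept only because many older IOC feeds still index by it.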
AbuseIPDB
Sign up at AbuseIPDB for a free account.
export ABUSEIPDB_API_KEY="your-key-here"
Checks IP addresses (up to 5) against AbuseIPDB's crowd-sourced abuse report database. Returns an abuse confidence score (0–100), total number of reports, whitelist status, country, and ISP. Queries cover reports from the last 90 days.
Free tier: 1,000 checks/day.
AlienVault OTX
Sign up at AlienVault OTX for a free account.
export OTX_API_KEY="your-key-here"
Queries the Open Threat Exchange for community-sourced threat intelligence. Returns pulse counts (how many threat intelligence reports reference the IOC) and reputation scores for:
- IPs (up to 5)
- Domains (up to 5)
- URLs (up to 3)
- File hashes (up to 5)
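The per-type limits above keep enrichment within free-tier quotas. A sketch of how such caps could be applied before any lookup is shown below. The name select_iocs_for_lookup, the dictionary keys, and the choice to de-duplicate and keep the first entries in order are all assumptions; the README does not say which indicators PEFA prioritizes.

```python
# Lookup caps per indicator type, as documented above.
IOC_LIMITS = {"ips": 5, "domains": 5, "urls": 3, "hashes": 5}


def select_iocs_for_lookup(iocs, limits=IOC_LIMITS):
    """Return de-duplicated indicators truncated to the per-type lookup caps."""
    selected = {}
    for kind, limit in limits.items():
        # dict.fromkeys removes duplicates while preserving first-seen order.
        unique = list(dict.fromkeys(iocs.get(kind, [])))
        selected[kind] = unique[:limit]
    return selected
```

Indicator types without a documented cap, such as email addresses, are simply left out of the lookup set in this sketch.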
Setting Up All API Keys
For maximum enrichment, configure all keys in your shell profile (~/.bashrc, ~/.zshrc, etc.):
# AI assessment (also requires the --gemini flag at run time)
export GEMINI_API_KEY="your-gemini-key"
# Threat intelligence (activate automatically when set)
export URLSCAN_API_KEY="your-urlscan-key"
export VT_API_KEY="your-virustotal-key"
export ABUSEIPDB_API_KEY="your-abuseipdb-key"
export OTX_API_KEY="your-alienvault-key"
# Email authentication
export MXTOOLBOX_API_KEY="your-mxtoolbox-key"
Then run with full enrichment:
pefa email.eml --html --gemini
API Usage Examples
# Fully offline: no API calls at all
pefa email.eml --no-api
# Basic analysis with free APIs only (ip-api.com + WHOIS)
# No env vars needed
pefa email.eml
# Add AI assessment only
export GEMINI_API_KEY="..."
pefa email.eml --gemini
# IOC enrichment with VirusTotal + AbuseIPDB
export VT_API_KEY="..."
export ABUSEIPDB_API_KEY="..."
pefa email.eml --html
# Full enrichment: all APIs + AI + HTML report
export GEMINI_API_KEY="..."
export VT_API_KEY="..."
export ABUSEIPDB_API_KEY="..."
export OTX_API_KEY="..."
export URLSCAN_API_KEY="..."
export MXTOOLBOX_API_KEY="..."
pefa email.eml --html --gemini
# Batch process with full enrichment
pefa ./emails/ -o ./reports/ --html --gemini
How APIs Affect the Report
Without any API keys, PEFA still performs:
- Header-based SPF/DKIM/DMARC checks
- Link analysis (mismatches, brand impersonation, homoglyphs, suspicious TLDs)
- Sender spoofing detection
- Urgency language scanning
- Attachment threat assessment
- Language quality analysis
- Threat scoring (0–100)
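Several of these local checks are simple text statistics. As an illustration of the language quality signals (entropy, zero-width characters, mixed scripts), here is a standard-library sketch. The function names and the set of zero-width code points are assumptions, and the script detection relies on the first word of each character's Unicode name, which is a rough heuristic rather than PEFA's method.

```python
import math
import unicodedata
from collections import Counter

# Common invisible characters used to break up keywords (assumed list).
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}


def shannon_entropy(text):
    """Return the Shannon entropy of text in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def zero_width_count(text):
    """Count zero-width characters hidden in text."""
    return sum(1 for ch in text if ch in ZERO_WIDTH)


def scripts_used(word):
    """Return the scripts (first word of the Unicode name) of the letters in word."""
    scripts = set()
    for ch in word:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split(" ")[0])
    return scripts
```

A single word that mixes scripts, such as "paypal" written with a Cyrillic "а", returns both LATIN and CYRILLIC and is a typical homoglyph signal.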
Adding API keys progressively enriches the report:
| APIs Configured | Additional Report Sections |
|---|---|
| (none) | Base analysis with all local checks |
| + GEMINI_API_KEY | AI Assessment panel with verdict, confidence, attack classification |
| + VT_API_KEY | IOC table with VirusTotal detection counts per indicator |
| + ABUSEIPDB_API_KEY | IP abuse confidence scores in IOC table |
| + OTX_API_KEY | Threat intelligence pulse counts in IOC table |
| + URLSCAN_API_KEY | URL reputation verdicts in link analysis section |
| + MXTOOLBOX_API_KEY | Deep DNS validation results in authentication section |
Architecture
.eml file → parser.py → pipeline.run_analysis() → PageRenderer.build() → Playwright → .png/.html
pefa/
├── cli.py                    # CLI argument parsing and entry point
├── parser.py                 # .eml parsing and header extraction
├── pipeline.py               # Analysis orchestrator
├── scoring.py                # Weighted threat score calculation
├── highlighting.py           # Email body highlighting (urgency keywords, suspicious links)
├── constants.py              # TLDs, shorteners, extensions, regex patterns, homoglyphs
├── deps.py                   # Centralized optional dependency imports
├── analyzers/
│   ├── links.py              # LinkAnalyzer: URL and domain analysis
│   ├── sender.py             # SenderAnalyzer: spoofing and impersonation
│   ├── urgency.py            # UrgencyAnalyzer: pressure language patterns
│   ├── attachments.py        # AttachmentAnalyzer: file threat assessment
│   ├── language.py           # LanguageAnalyzer: text quality and encoding
│   └── ioc_consolidator.py   # IOC extraction and enrichment
├── api/
│   ├── ip_lookup.py          # IP geolocation (ip-api.com)
│   ├── gemini.py             # Google Gemini AI assessment
│   ├── urlscan.py            # urlscan.io domain reputation
│   ├── mxtoolbox.py          # SPF/DKIM/DMARC validation
│   ├── whois_client.py       # Domain WHOIS lookup
│   ├── virustotal.py         # VirusTotal IOC lookup
│   ├── abuseipdb.py          # AbuseIPDB IP reputation
│   └── alienvault.py         # AlienVault OTX intelligence
├── renderers/
│   ├── page.py               # Full HTML page assembly
│   └── widgets/              # 13 analysis section widgets
└── templates/
    ├── css/                  # Dark theme, interactive styling
    └── js/                   # Section navigation, animations, interactivity
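The first stage of the pipeline, turning an .eml file into structured fields, can be approximated with Python's standard email package. The sketch below is not the contents of parser.py; parse_eml and the returned keys are hypothetical and only show the kind of data the later analyzers would need.

```python
from email import policy
from email.parser import BytesParser


def parse_eml(raw_bytes):
    """Extract the headers, body, and attachment names that later analysis needs."""
    message = BytesParser(policy=policy.default).parsebytes(raw_bytes)
    # Prefer the HTML body when present, otherwise fall back to plain text.
    body = message.get_body(preferencelist=("html", "plain"))
    return {
        "subject": message["Subject"],
        "from": message["From"],
        "reply_to": message["Reply-To"],
        "return_path": message["Return-Path"],
        "received": message.get_all("Received", []),
        "body": body.get_content() if body is not None else "",
        "attachments": [part.get_filename() for part in message.iter_attachments()],
    }
```

Reading a file is then a matter of parse_eml(Path("suspicious-email.eml").read_bytes()); the Received list appears in the same top-to-bottom order as the headers, which places the most recent relay first.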
Output
PNG mode (default) produces a single infographic image containing all analysis sections: threat gauge, sender analysis, authentication status, link flags, urgency patterns, attachments, domain age, delivery path, IP geolocation, and the rendered email body in a sandboxed frame.
HTML mode (--html) additionally produces an interactive report with collapsible sections, scroll-spy navigation, animated gauges, copy-to-clipboard for IOCs, and download/print buttons.
Web UI (--web) serves a browser-based interface for uploading .eml files and viewing analysis results interactively without needing Playwright installed on the client.
Sample Emails
The samples/ directory contains example phishing emails (419 scams, social engineering, impersonation) for testing. Pre-generated reports are available in examples/.
pefa samples/
License
See pyproject.toml for package metadata.