openSquat - Detection of domain squatting, typosquatting, IDN homograph attacks, and phishing threats

openSquat Core

What is openSquat?

openSquat is an Open Source Intelligence (OSINT) security tool that identifies cyber squatting threats targeting your brand or domains:

| Threat Type | Description |
| --- | --- |
| Phishing | Fraudulent domains mimicking your brand |
| Typosquatting | Domains with common typos (e.g., gooogle.com) |
| IDN Homograph | Look-alike characters from other alphabets |
| Doppelgänger | Domains containing your brand name |
| Bitsquatting | Single-bit errors in domain names |
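
To make the bitsquatting row concrete, the sketch below enumerates the labels that differ from a brand name by exactly one flipped bit in one character. It is an illustration only, not openSquat's detection code; the function name `bitsquat_variants` is hypothetical, and the input is assumed to be a plain ASCII label.

```python
import string

# Characters allowed in a DNS label: letters, digits, and the hyphen.
VALID = set(string.ascii_lowercase + string.digits + "-")


def bitsquat_variants(label: str) -> set[str]:
    """Return labels that differ from `label` by one flipped bit in one character.

    Hypothetical helper for illustration; assumes an ASCII label.
    """
    label = label.lower()
    variants = set()
    for i, ch in enumerate(label):
        for bit in range(8):
            flipped = chr(ord(ch) ^ (1 << bit)).lower()
            if flipped not in VALID or flipped == ch:
                continue
            candidate = label[:i] + flipped + label[i + 1:]
            # A label may not start or end with a hyphen.
            if candidate.startswith("-") or candidate.endswith("-"):
                continue
            variants.add(candidate)
    return variants


variants = bitsquat_variants("google")
print(len(variants), "single-bit variants of 'google'")
```

For example, `g` (0x67) and `c` (0x63) differ in a single bit, so `coogle` is one of the variants produced for `google`.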

Featured In

"A powerful swiss army knife for brand protection" - WhoisXML API Blog, August 2022

"A tool with insane power to fight typosquatting and all related types of cyber mischief." - WhoisXML API Blog, August 2022

"A handy tool for collecting information on newly registered domains." - SOCRadar Blog, July 2022 (ranked a Top 5 phishing detection tool)

"openSquat provides essential protection against domain squatting and phishing attacks through automated monitoring and detection." - Prince Yadav, TutorialsPoint, March 2026

Academic Citation

"OpenSquat identified 103 squatting domains, 960 active phishing websites, and 53 domains with suspicious certificates." - Sharma et al., Journal of Information Security and Cybercrimes Research (JISCR), Vol. 7, Issue 1, June 2024


Open-Core Model

openSquat follows an open-core model:

  • Core detection engine: open source and community-driven
  • Advanced capabilities: delivered through commercial intelligence services

This model enables transparency and community collaboration while supporting the scale, reliability, and operational requirements of enterprise use.


Key Features

  • Daily NRD feeds: automatic updates of newly registered domains (NRDs)
  • Similarity detection: Levenshtein distance algorithm
  • Three operating modes: Community (free feed), Premium Feed (paid feed, same local pipeline), or Premium API (hosted lookalike service). The two Premium modes share a single openSquat API key; see Premium and API Modes.
  • VirusTotal integration: check domain reputation
  • Quad9 DNS validation: identify malicious domains
  • Certificate Transparency: monitor SSL/TLS certificates
  • Multiple output formats: TXT, JSON, CSV
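
As a rough illustration of the similarity-detection feature, here is a minimal Levenshtein distance sketch in Python. It is the textbook dynamic-programming algorithm, not openSquat's own code; the helper `is_lookalike` and its `max_distance` threshold are hypothetical and do not reflect how the confidence levels are actually tuned.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def is_lookalike(keyword: str, label: str, max_distance: int = 1) -> bool:
    """Hypothetical check: close to the keyword, but not identical to it."""
    return 0 < levenshtein(keyword, label) <= max_distance


print(levenshtein("google", "gooogle"))       # one inserted 'o'
print(is_lookalike("microsoft", "mirosoft"))  # one deleted 'c'
```

A lower threshold yields fewer, more precise matches; a higher one yields more matches and more false positives, which is the same trade-off the confidence levels describe.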

Quick Start

Install via pip (recommended)

pip install opensquat
opensquat -k keywords.txt

Or clone the repository

git clone https://github.com/atenreiro/opensquat
cd opensquat
pip install -r requirements.txt
python3 opensquat.py -k keywords.txt

Repo users: in all the examples below, replace opensquat with python3 opensquat.py to run from a cloned checkout.


Requirements

  • Python 3.10+
  • Dependencies: confusable_homoglyphs, homoglyphs, colorama, requests, dnspython, beautifulsoup4

Usage

Basic Commands

# Default run
opensquat

# Show all options
opensquat -h

# Use custom keywords file
opensquat -k my_keywords.txt

Validation Options

# DNS validation via Quad9
opensquat --dns

# Check Certificate Transparency logs
opensquat --ct

# Scan for open ports (80/443)
opensquat --portcheck

# Cross-reference phishing databases
opensquat --phishing results.txt

Output Formats

# Save as JSON
opensquat -o results.json -t json

# Save as CSV
opensquat -o results.csv -t csv

Confidence Levels

| Level | Flag | Description |
| --- | --- | --- |
| 0 | -c 0 | Very high (fewer results, high accuracy) |
| 1 | -c 1 | High (default) |
| 2 | -c 2 | Medium |
| 3 | -c 3 | Low |
| 4 | -c 4 | Very low (more results, more false positives) |

Note: On the API side (--api), the five confidence levels map to four fuzziness values (exact, low, auto, high); -c 3 and -c 4 both map to high. See Premium and API Modes for the full mapping and how to override with --api-fuzziness.


Premium and API Modes

openSquat supports three modes. The default (Community) is unchanged, so existing users need no flags. The two Premium modes share a single openSquat API key; pick Premium Feed if you want the same local detection pipeline with a larger feed, or Premium API if you want server-side detection with no local feed download.

| Mode | Flag | What it does |
| --- | --- | --- |
| Community (default) | (none) | Downloads the free NRD feed (~100k domains/day) and runs local Levenshtein detection. |
| Premium Feed | --premium | Downloads the paid NRD feed (nrd-lite, much larger) using your openSquat API key, then runs the same local Levenshtein detection. |
| Premium API | --api | Skips local feed download. Queries the openSquat lookalike REST API per keyword and returns server-side matches. |

Get an API key

Sign up at opensquat.com to get a key. The same key works for both Premium Feed (--premium) and Premium API (--api).

Provide the API key (priority order)

  1. --api-key YOUR_KEY on the command line
  2. OPENSQUAT_API_KEY environment variable
  3. api_key.txt in the current directory (one key per file, # comments allowed)

The CLI flag is visible in ps output. Prefer the env var or key file in shared environments.

Examples

# Premium Feed mode: same local pipeline, larger feed
export OPENSQUAT_API_KEY=os_xxxxxxxxxxxx
opensquat -k keywords.txt --premium

# Premium API mode: server-side detection per keyword
opensquat -k keywords.txt --api

# Premium API + DNS reputation check on each returned domain
opensquat -k keywords.txt --api --dns

# Premium API with JSON output grouped by keyword
opensquat -k keywords.txt --api -t json -o results.json

# Tune the Premium API search
opensquat -k keywords.txt --api --api-fuzziness high --api-history-days 7 --api-max-results 200

When --premium or --api successfully loads a key, the CLI prints a masked confirmation line so you can verify which key was picked up without leaking it:

[*] API key loaded: os_gL...L5Mb
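
A masking helper of this kind could look like the sketch below. The prefix and suffix lengths (5 and 4) are inferred from the sample line above and are assumptions, as is the function name `mask_key`; the real CLI may mask differently.

```python
def mask_key(key: str, head: int = 5, tail: int = 4) -> str:
    """Show only the first `head` and last `tail` characters of a key."""
    if len(key) <= head + tail:
        # Too short to mask safely: hide everything.
        return "*" * len(key)
    return f"{key[:head]}...{key[-tail:]}"


print(f"[*] API key loaded: {mask_key('os_gLxxxxxxxxxxxxL5Mb')}")
```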

In Premium API mode, the run summary reports the active mode, the number of API calls made, and your remaining balance with usage delta (for example, 4972 (used 4 of 4976 this run)). Per-keyword progress lines appear in the same order as your keywords file even though the calls run in parallel. Quota exhaustion (HTTP 429) returns partial results gracefully; auth errors (401) and plan errors (403) abort with a clear message.

If the backend rate-limits your request (HTTP 429 with a Retry-After header), the tool distinguishes it from quota exhaustion: you'll see a yellow [!] Rate limit hit (retry in Ns) warning instead of the red quota exhausted message, partial results are still returned, and the summary preserves your real API balance so you can see exactly how many credits you actually used. To avoid triggering rate limits on large scans, pass --api-rate-limit N to cap outbound requests per second across all workers. A value of 8 is a safe starting point for most backends.

# Throttle to 8 requests/second across all workers
opensquat -k keywords.txt --api --api-rate-limit 8
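
To illustrate what a cap on requests per second across all workers means, here is a minimal thread-safe limiter sketch. It is not the implementation behind --api-rate-limit; the class name `RateLimiter` and the even-spacing strategy are assumptions chosen for clarity.

```python
import threading
import time


class RateLimiter:
    """Space out calls so that at most `rate` of them start per second."""

    def __init__(self, rate: float, clock=time.monotonic, sleep=time.sleep):
        if rate <= 0:
            raise ValueError("rate must be positive")
        self._interval = 1.0 / rate
        self._clock = clock
        self._sleep = sleep
        self._lock = threading.Lock()
        self._next_slot = clock()

    def acquire(self) -> None:
        # Reserve the next free time slot under the lock, then wait
        # outside the lock so other workers can reserve theirs.
        with self._lock:
            now = self._clock()
            slot = max(self._next_slot, now)
            self._next_slot = slot + self._interval
        delay = slot - now
        if delay > 0:
            self._sleep(delay)


limiter = RateLimiter(rate=8)  # shared by every worker thread
limiter.acquire()              # call before each outbound request
```

Each worker calls `acquire()` on the shared limiter before sending a request, so the combined request rate stays at or below the cap regardless of how many workers run in parallel.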

Output format recommendation

JSON is the recommended output format for Premium API mode because the API returns per-domain metadata that the other formats cannot carry as cleanly: the registered TLD, the NRD first-seen date, an IDN homograph flag, and the unicode rendering of the homograph when the domain is one.

opensquat -k keywords.txt --api -t json -o results.json

Example of the richer output in Premium API mode (trimmed):

[
  {
    "keyword": "microsoft",
    "domains": [
      {"domain": "securite-microsoft.fr", "tld": "fr", "date": "09-04-2026", "idn": false},
      {"domain": "xn--mirosoft-hw7c.com", "tld": "com", "date": "09-04-2026", "idn": true, "unicode": "miᴄrosoft.com"}
    ]
  }
]

The idn flag plus the unicode rendering let you see at a glance that xn--mirosoft-hw7c.com actually uses ᴄ (Latin Letter Small Capital C) to impersonate the c in "microsoft", information that a plain punycode string completely hides.
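
A downstream script can pull out just the homograph hits from this output. The sketch below parses the trimmed sample shown above; the function name `idn_hits` is hypothetical, and the field names are taken from the example rather than from a published schema.

```python
import json

# The trimmed sample from above, with the homograph written as a JSON escape.
SAMPLE = r"""
[
  {
    "keyword": "microsoft",
    "domains": [
      {"domain": "securite-microsoft.fr", "tld": "fr", "date": "09-04-2026", "idn": false},
      {"domain": "xn--mirosoft-hw7c.com", "tld": "com", "date": "09-04-2026", "idn": true, "unicode": "mi\u1d04rosoft.com"}
    ]
  }
]
"""


def idn_hits(results):
    """Yield (keyword, punycode domain, unicode form) for IDN-flagged entries."""
    for group in results:
        for entry in group.get("domains", []):
            if entry.get("idn"):
                yield group["keyword"], entry["domain"], entry.get("unicode", "")


hits = list(idn_hits(json.loads(SAMPLE)))
for keyword, punycode, rendered in hits:
    print(f"{keyword}: {punycode} renders as {rendered}")
```

With a real run, replace `json.loads(SAMPLE)` with `json.load(...)` on the opened results.json file.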

CSV output is also supported and produces one row per domain with the same metadata columns, which suits analysts working in Excel or pandas:

opensquat -k keywords.txt --api -t csv -o results.csv

The CSV is written with a UTF-8 BOM so Excel on Windows correctly renders the unicode homograph column.
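
When reading that CSV from Python, the BOM needs to be consumed by the decoder, otherwise it ends up glued to the first column name. The sketch below shows the difference using in-memory data; the column names are hypothetical, since the exact CSV header is not shown here.

```python
import csv
import io

# Hypothetical CSV bytes with a UTF-8 BOM; real column names may differ.
RAW = (
    "\ufeffkeyword,domain,unicode\r\n"
    "microsoft,xn--mirosoft-hw7c.com,mi\u1d04rosoft.com\r\n"
).encode("utf-8")


def read_rows(data: bytes, encoding: str) -> list[dict]:
    """Decode CSV bytes with the given encoding and return rows as dicts."""
    text = io.StringIO(data.decode(encoding), newline="")
    return list(csv.DictReader(text))


clean = read_rows(RAW, "utf-8-sig")  # BOM stripped by the codec
dirty = read_rows(RAW, "utf-8")      # BOM kept in the first header name

print(list(clean[0].keys()))
print(list(dirty[0].keys()))
```

For a file on disk, the equivalent is `open("results.csv", encoding="utf-8-sig", newline="")`.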

Community and Premium Feed modes emit the same JSON top-level shape for cross-mode consistency, but with only the domain field populated per entry, because the NRD feed does not carry the per-domain metadata that only the hosted API has:

[
  {
    "keyword": "microsoft",
    "domains": [
      {"domain": "mirosoft.com"},
      {"domain": "mcrosoft.net"}
    ]
  }
]
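
Because the top-level shape is shared, one consumer can handle output from any mode by treating the metadata fields as optional. The following is a minimal sketch, assuming the field names shown in the examples; `flatten` is a hypothetical helper.

```python
import json

COMMUNITY_SAMPLE = """
[
  {
    "keyword": "microsoft",
    "domains": [
      {"domain": "mirosoft.com"},
      {"domain": "mcrosoft.net"}
    ]
  }
]
"""

OPTIONAL_FIELDS = ("tld", "date", "idn", "unicode")


def flatten(results):
    """Return one flat dict per domain; missing metadata becomes None."""
    rows = []
    for group in results:
        for entry in group.get("domains", []):
            row = {"keyword": group.get("keyword"), "domain": entry["domain"]}
            for field in OPTIONAL_FIELDS:
                row[field] = entry.get(field)
            rows.append(row)
    return rows


rows = flatten(json.loads(COMMUNITY_SAMPLE))
print(rows[0])
```

Using None rather than False for a missing idn flag keeps "not an IDN" distinct from "unknown in this mode".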

If you pass --api-key without also selecting --premium or --api, the CLI prints a one-line hint that the key will be ignored in Community mode (no silent mode-switching).

In Premium API mode, -c/--confidence is auto-mapped to API fuzziness (0→exact, 1→low, 2→auto, 3→high, 4→high). Use --api-fuzziness to override.

Premium API (--api) is incompatible with --doppelganger and -d/--domains.
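
The flag rules in this section (the ignored-key hint and the two incompatibilities) can be sketched with argparse as follows. This is a stand-alone illustration that declares only the flags involved; the parser, the `check_args` helper, and the message wording are hypothetical and are not openSquat's real argument handling.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser declaring only the flags discussed here."""
    parser = argparse.ArgumentParser(prog="opensquat-sketch")
    parser.add_argument("--premium", action="store_true")
    parser.add_argument("--api", action="store_true")
    parser.add_argument("--api-key")
    parser.add_argument("--doppelganger", action="store_true")
    parser.add_argument("-d", "--domains")
    return parser


def check_args(args: argparse.Namespace) -> tuple[list[str], list[str]]:
    """Return (errors, hints) for the given flag combination."""
    errors, hints = [], []
    if args.api and args.doppelganger:
        errors.append("--api cannot be combined with --doppelganger")
    if args.api and args.domains:
        errors.append("--api cannot be combined with -d/--domains")
    if args.api_key and not (args.premium or args.api):
        hints.append("--api-key is ignored in Community mode")
    return errors, hints


args = build_parser().parse_args(["--api", "--doppelganger"])
errors, hints = check_args(args)
print(errors, hints)
```

A real CLI would typically pass each error to `parser.error(...)`, which prints the message and exits with a non-zero status.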


โš™๏ธ Configuration

Keywords File (keywords.txt)

# Lines starting with # are comments
mycompany
mybrand
myproduct

VirusTotal API Key (vt_key.txt)

To use --vt or --subdomains, add your API key:

# Get your free API key at https://www.virustotal.com
your_api_key_here

openSquat API Key (api_key.txt)

Required for --premium and --api. Create an api_key.txt file in the working directory:

# Get your key at https://opensquat.com
# Lines starting with # are ignored; the first non-comment line is used.
os_your_key_here

The CLI resolves the key in this order: --api-key flag → $OPENSQUAT_API_KEY environment variable → api_key.txt file. The env var and file methods are preferred over the CLI flag in shared environments, since CLI arguments are visible via ps.


Automation

Run daily via crontab:

# pip-installed (recommended): every day at 8 AM, feeds update ~7:30 AM UTC
0 8 * * * cd /path/to/workdir && opensquat -k keywords.txt -o results.json -t json

# Repo checkout: invoke opensquat.py directly with python3
0 8 * * * cd /path/to/opensquat && python3 opensquat.py -k keywords.txt -o results.json -t json

The cd into a working directory matters if you rely on api_key.txt (resolved from the current directory) or want results.json written to a specific place.


CLI Reference

| Argument | Default | Description |
| --- | --- | --- |
| -k, --keywords | keywords.txt | Keywords file to search |
| -o, --output | results.txt | Output filename |
| -t, --type | txt | Output format: txt, json, csv |
| -c, --confidence | 1 | Confidence level (0-4). In --api mode this is auto-mapped to fuzziness (-c 3 and -c 4 both → high). |
| -d, --domains | - | Use local domain file instead of downloading |
| -u, --url | opensquat feed | URL to download domain feed |
| --dns | - | Enable Quad9 DNS validation |
| --doppelganger | - | Doppelganger-only mode (keyword in domain + reachability check) |
| --ct | - | Search Certificate Transparency logs |
| --phishing | - | Cross-reference phishing database |
| --subdomains | - | Fetch subdomains via VirusTotal |
| --portcheck | - | Check for open ports 80/443 |
| --vt | - | Validate against VirusTotal |
| --premium | - | Premium Feed mode: use the paid NRD feed (requires openSquat API key) |
| --api | - | Premium API mode: query the openSquat lookalike REST API per keyword (no local feed) |
| --api-key | - | openSquat API key (or set $OPENSQUAT_API_KEY, or use api_key.txt) |
| --api-fuzziness | (from -c) | Premium API mode: exact, low, high, or auto |
| --api-history-days | - | Premium API mode: NRD history window in days (clipped to plan cap) |
| --api-max-results | - | Premium API mode: max results per keyword (clipped to plan cap) |
| --api-rate-limit | (unlimited) | Premium API mode: max outbound requests per second across all workers |

๐Ÿค Contributing

We welcome contributions! See our Contributing Guide for details.

  • ๐Ÿ› Report bugs via GitHub Issues
  • ๐Ÿ’ก Request features by opening an issue
  • ๐Ÿ”ง Submit PRs for bug fixes or enhancements
  • ๐Ÿ“ Release notes โ€” see the CHANGELOG for what's new in each version

Author

Andre Tenreiro (LinkedIn · PGP Key)


License

This project is licensed under the GNU GPL v3.
