domain-scout
Discover internet domains associated with a business entity using Certificate Transparency logs, RDAP, and DNS.
Useful for security teams, asset inventories, and M&A due diligence — where seed domains can be wrong, misspelled, or belong to a parent company.
Install
pip install domain-scout-ct # core library + CLI
pip install "domain-scout-ct[api]" # + REST API server
pip install "domain-scout-ct[cache]" # + DuckDB query cache
pip install "domain-scout-ct[all]" # everything (quotes keep zsh from globbing the brackets)
For development:
uv sync --all-groups --all-extras
Usage
CLI
# Basic usage
domain-scout --name "Guidewire Software" --location "San Mateo, CA"
# With seed domain
domain-scout --name "Palo Alto Networks" --location "Santa Clara, CA" --seed "paloaltonetworks.com"
# Multiple seeds — cross-verification boosts confidence for domains found by both
domain-scout --name "Walmart" --seed walmart.com --seed samsclub.com
# Deep mode — GeoDNS global resolution for non-resolving domains
domain-scout --name "Walmart" --seed "walmart.com" --deep
# JSON output
domain-scout --name "Acme Corp" --output json > results.json
# Verbose logging
domain-scout --name "Cloudflare" --seed "cloudflare.com" -v
REST API
# Start the API server (cache enabled by default)
domain-scout serve --port 8080
# Health check
curl http://localhost:8080/health
# Run a scan
curl -X POST http://localhost:8080/scan \
-H "Content-Type: application/json" \
-d '{"entity": {"company_name": "Walmart", "seed_domain": ["walmart.com"]}}'
# Readiness check (probes crt.sh connectivity)
curl http://localhost:8080/ready
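The same scan can be issued from Python. The following is a minimal sketch using only the standard library; it mirrors the payload from the curl example and assumes the server replies with a JSON body. The helper names `build_scan_request` and `run_scan` are illustrative and not part of domain-scout.

```python
import json
import urllib.request


def build_scan_request(
    base_url: str, company_name: str, seeds: list[str]
) -> urllib.request.Request:
    """Build a POST /scan request with the same payload as the curl example."""
    payload = {"entity": {"company_name": company_name, "seed_domain": seeds}}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/scan",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_scan(
    base_url: str, company_name: str, seeds: list[str], timeout: float = 120.0
) -> dict:
    """Send the request and decode the response (assumed to be JSON)."""
    request = build_scan_request(base_url, company_name, seeds)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Example, with a server started via `domain-scout serve --port 8080`:
# result = run_scan("http://localhost:8080", "Walmart", ["walmart.com"])
```

The timeout value is a guess; scans that query several external services may need a longer one.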
Docker
# Build
docker build -t domain-scout-ct .
# Run API server
docker run -p 8080:8080 domain-scout-ct
# Run CLI scan
docker run domain-scout-ct scout --name "Walmart" --seed walmart.com
# Persist cache across runs
docker run -p 8080:8080 -v scout-cache:/data/cache domain-scout-ct
Cache
# Enable cache for CLI scans
domain-scout scout --name "Walmart" --seed walmart.com --cache
# View cache statistics
domain-scout cache stats
# Clear cache
domain-scout cache clear
Library
from domain_scout import Scout

result = Scout().discover(
    company_name="Palo Alto Networks",
    location="Santa Clara, CA",
    seed_domain=["paloaltonetworks.com"],
)

for domain in result.domains:
    print(f"{domain.domain:40s} {domain.confidence:.2f} {domain.sources}")
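A common follow-up is to keep only the domains above a confidence cutoff. The sketch below works on any objects exposing the `domain`, `confidence`, and `sources` attributes printed above. `FoundDomain` is a hypothetical stand-in used so the example runs without the library, and the 0.8 threshold is an arbitrary choice, not a recommended value.

```python
from dataclasses import dataclass, field


@dataclass
class FoundDomain:
    """Hypothetical stand-in with the attributes used in the loop above."""

    domain: str
    confidence: float
    sources: list[str] = field(default_factory=list)  # element type is assumed


def high_confidence(domains, threshold: float = 0.8):
    """Return entries at or above the threshold, highest confidence first."""
    kept = [d for d in domains if d.confidence >= threshold]
    return sorted(kept, key=lambda d: d.confidence, reverse=True)


sample = [
    FoundDomain("example.com", 0.95, ["ct_org", "rdap"]),
    FoundDomain("example.net", 0.40, ["dns"]),
    FoundDomain("example.org", 0.85, ["san"]),
]
for entry in high_confidence(sample):
    print(f"{entry.domain:40s} {entry.confidence:.2f}")
```

With a real scan, `high_confidence(result.domains)` should work the same way, provided the result objects expose those attributes.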
Async
import asyncio

from domain_scout import Scout, EntityInput


async def main():
    scout = Scout()
    result = await scout.discover_async(EntityInput(
        company_name="Palo Alto Networks",
        seed_domain=["paloaltonetworks.com"],
    ))
    return result


result = asyncio.run(main())
How it works
- Seed validation — DNS-resolves the seed domain, checks RDAP registrant org and CT cert org names against the company name
- CT org search — Queries crt.sh Postgres for certificates where the Subject Organization matches the company name
- Seed expansion — Finds all SANs on certs covering the seed domain, revealing related domains (e.g., acquired companies)
- Domain guessing — Generates candidates from the company name + common TLDs, resolves them, verifies via CT
- Cross-seed verification — With multiple seeds, domains found independently by 2+ seeds get a confidence boost
- RDAP corroboration — Queries RDAP registrant org on top discovered domains, confirming ownership matches the target company
- Confidence scoring — Corroboration-level model scores each domain 0–1 based on the combination of evidence: CT org match, SAN co-occurrence, DNS resolution, RDAP registrant match, cross-seed verification, and shared infrastructure
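To make the scoring step concrete, here is a minimal sketch of a corroboration-style score. It is not domain-scout's actual model: the `Evidence` class, the split into strong and supporting signals, and every number are assumptions chosen for illustration. The idea shown is that the base score depends on how many independent ownership signals agree, while weaker signals only adjust it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    """Hypothetical evidence flags, named after the signals listed above."""

    ct_org_match: bool = False
    san_cooccurrence: bool = False
    dns_resolves: bool = False
    rdap_match: bool = False
    cross_seed: bool = False
    shared_infra: bool = False


# Assumed base score, in hundredths, by number of strong signals that agree.
BASE_POINTS = {0: 10, 1: 50, 2: 75, 3: 90}


def score(evidence: Evidence) -> float:
    """Map a combination of evidence to a confidence between 0 and 1."""
    strong = sum(
        [evidence.ct_org_match, evidence.san_cooccurrence, evidence.rdap_match]
    )
    points = BASE_POINTS[strong]
    # Supporting signals nudge the score but cannot establish ownership alone.
    if evidence.cross_seed:
        points += 5
    if evidence.shared_infra:
        points += 3
    if not evidence.dns_resolves:
        points -= 5
    return min(100, max(0, points)) / 100
```

Working in integer points and dividing once at the end avoids accumulating floating-point error across the adjustments.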
Data sources
| Source | Method | Rate limit |
|---|---|---|
| crt.sh | Postgres (primary), JSON API (fallback) | 5 concurrent queries, 1s burst delay |
| RDAP | rdap.org universal bootstrap | Per-request |
| DNS | dnspython (8.8.8.8, 1.1.1.1) | 5 concurrent |
| Shodan GeoDNS | geonet.shodan.io (deep mode) | 3 concurrent, 0.5s delay |
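Limits such as "5 concurrent queries, 1s burst delay" can be enforced with a semaphore plus a pause. The sketch below shows one plausible reading, in which each slot is held for a short delay after its call finishes. `ThrottledRunner` is a hypothetical helper; this is not how domain-scout necessarily implements its limits.

```python
import asyncio
from collections.abc import Awaitable, Callable
from typing import TypeVar

T = TypeVar("T")


class ThrottledRunner:
    """Cap in-flight calls and pause before releasing each slot."""

    def __init__(self, max_concurrent: int, delay_seconds: float) -> None:
        self._semaphore = asyncio.Semaphore(max_concurrent)
        self._delay = delay_seconds

    async def run(self, task: Callable[[], Awaitable[T]]) -> T:
        """Run one call under the concurrency cap."""
        async with self._semaphore:
            try:
                return await task()
            finally:
                # Hold the slot a little longer so calls do not arrive in bursts.
                await asyncio.sleep(self._delay)


async def demo() -> list[int]:
    # Create the runner inside the running event loop.
    runner = ThrottledRunner(max_concurrent=5, delay_seconds=0.01)

    async def fake_query(n: int) -> int:
        await asyncio.sleep(0.01)  # stands in for a network round trip
        return n * 2

    jobs = (runner.run(lambda n=n: fake_query(n)) for n in range(10))
    return await asyncio.gather(*jobs)
```

Running `asyncio.run(demo())` executes ten fake queries with at most five in flight at any moment.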
Development
make install # uv sync --all-groups
make test # unit tests (mocked external calls)
make lint # ruff + mypy
make format # ruff --fix + ruff format
make check # format + lint + test
Integration tests query the live crt.sh service:
make test-integration
License
MIT