domain-scout

Discover internet domains associated with a business entity using Certificate Transparency logs, RDAP, and DNS.

Documentation | PyPI | Changelog | CTScout API

Useful for security teams, asset inventories, and M&A due diligence — where seed domains can be wrong, misspelled, or belong to a parent company.

Install

pip install domain-scout-ct              # core library + CLI
pip install "domain-scout-ct[api]"       # + REST API server
pip install "domain-scout-ct[cache]"     # + DuckDB query cache
pip install "domain-scout-ct[all]"       # everything (quotes keep shells such as zsh from expanding the brackets)

For development:

uv sync --all-groups --all-extras

Usage

CLI

# Quickest start — use CTScout API (free key from https://ctscout.dev)
export CTSCOUT_API_KEY=ds_free_...
domain-scout --name "Goldman Sachs"

# Or pass the key directly
domain-scout --name "Goldman Sachs" --api-key ds_free_...

# Basic usage (queries crt.sh directly, no API key needed)
domain-scout --name "Cloudflare" --location "San Francisco, CA"

# With seed domain
domain-scout --name "Palo Alto Networks" --location "Santa Clara, CA" --seed "paloaltonetworks.com"

# Multiple seeds — cross-verification boosts confidence for domains found by both
domain-scout --name "Walmart" --seed walmart.com --seed samsclub.com

# Deep mode — GeoDNS global resolution for non-resolving domains
domain-scout --name "Walmart" --seed "walmart.com" --deep

# JSON output
domain-scout --name "Acme Corp" --output json > results.json

# Verbose logging
domain-scout --name "Cloudflare" --seed "cloudflare.com" -v

REST API

# Start the API server (cache enabled by default)
domain-scout serve --port 8080

# Health check
curl http://localhost:8080/health

# Run a scan
curl -X POST http://localhost:8080/scan \
  -H "Content-Type: application/json" \
  -d '{"entity": {"company_name": "Walmart", "seed_domain": ["walmart.com"]}}'

# Readiness check (probes crt.sh connectivity)
curl http://localhost:8080/ready
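
The same scan request can be issued from Python. The following is a minimal sketch using only the standard library. It assumes the server started above is listening on localhost:8080 and that /scan answers with a JSON body. The response schema is not documented here, so the decoded JSON is returned untouched. `build_scan_request` and `run_scan` are illustrative names, not part of domain-scout.

```python
import json
import urllib.request


def build_scan_request(company_name, seed_domains, base_url="http://localhost:8080"):
    """Build the POST /scan request shown in the curl example."""
    payload = {
        "entity": {
            "company_name": company_name,
            "seed_domain": list(seed_domains),
        }
    }
    return urllib.request.Request(
        url=f"{base_url}/scan",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_scan(company_name, seed_domains, base_url="http://localhost:8080", timeout=300):
    """Send the request and decode the response (assumed to be JSON)."""
    request = build_scan_request(company_name, seed_domains, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Usage, with the API server running:
#   result = run_scan("Walmart", ["walmart.com"])
```

The timeout of 300 seconds is an arbitrary choice for this sketch; scans query several external services and may take a while.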

Docker

# Build
docker build -t domain-scout-ct .

# Run API server
docker run -p 8080:8080 domain-scout-ct

# Run CLI scan
docker run domain-scout-ct scout --name "Walmart" --seed walmart.com

# Persist cache across runs
docker run -p 8080:8080 -v scout-cache:/data/cache domain-scout-ct

Cache

# Enable cache for CLI scans
domain-scout scout --name "Walmart" --seed walmart.com --cache

# View cache statistics
domain-scout cache stats

# Clear cache
domain-scout cache clear

Library

from domain_scout import Scout

result = Scout().discover(
    company_name="Palo Alto Networks",
    location="Santa Clara, CA",
    seed_domain=["paloaltonetworks.com"],
)

for domain in result.domains:
    print(f"{domain.domain:40s}  {domain.confidence:.2f}  {domain.sources}")
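
Results are often filtered by confidence before they go into an inventory. The sketch below relies only on the `domain` and `confidence` attributes printed in the loop above. The 0.8 threshold, the helper name `high_confidence`, and the sample records (including their source labels) are assumptions made for illustration.

```python
from types import SimpleNamespace


def high_confidence(domains, threshold=0.8):
    """Return records with confidence >= threshold, highest first."""
    kept = [d for d in domains if d.confidence >= threshold]
    return sorted(kept, key=lambda d: d.confidence, reverse=True)


# Stand-in records with the same attributes as the entries of result.domains;
# the values and source labels are made up.
sample = [
    SimpleNamespace(domain="a.example", confidence=0.85, sources=["hypothetical_source"]),
    SimpleNamespace(domain="b.example", confidence=0.40, sources=["hypothetical_source"]),
    SimpleNamespace(domain="c.example", confidence=0.95, sources=["hypothetical_source"]),
]

for record in high_confidence(sample):
    print(f"{record.domain:40s}  {record.confidence:.2f}")
```

With a real scan, pass `result.domains` in place of `sample`.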

Async

import asyncio
from domain_scout import Scout, EntityInput

async def main():
    scout = Scout()
    result = await scout.discover_async(EntityInput(
        company_name="Palo Alto Networks",
        seed_domain=["paloaltonetworks.com"],
    ))
    return result

result = asyncio.run(main())

How it works

  1. Seed validation — DNS-resolves the seed domain, checks RDAP registrant org and CT cert org names against the company name
  2. CT org search — Queries crt.sh Postgres for certificates where the Subject Organization matches the company name
  3. Seed expansion — Finds all SANs on certs covering the seed domain, revealing related domains (e.g., acquired companies)
  4. Domain guessing — Generates candidates from the company name + common TLDs, resolves them, verifies via CT
  5. Cross-seed verification — With multiple seeds, domains found independently by 2+ seeds get a confidence boost
  6. RDAP corroboration — Queries RDAP registrant org on top discovered domains, confirming ownership matches the target company
  7. Confidence scoring — Corroboration-level model scores each domain 0–1 based on the combination of evidence: CT org match, SAN co-occurrence, DNS resolution, RDAP registrant match, cross-seed verification, and shared infrastructure
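
To make steps 5 and 7 concrete, here is a toy sketch of cross-seed verification and evidence-based scoring. It is not domain-scout's scoring model: the weights are invented, and a noisy-OR combination stands in for the corroboration-level model described above. The function names and evidence labels are hypothetical.

```python
from collections import Counter
from math import prod

# Hypothetical weights, chosen only for illustration.
EVIDENCE_WEIGHTS = {
    "ct_org_match": 0.6,
    "san_cooccurrence": 0.5,
    "dns_resolves": 0.2,
    "rdap_registrant_match": 0.6,
    "cross_seed": 0.4,
    "shared_infrastructure": 0.2,
}


def cross_seed_verified(found_by_seed, min_seeds=2):
    """Return domains discovered independently from at least min_seeds seeds."""
    counts = Counter(
        domain
        for domains in found_by_seed.values()
        for domain in set(domains)  # count each domain once per seed
    )
    return {domain for domain, n in counts.items() if n >= min_seeds}


def confidence(evidence):
    """Combine evidence with a noisy-OR: 1 - product(1 - weight)."""
    return 1.0 - prod(1.0 - EVIDENCE_WEIGHTS[name] for name in set(evidence))


found_by_seed = {
    "seed-a.example": {"shared.example", "only-a.example"},
    "seed-b.example": {"shared.example", "only-b.example"},
}
boosted = cross_seed_verified(found_by_seed)

for domain in sorted(set().union(*found_by_seed.values())):
    evidence = ["dns_resolves"] + (["cross_seed"] if domain in boosted else [])
    print(f"{domain:20s} {confidence(evidence):.2f}")
```

The noisy-OR keeps every score between 0 and 1 and lets each additional piece of evidence raise the score without any single weak signal dominating it.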

Data sources

| Source | Method | Rate limit |
|---|---|---|
| crt.sh | Postgres (primary), JSON API (fallback) | 5 concurrent queries, 1s burst delay |
| RDAP | rdap.org universal bootstrap | Per-request |
| DNS | dnspython (8.8.8.8, 1.1.1.1) | 5 concurrent |
| Shodan GeoDNS | geonet.shodan.io (deep mode) | 3 concurrent, 0.5s delay |
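
Concurrency caps like the ones listed for these sources are commonly enforced with a semaphore. The sketch below shows that general pattern with `asyncio`; it is not taken from domain-scout's code. `run_limited` is a hypothetical helper, and holding the slot for `delay` seconds after each call is only one possible reading of "burst delay".

```python
import asyncio


async def run_limited(jobs, max_concurrent=5, delay=0.0):
    """Run zero-argument coroutine functions with a cap on those in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(job):
        async with semaphore:
            result = await job()
            if delay:
                # Keep the slot occupied briefly before the next job may start.
                await asyncio.sleep(delay)
            return result

    # gather preserves the order of the input jobs in its result list.
    return await asyncio.gather(*(guarded(job) for job in jobs))


async def fake_query(name):
    """Stand-in for a network call such as a crt.sh or DNS lookup."""
    await asyncio.sleep(0.01)
    return f"looked up {name}"


names = [f"host{i}.example" for i in range(8)]
results = asyncio.run(
    run_limited([lambda n=n: fake_query(n) for n in names], max_concurrent=5)
)
print(results[0])
```

Passing `max_concurrent=3, delay=0.5` would mimic the limits listed for the GeoDNS source under the same assumptions.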

Development

make install      # uv sync --all-groups
make test         # unit tests (mocked external calls)
make lint         # ruff + mypy
make format       # ruff --fix + ruff format
make check        # format + lint + test

Integration tests query the live crt.sh service, so they need network access:

make test-integration

License

MIT
