Enterprise Domain Mapper

Map enterprise corporate structures to enrichable domains. Feed it company names, get back subsidiaries, acquisitions, and regional domains that your enrichment tools are missing.

Every sales team doing enterprise ABM hits the same wall: large companies have dozens of subsidiaries, acquisitions, and regional entities, each with their own email domain. Without a complete domain map, tools like Clay and Apollo only find contacts at the parent domain. Entire business units get missed.

This tool fixes that.

$ domain-mapper "HSBC"

HSBC
├── HSBC Bank USA             us.hsbc.com       [SEC EDGAR]
├── First Direct              firstdirect.com   [Wikipedia]
├── HSBC Continental Europe   hsbc.fr           [Wikipedia]
├── HSBC UK                   hsbc.co.uk        [TLD guess ✓ DNS verified]
├── HSBC Italy                hsbc.it           [TLD guess ✓ DNS verified]
└── HSBC Hong Kong            hsbc.com.hk       [TLD guess ✓ DNS verified]

Found 12 subsidiaries, 18 domains (6 confirmed, 12 guessed, 9 DNS verified)

The problem

Enterprise accounts don't operate under a single domain. A company like HSBC runs distinctly branded subsidiaries such as First Direct alongside regional banks in dozens of countries, each with its own domain. Deloitte has member firms. Nestlé sits over hundreds of consumer brands that use completely different domains.

If you're running enrichment against just hsbc.com, you're finding maybe 30% of the contacts you could be reaching. The rest are hiding behind firstdirect.com, hsbc.co.uk, hsbc.com.hk, and domains you didn't know existed.

Building these domain maps manually takes hours per account. We built this tool because we got tired of doing it by hand.

Quick start

Installation

pip install enterprise-domain-mapper

Or clone and install locally:

git clone https://github.com/gtmlayer/enterprise-domain-mapper.git
cd enterprise-domain-mapper
pip install -e .

Single company lookup

domain-mapper "Boeing"

Batch mode (CSV input)

domain-mapper accounts.csv --output results.csv

Your input CSV just needs a column with company names. The tool auto-detects columns named company_name, company, name, or account. If you have a domain column (domain, website, url), it'll use that as the parent domain for TLD guessing.
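
As a rough illustration of the column auto-detection described above, here is a minimal sketch. The column names come from this README; the function names, the case-insensitive matching, and the lookup order are assumptions made for the example, not the tool's actual code.

```python
import csv
import io

# Column names listed in the README; the priority order is an assumption.
COMPANY_COLUMNS = ("company_name", "company", "name", "account")
DOMAIN_COLUMNS = ("domain", "website", "url")


def detect_column(fieldnames, candidates):
    """Return the first header matching a candidate name, or None."""
    # Case-insensitive matching is an assumption for this sketch.
    normalised = {name.strip().lower(): name for name in fieldnames}
    for candidate in candidates:
        if candidate in normalised:
            return normalised[candidate]
    return None


def read_accounts(handle):
    """Yield (company, parent_domain_or_None) pairs from an open CSV file."""
    reader = csv.DictReader(handle)
    fieldnames = reader.fieldnames or []
    company_col = detect_column(fieldnames, COMPANY_COLUMNS)
    if company_col is None:
        raise ValueError("no company name column found")
    domain_col = detect_column(fieldnames, DOMAIN_COLUMNS)
    for row in reader:
        domain = row[domain_col].strip() if domain_col else ""
        yield row[company_col].strip(), domain or None


# In-memory CSV so the sketch runs without a file on disk.
sample = io.StringIO("Account,Website\nHSBC,hsbc.com\nBoeing,\n")
accounts = list(read_accounts(sample))
```

A row with an empty domain cell yields None for the parent domain, so a caller can tell "no parent domain supplied" apart from a real value.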

With DNS verification

domain-mapper accounts.csv --output results.csv --verify-dns

This checks whether guessed domains actually have mail infrastructure (MX records) or at minimum resolve (A records). It adds a few seconds per company but filters out guesses that don't resolve.

What it does

The tool combines two data sources, a domain pattern generator, and an optional verification layer to build comprehensive domain maps:

1. SEC EDGAR Exhibit 21 scraper

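For US-listed companies, annual 10-K filings include Exhibit 21: a legally required list of the company's significant subsidiaries. The tool looks up the company's CIK (the SEC's Central Index Key), finds the latest 10-K filing, and parses the subsidiary list with jurisdictions.

This is the highest-quality source - it's legally mandated disclosure, so it's authoritative and current as of the latest annual filing. Filers may omit minor subsidiaries, so the list is not always exhaustive.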
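
The "find the latest 10-K" step can be sketched as follows. This assumes the SEC's public submissions endpoint on data.sec.gov and its parallel-array layout under filings.recent; whether the tool uses this exact endpoint is not stated here. The function names, the placeholder CIK, and the sample payload are invented for illustration.

```python
def submissions_url(cik):
    """Build the EDGAR submissions URL; CIKs are zero-padded to 10 digits."""
    return f"https://data.sec.gov/submissions/CIK{int(cik):010d}.json"


def latest_10k(submissions):
    """Return (accession_number, primary_document, filing_date) of the newest
    10-K in a submissions payload, or None if there is no 10-K."""
    recent = submissions["filings"]["recent"]
    rows = zip(
        recent["form"],
        recent["accessionNumber"],
        recent["primaryDocument"],
        recent["filingDate"],
    )
    ten_ks = [(date, acc, doc) for form, acc, doc, date in rows if form == "10-K"]
    if not ten_ks:
        return None
    date, acc, doc = max(ten_ks)  # ISO dates sort chronologically as strings
    return acc, doc, date


# Made-up payload in the assumed shape; not real filing data.
sample = {
    "filings": {
        "recent": {
            "form": ["8-K", "10-K", "10-Q", "10-K"],
            "accessionNumber": [
                "0000000000-24-000004",
                "0000000000-24-000003",
                "0000000000-23-000002",
                "0000000000-23-000001",
            ],
            "primaryDocument": ["a.htm", "b.htm", "c.htm", "d.htm"],
            "filingDate": ["2024-05-01", "2024-02-01", "2023-11-01", "2023-02-01"],
        }
    }
}

url = submissions_url(1234)  # 1234 is a placeholder CIK
newest = latest_10k(sample)
```

When fetching the real payload, note that the SEC asks automated clients to send a descriptive User-Agent header. Exhibit 21 itself is a separate document within the filing, so a further step is needed to locate and parse it.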

2. Wikipedia corporate structure parser

For non-US companies (or supplementary data), the tool searches Wikipedia for the company page and extracts subsidiary and acquisition data from infoboxes and structured sections.

Covers companies globally, though data depth varies by how well-maintained the Wikipedia page is.
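
To give a feel for infobox extraction, here is a small sketch that works on raw wikitext rather than rendered HTML. That is a deliberate simplification for the example: the tool itself lists beautifulsoup4 in its stack and may well parse the rendered page instead. The `subsid` parameter name, the helper names, and the sample wikitext (a fictional company) are all assumptions.

```python
import re

# Matches [[Target]] or [[Target|Label]] and captures the target article title.
WIKILINK = re.compile(r"\[\[([^\[\]|]+)(?:\|[^\[\]]*)?\]\]")


def infobox_field(wikitext, field):
    """Return the raw value of one infobox parameter, or '' if absent.

    Only handles the simple layout where each parameter starts its own line.
    """
    collected = []
    inside = False
    for line in wikitext.splitlines():
        stripped = line.strip()
        if stripped.startswith("|"):
            key, _, value = stripped[1:].partition("=")
            inside = key.strip() == field
            if inside:
                collected.append(value.strip())
        elif stripped.startswith("}}"):
            inside = False
        elif inside:
            collected.append(stripped)  # continuation line of the same field
    return "\n".join(collected)


def link_targets(value):
    """Extract linked article titles from a wikitext fragment."""
    return [target.strip() for target in WIKILINK.findall(value)]


# Fictional infobox, for illustration only.
wikitext = """{{Infobox company
| name = Example Holdings
| subsid = {{ubl|[[Example Bank]]|[[Example Direct (bank)|Example Direct]]}}
| website = {{URL|example.com}}
}}"""

subsidiaries = link_targets(infobox_field(wikitext, "subsid"))
```

Real infoboxes are messier (nested templates, plain-text names without links, footnotes), which is one reason data depth varies from page to page.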

3. TLD pattern generator

Once subsidiaries are identified with their jurisdictions, the tool generates likely domain patterns. A subsidiary in Italy with parent domain hsbc.com produces guesses like hsbc.it. Covers 70+ countries with their standard corporate TLD patterns (e.g. UK produces .co.uk and .uk, Japan produces .co.jp and .jp).
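
The pattern generation can be sketched in a few lines. The three table entries mirror the examples in the paragraph above; the table name, the function names, and the way the brand label is taken from the parent domain are assumptions, and the tool's real table covers far more countries.

```python
# Tiny excerpt of a jurisdiction -> suffix table (hypothetical structure).
TLD_PATTERNS = {
    "United Kingdom": ["co.uk", "uk"],
    "Japan": ["co.jp", "jp"],
    "Italy": ["it"],
}


def brand_label(parent_domain):
    """Return the brand part of a domain, e.g. 'www.hsbc.com' -> 'hsbc'."""
    host = parent_domain.strip().lower()
    if host.startswith("www."):
        host = host[4:]
    return host.split(".")[0]


def guess_domains(parent_domain, jurisdiction):
    """Combine the parent's brand label with each suffix for the jurisdiction."""
    label = brand_label(parent_domain)
    return [f"{label}.{suffix}" for suffix in TLD_PATTERNS.get(jurisdiction, [])]


italy = guess_domains("hsbc.com", "Italy")
uk = guess_domains("www.hsbc.com", "United Kingdom")
unknown = guess_domains("hsbc.com", "Atlantis")
```

Guesses produced this way are only candidates, which is why the optional DNS verification step exists.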

4. DNS verification (optional)

MX record lookup with A record fallback to confirm guessed domains actually resolve. MX records are the strongest signal - if a domain has mail infrastructure, it's real. A records confirm the domain exists even without mail setup.
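
The MX-then-A fallback can be sketched as below. To keep the example runnable offline, the DNS query is passed in as a function. `verify_domain`, `fake_lookup`, and the records in `FAKE_ZONE` are hypothetical, and nothing here is taken from the tool's source.

```python
def verify_domain(domain, lookup):
    """Return 'mx', 'a', or None depending on which record type answers.

    `lookup(domain, record_type)` is a caller-supplied function that returns
    a list of records and raises LookupError when there is no answer.
    """
    # MX is tried first because mail infrastructure is the stronger signal.
    for record_type, label in (("MX", "mx"), ("A", "a")):
        try:
            if lookup(domain, record_type):
                return label
        except LookupError:
            continue
    return None


# Stub resolver with made-up records, standing in for real DNS queries.
FAKE_ZONE = {
    ("mail-enabled.example", "MX"): ["10 mx.mail-enabled.example."],
    ("web-only.example", "A"): ["192.0.2.10"],
}


def fake_lookup(domain, record_type):
    try:
        return FAKE_ZONE[(domain, record_type)]
    except KeyError:
        raise LookupError(f"no {record_type} record for {domain}") from None


with_mail = verify_domain("mail-enabled.example", fake_lookup)
web_only = verify_domain("web-only.example", fake_lookup)
missing = verify_domain("does-not-exist.example", fake_lookup)
```

In real use, `lookup` could wrap a resolver library such as dnspython and translate its "no answer" and "no such domain" errors into LookupError. That library is not in the tech stack listed in this README, so treat it as one possible choice.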

Output format

Detailed output (default)

Nine columns, one row per subsidiary-domain pair:

Column            Description
parent_company    The company you looked up
parent_domain     Known parent domain
subsidiary_name   Name of the subsidiary or entity
subsidiary_type   Subsidiary, acquisition, division, etc.
jurisdiction      Country or region
domain            Confirmed or guessed domain
domain_source     Where it came from (SEC EDGAR, Wikipedia, TLD guess)
dns_verified      Whether DNS verification passed
confidence        High (confirmed), Medium (guessed + verified), Low (guessed only)
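
The confidence levels follow from the domain_source and dns_verified columns. A minimal sketch of that rule, assuming the source is recorded as the literal string "TLD guess" and verification as a boolean (the exact values in the CSV may differ):

```python
def confidence(domain_source, dns_verified):
    """Map a row's source and DNS status to High, Medium, or Low."""
    if domain_source != "TLD guess":
        return "High"  # confirmed by SEC EDGAR or Wikipedia
    return "Medium" if dns_verified else "Low"


# Illustrative rows only; sources follow the sample output at the top.
rows = [
    {"domain": "us.hsbc.com", "domain_source": "SEC EDGAR", "dns_verified": False},
    {"domain": "hsbc.co.uk", "domain_source": "TLD guess", "dns_verified": True},
    {"domain": "hsbc.example", "domain_source": "TLD guess", "dns_verified": False},
]

levels = [confidence(r["domain_source"], r["dns_verified"]) for r in rows]
```

Filtering on this column is a quick way to keep only confirmed and verified domains before enrichment.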

Clay import format

One row per company with domains consolidated into a single field, ready for direct import into Clay as a data source.

domain-mapper accounts.csv --output results.csv --format clay
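
Conceptually, the Clay format collapses the detailed rows into one row per parent company. The sketch below shows that consolidation. The output column name `domains` and the comma separator are assumptions for illustration, so check the generated CSV for the real layout.

```python
from collections import defaultdict


def consolidate(rows, separator=", "):
    """Collapse detailed rows into one row per parent company."""
    domains_by_parent = defaultdict(list)
    for row in rows:
        domains = domains_by_parent[row["parent_company"]]
        # Skip blanks and duplicates while preserving first-seen order.
        if row["domain"] and row["domain"] not in domains:
            domains.append(row["domain"])
    return [
        {"parent_company": parent, "domains": separator.join(domains)}
        for parent, domains in domains_by_parent.items()
    ]


detailed = [
    {"parent_company": "HSBC", "domain": "firstdirect.com"},
    {"parent_company": "HSBC", "domain": "hsbc.co.uk"},
    {"parent_company": "HSBC", "domain": "hsbc.co.uk"},
    {"parent_company": "Boeing", "domain": ""},
]

clay_rows = consolidate(detailed)
```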

Importing into Clay

  1. Run the tool with --format clay to get the Clay-optimised output
  2. In Clay, create a new table or add to an existing one
  3. Import the CSV - the columns map directly to Clay's expected format
  4. Use the domain list column with Clay's enrichment tools to find contacts across all mapped domains

This is the workflow that sparked the whole tool. We were manually building domain maps for a client's enterprise accounts and realised the process was repeatable enough to automate.

Example

The examples/ directory contains input_sample.csv with five test companies to get you started:

domain-mapper examples/input_sample.csv --output examples/results.csv --verify-dns

Contributing

Want to add a new data source? The architecture makes it straightforward:

  1. Create a new module in src/domain_mapper/sources/
  2. Implement a class with a get_subsidiaries(company_name) method that returns a list of subsidiary objects
  3. Add it to the orchestrator in mapper.py
  4. Write tests in tests/
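
A new source might look roughly like this. Only the get_subsidiaries(company_name) method name comes from the steps above. The `Subsidiary` dataclass, its fields, and the `StaticSource` class are hypothetical stand-ins, so check the existing modules in src/domain_mapper/sources/ for the real object shape.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Subsidiary:
    """Hypothetical record type; the project's own class may differ."""
    name: str
    jurisdiction: Optional[str] = None
    domain: Optional[str] = None
    source: str = "unknown"


class StaticSource:
    """Toy source backed by an in-memory dict, standing in for a registry API."""

    def __init__(self, data):
        self._data = data

    def get_subsidiaries(self, company_name):
        """Return a list of Subsidiary objects for the given company name."""
        records = self._data.get(company_name.strip().lower(), [])
        return [Subsidiary(source="static", **record) for record in records]


source = StaticSource(
    {"acme": [{"name": "Acme UK", "jurisdiction": "United Kingdom"}]}
)
found = source.get_subsidiaries(" Acme ")
nothing = source.get_subsidiaries("Unknown Co")
```

Returning an empty list for unknown companies, rather than raising, keeps the orchestrator free to merge results from several sources without special cases.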

Some data sources we'd love to see contributed:

  • Companies House (UK company registry)
  • OpenCorporates API
  • Crunchbase (acquisitions data)
  • D&B corporate hierarchies

Pull requests welcome. Run ruff check and black before submitting, and make sure pytest passes.

Tech stack

  • Python 3.10+
  • requests and beautifulsoup4 for web scraping
  • click for the CLI
  • rich for terminal output
  • No paid APIs, no API keys required

Built by GTM Layer

GTM Layer builds revenue systems for B2B sales teams. We work with companies on CRM architecture, enrichment pipelines, signal-driven outbound, and everything in between.

This tool came out of real client work - we built it to solve a problem we kept hitting on enterprise ABM engagements. If you're running into similar challenges, get in touch.

Licence

MIT - use it however you want.
