Highly optimized Domain Name Extraction library written in C++


About The Project

PyDomainExtractor is a library for quickly parsing domain names into their parts (subdomain, domain, and suffix). It is written in C++ for performance.

Built With


Performance

The benchmark was run on a file containing 10 million random domains from various TLDs.

Library            Function                   Time
tldextract         __call__                   67.0s
publicsuffix2      publicsuffix2.get_tld      25.8s
PyDomainExtractor  pydomainextractor.extract  2.76s
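
To put the timings above in relative terms, the following sketch derives speedup factors and throughput from the reported numbers. The variable names are illustrative, and the figures are copied from the table, not re-measured.

```python
# Timings in seconds, as reported in the benchmark table, for 10 million domains.
timings = {
    'tldextract': 67.0,
    'publicsuffix2': 25.8,
    'PyDomainExtractor': 2.76,
}
domain_count = 10_000_000

# Use PyDomainExtractor's time as the baseline for the ratios.
baseline = timings['PyDomainExtractor']
speedups = {name: seconds / baseline for name, seconds in timings.items()}
throughput = {name: domain_count / seconds for name, seconds in timings.items()}

for name in timings:
    print(f'{name}: {speedups[name]:.1f}x the PyDomainExtractor time, '
          f'{throughput[name]:,.0f} domains/s')
```

By these numbers, PyDomainExtractor is roughly 24 times faster than tldextract and roughly 9 times faster than publicsuffix2 on this workload.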


Prerequisites

To compile this package, you need GCC, libidn2, and the Python development package installed.

  • Fedora
sudo dnf install python3-devel libidn2-devel gcc-c++
  • Ubuntu 18.04
sudo apt install python3-dev libidn2-dev g++-8


Installation

pip3 install PyDomainExtractor


Usage

The usual use case:

import pydomainextractor

# Loads the current supplied version of PublicSuffixList from the repository. Does not download any data.

>>> {
>>>     'subdomain': '',
>>>     'domain': 'google',
>>>     'suffix': 'com'
>>> }

# Loads a custom SuffixList data. Should follow PublicSuffixList's format.

>>> {
>>>     'subdomain': 'google',
>>>     'domain': 'com',
>>>     'suffix': ''
>>> }

>>> {
>>>     'subdomain': '',
>>>     'domain': 'google',
>>>     'suffix': 'custom.tld'
>>> }
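
The results above can be reproduced conceptually with a small pure-Python sketch of suffix-list splitting. This is not PyDomainExtractor's implementation or API: `split_domain`, the input names `google.com` and `google.custom.tld`, and the contents of the custom suffix set are all assumptions made for illustration. The sketch also ignores the wildcard and exception rules of the real Public Suffix List, as well as internationalized names.

```python
def split_domain(name, suffixes):
    """Split a dotted name into subdomain, domain, and suffix.

    Hypothetical helper for illustration; not part of PyDomainExtractor.
    `suffixes` is a set of known suffixes such as {'com', 'co.uk'}.
    """
    labels = name.lower().strip('.').split('.')

    # Try the longest candidate suffix first, always leaving
    # at least one label in front of it to serve as the domain.
    for start in range(1, len(labels)):
        candidate = '.'.join(labels[start:])
        if candidate in suffixes:
            return {
                'subdomain': '.'.join(labels[:start - 1]),
                'domain': labels[start - 1],
                'suffix': candidate,
            }

    # No known suffix matched: treat the last label as the domain.
    return {
        'subdomain': '.'.join(labels[:-1]),
        'domain': labels[-1],
        'suffix': '',
    }


# Assumed inputs and suffix sets, chosen to mirror the outputs shown above.
default_result = split_domain('google.com', {'com'})
custom_suffixes = {'tld', 'custom.tld'}
unmatched_result = split_domain('google.com', custom_suffixes)
custom_result = split_domain('google.custom.tld', custom_suffixes)
```

With a custom list that does not contain `com`, no suffix matches `google.com`, so the last label becomes the domain and the suffix is empty, which is the shape of the second result above.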


License

Distributed under the MIT License. See LICENSE for more information.


Contact

Gal Ben David -

Project Link:

Download files

Files for PyDomainExtractor, version 0.2.3

Filename                        Size     File type
PyDomainExtractor-0.2.3.tar.gz  98.5 kB  Source
