A blazingly fast domain extraction library written in Rust

About The Project

PyDomainExtractor is a Python library designed to parse domain names quickly. In order to achieve the highest performance possible, the library was written in Rust.

Performance

Extract From Domain

Tests were run on a file containing 10 million random domains from various top-level domains (Mar. 13th, 2022).

| Library | Function | Time |
| --- | --- | --- |
| PyDomainExtractor | pydomainextractor.extract | 1.50s |
| publicsuffix2 | publicsuffix2.get_sld | 9.92s |
| tldextract | \_\_call\_\_ | 29.23s |
| tld | tld.parse_tld | 34.48s |

Extract From URL

Tests were run on a file containing 1 million random URLs (Mar. 13th, 2022).

| Library | Function | Time |
| --- | --- | --- |
| PyDomainExtractor | pydomainextractor.extract_from_url | 2.24s |
| publicsuffix2 | publicsuffix2.get_sld | 10.84s |
| tldextract | \_\_call\_\_ | 36.04s |
| tld | tld.parse_tld | 57.87s |
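The benchmark script itself is not shown here, so the following is only a minimal sketch of how timings like these could be collected. The helper `time_calls` and the stand-in `naive_split` are hypothetical names introduced for illustration; to time the real library, `domain_extractor.extract` would be passed in place of the stand-in.

```python
import time


def time_calls(func, inputs):
    """Return the wall-clock seconds spent calling func once per input."""
    start = time.perf_counter()
    for item in inputs:
        func(item)
    return time.perf_counter() - start


def naive_split(domain):
    """Stand-in workload (hypothetical); not a real suffix-aware parser."""
    return domain.rsplit('.', 1)


# Synthetic input; the original tests read domains from a file instead.
domains = [f'host{i}.example.com' for i in range(100_000)]
elapsed = time_calls(naive_split, domains)
print(f'{elapsed:.3f}s for {len(domains)} calls')
```

Absolute numbers depend on hardware and input data, so only the ratios between libraries measured on the same machine are comparable.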

Installation

pip3 install PyDomainExtractor

Usage

Extraction

import pydomainextractor


# Loads the Public Suffix List version supplied with the library. Does not download any data.
domain_extractor = pydomainextractor.DomainExtractor()

domain_extractor.extract('google.com')
# {
#     'subdomain': '',
#     'domain': 'google',
#     'suffix': 'com'
# }

# Loads custom suffix list data. The data should follow the Public Suffix List format.
domain_extractor = pydomainextractor.DomainExtractor(
    'tld\n'
    'custom.tld\n'
)

# 'com' is not in the custom list, so no suffix is recognized.
domain_extractor.extract('google.com')
# {
#     'subdomain': 'google',
#     'domain': 'com',
#     'suffix': ''
# }

domain_extractor.extract('google.custom.tld')
# {
#     'subdomain': '',
#     'domain': 'google',
#     'suffix': 'custom.tld'
# }
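To make the custom-list results above easier to follow, here is a simplified pure-Python sketch of longest-suffix matching. It is not the library's Rust implementation: `parse_suffix_list` and `split_domain` are hypothetical helpers, and wildcard (`*.`) and exception (`!`) rules from the Public Suffix List format are deliberately ignored.

```python
def parse_suffix_list(text):
    """Collect suffix rules, skipping blank lines and '//' comment lines."""
    rules = set()
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith('//'):
            rules.add(line.lower())
    return rules


def split_domain(name, rules):
    """Split a name into subdomain, domain, and suffix using the given rules."""
    labels = name.lower().rstrip('.').split('.')
    # Try the longest candidate first so 'custom.tld' wins over 'tld'.
    for start in range(len(labels)):
        candidate = '.'.join(labels[start:])
        if candidate in rules:
            return {
                'subdomain': '.'.join(labels[:start - 1]) if start > 1 else '',
                'domain': labels[start - 1] if start > 0 else '',
                'suffix': candidate,
            }
    # No rule matched: mirror the output shown above, where the last
    # label becomes the domain and the suffix stays empty.
    return {
        'subdomain': '.'.join(labels[:-1]),
        'domain': labels[-1],
        'suffix': '',
    }


rules = parse_suffix_list('tld\ncustom.tld\n')
print(split_domain('google.custom.tld', rules))
print(split_domain('google.com', rules))
```

Checking the longest candidate first is what lets a multi-label suffix such as `custom.tld` take priority over the shorter `tld` rule.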

URL Extraction

import pydomainextractor


# Loads the Public Suffix List version supplied with the library. Does not download any data.
domain_extractor = pydomainextractor.DomainExtractor()

domain_extractor.extract_from_url('http://google.com/')
# {
#     'subdomain': '',
#     'domain': 'google',
#     'suffix': 'com'
# }
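Conceptually, URL extraction first isolates the host and then splits it like any other domain. The sketch below shows only the first step using the standard library; `host_from_url` is a hypothetical helper, and how the library parses URLs internally is not documented here.

```python
from urllib.parse import urlsplit


def host_from_url(url):
    """Return the lowercase host of a URL, without port or credentials."""
    host = urlsplit(url).hostname
    if host is None:
        # Happens when the URL has no scheme, e.g. 'google.com/path'.
        raise ValueError(f'no host found in {url!r}')
    return host


print(host_from_url('http://google.com/'))
print(host_from_url('https://user:pw@Sub.Example.co.uk:8443/p?q=1'))
```

The returned host could then be passed to `domain_extractor.extract`.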

Validation

import pydomainextractor


# Loads the Public Suffix List version supplied with the library. Does not download any data.
domain_extractor = pydomainextractor.DomainExtractor()

domain_extractor.is_valid_domain('google.com')
# True

# Internationalized domain names are accepted in Unicode form...
domain_extractor.is_valid_domain('domain.اتصالات')
# True

# ...and in Punycode form.
domain_extractor.is_valid_domain('xn--mgbaakc7dvf.xn--mgbaakc7dvf')
# True

# A label must not end with a hyphen.
domain_extractor.is_valid_domain('domain-.com')
# False

# A label must not start with a hyphen.
domain_extractor.is_valid_domain('-sub.domain.com')
# False

domain_extractor.is_valid_domain('\xF0\x9F\x98\x81nonalphanum.com')
# False
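The hyphen cases above match the usual hostname label rules. As a rough illustration only, the sketch below checks those syntactic rules for ASCII names. `looks_like_hostname` is a hypothetical helper and an assumption about the kind of checks involved, not the library's actual logic. Unlike `is_valid_domain`, it does not accept Unicode names such as `domain.اتصالات`, which would first need converting to their `xn--` form.

```python
import re

# One label: 1-63 letters, digits, or hyphens, with no hyphen at either end.
LABEL_RE = re.compile(r'(?!-)[A-Za-z0-9-]{1,63}(?<!-)')


def looks_like_hostname(name):
    """Syntactic check for an ASCII hostname with at least two labels."""
    if not name or len(name) > 253:
        return False
    labels = name.split('.')
    return len(labels) >= 2 and all(
        LABEL_RE.fullmatch(label) for label in labels
    )


for candidate in (
    'google.com',
    'xn--mgbaakc7dvf.xn--mgbaakc7dvf',
    'domain-.com',
    '-sub.domain.com',
):
    print(candidate, looks_like_hostname(candidate))
```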

TLDs List

import pydomainextractor


# Loads the Public Suffix List version supplied with the library. Does not download any data.
domain_extractor = pydomainextractor.DomainExtractor()

domain_extractor.get_tld_list()
# [
#     'bostik',
#     'backyards.banzaicloud.io',
#     'biz.bb',
#     ...
# ]

License

Distributed under the MIT License. See LICENSE for more information.

Contact

Gal Ben David - gal@intsights.com

Project Link: https://github.com/Intsights/PyDomainExtractor

