Extract the top-level domain (TLD) from the URL given.

Project description

Extract the top-level domain (TLD) from the URL given. The list of TLD names is taken from the Public Suffix List.

Raises an exception on non-existing TLDs by default, or fails silently (returns None) if the fail_silently argument is set to True.

Prerequisites

  • Python 3.6, 3.7, 3.8 and 3.9.

Support for Python 2.7 and 3.5 is available as well.

Documentation

Documentation is available on Read the Docs.

Installation

Latest stable version on PyPI:

pip install tld

Or latest stable version from GitHub:

pip install https://github.com/barseghyanartur/tld/archive/stable.tar.gz

Usage examples

In addition to the examples below, see the Jupyter notebook workbook file.

Get the TLD name as string from the URL given

from tld import get_tld

get_tld("http://www.google.co.uk")
# 'co.uk'

get_tld("http://www.google.idontexist", fail_silently=True)
# None

Get the TLD as an object

from tld import get_tld

res = get_tld("http://some.subdomain.google.co.uk", as_object=True)

res
# 'co.uk'

res.subdomain
# 'some.subdomain'

res.domain
# 'google'

res.tld
# 'co.uk'

res.fld
# 'google.co.uk'

res.parsed_url
# SplitResult(
#     scheme='http',
#     netloc='some.subdomain.google.co.uk',
#     path='',
#     query='',
#     fragment=''
# )
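
To make the subdomain / domain / tld / fld split above concrete, here is a minimal standard-library sketch of longest-suffix matching. It is not the library's implementation: the SUFFIXES set is a tiny hand-picked stand-in for the Public Suffix List, split_host and its fail_silently flag are hypothetical names that only mirror the behaviour described above, and wildcard and exception rules are not handled.

```python
from urllib.parse import urlsplit

# Hypothetical, hand-picked stand-in for the Public Suffix List.
SUFFIXES = {"com", "uk", "co.uk"}


def split_host(url, fail_silently=False):
    """Return (subdomain, domain, tld) using longest-suffix matching."""
    host = urlsplit(url).hostname or ""
    labels = host.split(".")
    # Try the longest candidate suffix first, always leaving at least
    # one label in front of the suffix to act as the domain.
    for i in range(1, len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in SUFFIXES:
            return ".".join(labels[: i - 1]), labels[i - 1], candidate
    if fail_silently:
        return None
    raise ValueError("no known suffix in %r" % url)


subdomain, domain, tld = split_host("http://some.subdomain.google.co.uk")
fld = domain + "." + tld  # first level domain = domain + suffix
print(subdomain, domain, tld, fld)
```

Trying the longest candidate first is what makes 'co.uk' win over 'uk' for the same host.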

Get TLD name, ignoring the missing protocol

from tld import get_tld, get_fld

get_tld("www.google.co.uk", fix_protocol=True)
# 'co.uk'

get_fld("www.google.co.uk", fix_protocol=True)
# 'google.co.uk'
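
Why a missing protocol matters can be seen with the standard library alone: without a scheme, urllib.parse.urlsplit puts the whole string into path and leaves netloc empty, so there is no host to inspect. The sketch below shows this. The helper with_default_scheme is hypothetical, and prepending https:// is an assumption about one reasonable fix, not necessarily what fix_protocol=True does internally.

```python
from urllib.parse import urlsplit


def with_default_scheme(url, scheme="https"):
    """Prepend a scheme when the URL has none (hypothetical helper)."""
    if "://" in url or url.startswith("//"):
        return url
    return scheme + "://" + url


bare = urlsplit("www.google.co.uk")
fixed = urlsplit(with_default_scheme("www.google.co.uk"))

print(repr(bare.netloc), repr(bare.path))  # '' 'www.google.co.uk'
print(repr(fixed.netloc))                  # 'www.google.co.uk'
```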

Return TLD parts as tuple

from tld import parse_tld

parse_tld('http://www.google.com')
# ('com', 'google', 'www')

Get the first level domain name as string from the URL given

from tld import get_fld

get_fld("http://www.google.co.uk")
# 'google.co.uk'

get_fld("http://www.google.idontexist", fail_silently=True)
# None

Check if some tld is a valid tld

from tld import is_tld

is_tld('co.uk')
# True

is_tld('uk')
# True

is_tld('tld.doesnotexist')
# False

is_tld('www.google.com')
# False

Update the list of TLD names

To update/sync the TLD names with the most recent version, run the following from your terminal:

update-tld-names

Or simply do:

from tld.utils import update_tld_names

update_tld_names()

Note that this updates all registered TLD source parsers (not only the list of TLD names taken from Mozilla). To run the update for a single parser, append the uid of that parser as an argument.

update-tld-names mozilla

Custom TLD parsers

By default, the list of TLD names is taken from Mozilla. Parsing is implemented in the tld.utils.MozillaTLDSourceParser class. If you want to use another parser, subclass tld.base.BaseTLDSourceParser, provide uid, source_url and local_path, and implement the get_tld_names method. Take tld.utils.MozillaTLDSourceParser as a good example of such an implementation. You can then use get_tld (as well as other tld module functions) as shown below:

from tld import get_tld
from some.module import CustomTLDSourceParser

get_tld(
    "http://www.google.co.uk",
    parser_class=CustomTLDSourceParser
)

Custom list of TLD names

You can maintain your own custom versions of the TLD names list (even several) and use them alongside the built-in TLD names list.

Store each list locally and provide its path as shown below:

from tld import get_tld
from tld.utils import BaseMozillaTLDSourceParser

class CustomBaseMozillaTLDSourceParser(BaseMozillaTLDSourceParser):

    uid: str = 'custom_mozilla'
    local_path: str = 'tests/res/effective_tld_names_custom.dat.txt'

get_tld(
    "http://www.foreverchild",
    parser_class=CustomBaseMozillaTLDSourceParser
)
# 'foreverchild'

Same goes for first level domain names:

from tld import get_fld

get_fld(
    "http://www.foreverchild",
    parser_class=CustomBaseMozillaTLDSourceParser
)
# 'www.foreverchild'

Note that in both examples shown above, the original TLD names file has been modified in the following way:

...
// ===BEGIN ICANN DOMAINS===

// This one actually does not exist, added for testing purposes
foreverchild
...
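
For orientation, the file format shown above (one suffix per line, // comments, blank lines) can be read with a few lines of Python. This is a simplified sketch, not the parser shipped with the package: read_suffixes is a hypothetical helper, and it does not interpret the wildcard (*.) and exception (!) rules that the real Public Suffix List also contains.

```python
import io


def read_suffixes(lines):
    """Collect suffix entries, skipping blank lines and // comments."""
    suffixes = set()
    for line in lines:
        entry = line.strip()
        if not entry or entry.startswith("//"):
            continue
        suffixes.add(entry)
    return suffixes


# In-memory stand-in for a custom TLD names file.
sample = io.StringIO(
    "// ===BEGIN ICANN DOMAINS===\n"
    "\n"
    "// This one actually does not exist, added for testing purposes\n"
    "foreverchild\n"
    "co.uk\n"
)
suffixes = read_suffixes(sample)
print(sorted(suffixes))  # ['co.uk', 'foreverchild']
```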

Free up resources

To free the memory occupied by a loaded custom TLD names list, call the reset_tld_names function with the tld_names_local_path parameter.

from tld import get_tld, reset_tld_names

# Get TLD from a custom TLD names parser
get_tld(
    "http://www.foreverchild",
    parser_class=CustomBaseMozillaTLDSourceParser
)

# Free resources occupied by the custom TLD names list
reset_tld_names("tests/res/effective_tld_names_custom.dat.txt")

Support for Python 2.7 and 3.5

As you might have noticed, typing (Python 3.6+) is used extensively in the code. However, Python 3.5 will likely be supported until its EOL. All recent versions (starting from tld 0.11.7) are fully compatible with Python 2.7 and 3.5 (pip install tld just works).

Install from pip

pip install tld

Development tips follow:

Python 2.7

Install locally in development mode

python setup.py develop --python-tag py27

Prepare dist

./scripts/prepare_build_py27.sh

Run tests

tox -e py27

Python 3.5

Install locally in development mode

python setup.py develop --python-tag py35

Prepare dist

./scripts/prepare_build_py35.sh

Run tests

tox -e py35

Troubleshooting

If somehow domain names listed here are not recognised, make sure you have the most recent version of TLD names in your virtual environment:

update-tld-names

To update TLD names list for a single parser, specify it as an argument:

update-tld-names mozilla

Testing

Simply type:

./runtests.py

Or use tox:

tox

Or use tox to check specific env:

tox -e py38

Writing documentation

Keep the following hierarchy.

=====
title
=====

header
======

sub-header
----------

sub-sub-header
~~~~~~~~~~~~~~

sub-sub-sub-header
^^^^^^^^^^^^^^^^^^

sub-sub-sub-sub-header
++++++++++++++++++++++

sub-sub-sub-sub-sub-header
**************************

License

MPL-1.1 OR GPL-2.0-only OR LGPL-2.1-or-later

Support

For any issues contact me at the e-mail given in the Author section.

Author

Artur Barseghyan <artur.barseghyan@gmail.com>
