publicsuffixlist implement

Notice

This module is dropping support for Python 2.7 and 3.4

In the upcoming version 1.0.0, support for Python 2.7 and 3.4 will be discontinued. Version 0.10.x (or auto-released versions with the .yyyymmdd suffix) will be the last to support Python 2.7.

The minimum requirement for new versions will be Python 3.5 or later.

The new version will include type hinting to enhance API stability. The updated code is currently available in the devel branch. https://github.com/ko-zu/psl/tree/devel

If you know of any users still relying on this module with Python 2.7, please comment on the GitHub issue: https://github.com/ko-zu/psl/issues/30

publicsuffixlist

Public Suffix List parser implementation for Python 2.6+/3.x.

  • Compliant with the PSL test data.
  • Supports IDNs (Unicode or Punycode-encoded).
  • Supports Python 2.6+ and Python 3.x.
  • Ships with a built-in PSL and an updater script.
  • Written in pure Python, with no library dependencies.

Install

publicsuffixlist can be installed via pip or pip3.

$ sudo pip install publicsuffixlist

On older distributions (RHEL/CentOS 6.x), you may need to update pip itself before installing.

$ sudo pip install -U pip

Usage

from publicsuffixlist import PublicSuffixList

psl = PublicSuffixList()
# uses built-in PSL file

psl.publicsuffix("www.example.com")   # "com"
# longest public suffix part

psl.privatesuffix("www.example.com")  # "example.com"
# shortest domain assigned for a registrant

psl.privatesuffix("com") # None
# None if no private (non-public) part found


psl.publicsuffix("www.example.unknownnewtld") # "unknownnewtld"
# new TLDs are valid public suffix by default

psl.publicsuffix(u"www.example.香港")   # u"香港"
# accept unicode

psl.publicsuffix("www.example.xn--j6w193g") # "xn--j6w193g"
# accept punycoded IDNs by default
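
The return values above follow the matching rules published with the Public Suffix List: the longest matching rule wins, an exception rule (prefixed with !) overrides a wildcard, and a name that matches no rule falls back to its last label. The following is a minimal sketch of that logic against a toy rule set. TOY_RULES, public_suffix and private_suffix are hypothetical names used for illustration only; this is not the module's implementation or API.

```python
# A tiny, hand-picked rule set (assumption: illustration only, not the real PSL).
TOY_RULES = ["com", "uk", "co.uk", "*.ck", "!www.ck"]


def _matches(rule_labels, labels):
    """True if the rightmost labels of the name match the rule ("*" matches any label)."""
    if len(rule_labels) > len(labels):
        return False
    tail = labels[len(labels) - len(rule_labels):]
    return all(r == "*" or r == l for r, l in zip(rule_labels, tail))


def public_suffix(domain, rules=TOY_RULES):
    """Return the public suffix of `domain` under the given rules."""
    labels = domain.lower().rstrip(".").split(".")
    best = 1  # implicit "*" rule: an unknown TLD is itself a public suffix
    for rule in rules:
        is_exception = rule.startswith("!")
        rule_labels = rule.lstrip("!").split(".")
        if not _matches(rule_labels, labels):
            continue
        if is_exception:
            # An exception rule prevails; the suffix is the rule minus its leftmost label.
            best = len(rule_labels) - 1
            break
        best = max(best, len(rule_labels))
    return ".".join(labels[-best:])


def private_suffix(domain, rules=TOY_RULES):
    """Return the public suffix plus one label, or None if there is no such label."""
    labels = domain.lower().rstrip(".").split(".")
    n = len(public_suffix(domain, rules).split("."))
    if len(labels) <= n:
        return None
    return ".".join(labels[-(n + 1):])
```

With this toy rule set, public_suffix("www.example.co.uk") gives "co.uk", and private_suffix("com") gives None because no label is left over for a registrant.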

The latest PSL can be passed as a file-like, line-iterable object.

with open("latest_psl.dat", "rb") as f:
    psl = PublicSuffixList(f)

Works with both Python 2.x and 3.x.

$ python2 setup.py test
$ python3 setup.py test

A drop-in compatibility class is available to replace publicsuffix:

# from publicsuffix import PublicSuffixList
from publicsuffixlist.compat import PublicSuffixList

psl = PublicSuffixList()
psl.suffix("www.example.com")   # return "example.com"
psl.suffix("com")               # return "" (as str, not None)

Some convenience methods are also available.

psl.is_private("example.com")  # True
psl.privateparts("aaa.www.example.com") # ("aaa", "www", "example.com")
psl.subdomain("aaa.www.example.com", depth=1) # "www.example.com"

Limitation

publicsuffixlist does NOT provide domain name validation. In the DNS protocol, most 8-bit characters are acceptable in a domain name label. ICANN-compliant registries do not accept domain names containing _ (underscore), but hostnames may contain it, as in DMARC records, for example.

Users need to confirm that the input is valid for their own context.
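
Since validation is left to the caller, one possible pre-check is sketched below. It assumes the input is already ASCII (Punycode-encoded if internationalized) and applies only the common letter-digit-hyphen label rules and length limits. looks_like_hostname and its allow_underscore parameter are hypothetical names, not part of this module, and the right policy depends on your context.

```python
import re

# Letters, digits, hyphens; 1-63 characters; no leading or trailing hyphen.
_LDH = re.compile(r"(?!-)[a-z0-9-]{1,63}(?<!-)")
# Same, but also allowing underscores (for names such as "_dmarc.example.com").
_LDH_UNDERSCORE = re.compile(r"(?!-)[a-z0-9_-]{1,63}(?<!-)")


def looks_like_hostname(name, allow_underscore=False):
    """Rough syntactic check for an ASCII domain name; not a full validator."""
    name = name.lower()
    if name.endswith("."):
        name = name[:-1]  # tolerate a single trailing root dot
    if not name or len(name) > 253:
        return False
    pattern = _LDH_UNDERSCORE if allow_underscore else _LDH
    return all(pattern.fullmatch(label) for label in name.split("."))
```

For example, looks_like_hostname("_dmarc.example.com") is False under the strict rules but True with allow_underscore=True.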

Partially encoded (Unicode-mixed) Punycode is not supported because Punycode encoding and decoding is very slow and the resulting encoding is unpredictable. If you are not sure whether the input is valid Punycode, call unknowndomain.encode("idna"), which is idempotent.
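
The normalization step mentioned above can be wrapped in a small helper. This sketch assumes Python 3 and relies on the standard library "idna" codec (which implements the older IDNA 2003 rules); to_punycode is a hypothetical name, not part of this module.

```python
def to_punycode(domain):
    """Encode a domain name to its ASCII (Punycode) form with the stdlib codec.

    Raises UnicodeError if a label cannot be encoded (for example, an empty
    label in the middle of the name or a label that is too long).
    """
    return domain.encode("idna").decode("ascii")


encoded = to_punycode(u"www.example.香港")
# Encoding an already encoded name leaves it unchanged.
assert to_punycode(encoded) == encoded
```

The result can then be passed to publicsuffix() or privatesuffix() as a plain ASCII string.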

ICANN and private suffixes

The public suffix list contains both ICANN suffixes and private suffixes. Private suffixes can be disabled with the only_icann flag:

>>> psl = PublicSuffixList()
>>> psl.publicsuffix("example.priv.at")
'priv.at'
>>> psl = PublicSuffixList(only_icann=True)
>>> psl.publicsuffix("example.priv.at")
'at'
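
For background, the PSL file itself separates the two groups with marker comments such as "// ===BEGIN ICANN DOMAINS===" and "// ===BEGIN PRIVATE DOMAINS===". The sketch below splits a list of lines on those markers. split_psl_sections is a hypothetical helper for illustration and is not how this module is necessarily implemented; the sample is a heavily abridged excerpt in the style of the real file.

```python
def split_psl_sections(lines):
    """Split PSL lines into two lists of rules: (icann, private)."""
    icann, private = [], []
    current = None  # the list that rules are currently appended to
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if line.startswith("// ===BEGIN ICANN DOMAINS==="):
            current = icann
        elif line.startswith("// ===BEGIN PRIVATE DOMAINS==="):
            current = private
        elif line.startswith("// ===END"):
            current = None
        elif line and not line.startswith("//") and current is not None:
            # A rule is the text up to the first whitespace on the line.
            current.append(line.split()[0])
    return icann, private


sample = """\
// ===BEGIN ICANN DOMAINS===
// at : registry comment lines are skipped
at
co.at
// ===END ICANN DOMAINS===
// ===BEGIN PRIVATE DOMAINS===
priv.at
// ===END PRIVATE DOMAINS===
""".splitlines()

icann_rules, private_rules = split_psl_sections(sample)
```

Here priv.at lands in private_rules, which is consistent with publicsuffix("example.priv.at") falling back to 'at' when only_icann=True.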

License

  • This module is licensed under the Mozilla Public License 2.0.
  • The Public Suffix List, maintained by the Mozilla Foundation, is licensed under the Mozilla Public License 2.0.
  • The PSL test case dataset is in the public domain (CC0).

Development / Packaging

This module and its packaging workflow are maintained in the author's repository located at https://github.com/ko-zu/psl. A new package, which includes the latest PSL file, is automatically generated and uploaded to PyPI. The last part of the version number represents the release date, for instance, 0.10.1.20230331.

Download files

Source Distribution

publicsuffixlist-0.10.1.20240605.tar.gz (102.3 kB view details)

Uploaded Source

Built Distribution

publicsuffixlist-0.10.1.20240605-py2.py3-none-any.whl (102.0 kB view details)

Uploaded Python 2 Python 3

Hashes for publicsuffixlist-0.10.1.20240605.tar.gz
Algorithm Hash digest
SHA256 ae8ec48bd8a3beadb05fc944ebe609865d9ca362ecf2aa5c9fee2051f4a66356
MD5 6e0d4668c772a7df1481fde788518a2e
BLAKE2b-256 693aaffbc2b08973d2039c79f51bc53d731d04ee27b56c422a4543b598f685f0

Hashes for publicsuffixlist-0.10.1.20240605-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 cdf988a73cbfccc8410db84b5f0161e5040d0cde539eadfc17479bba57dfda5a
MD5 2389e6731131c95aae48bef06d913b37
BLAKE2b-256 4f20bebf8bdc45aa7a520573d6e7715b8e19b07cbdab0f3e2bbf8462d1eaf47c
