A collection of util functions for extracting domains from urls.

domain_utils

Repo: https://github.com/mozilla/domain_utils

Install:

pip install domain_utils

Use:

import domain_utils as du
# Strip the scheme and query string, returning `my.domain.cloudfront.net/a/path/to/a/file.html`
du.stem_url('https://my.domain.cloudfront.net/a/path/to/a/file.html?a=1')
# Return just the eTLD+1, `domain.cloudfront.net`
du.get_etld1('https://my.domain.cloudfront.net/a/path/to/a/file.html?a=1')
# Get the port, `5000`
du.get_port('https://localhost:5000/a/path/to/a/file.html?a=1')
# Get the scheme, `wss`
du.get_scheme('wss://somedomain.example.com/a/path/to/a/ws')
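
To make the four results above concrete, here is a rough standard-library sketch of what each call conceptually returns. It is not the domain_utils implementation: the `sketch_*` names and the tiny `SKETCH_SUFFIXES` set are hypothetical stand-ins, and the real package relies on tldextract and the full Public Suffix List (with wildcard and exception rules this sketch ignores).

```python
from urllib.parse import urlparse

# Hypothetical, hand-picked suffixes for illustration only; the real
# Public Suffix List is far larger and has wildcard/exception rules.
SKETCH_SUFFIXES = {'com', 'net', 'cloudfront.net'}


def sketch_stem(url):
    """Drop the scheme, query string and fragment; keep netloc and path."""
    parts = urlparse(url)
    return parts.netloc + parts.path


def sketch_etld1(url, suffixes=SKETCH_SUFFIXES):
    """Return the longest matching suffix plus one more label to its left."""
    host = urlparse(url).hostname or ''
    labels = host.split('.')
    # Increasing i means shorter candidates, so the longest suffix wins.
    for i in range(1, len(labels)):
        if '.'.join(labels[i:]) in suffixes:
            return '.'.join(labels[i - 1:])
    return host


def sketch_port(url):
    """Return the explicit port as an int, or None if the URL has none."""
    return urlparse(url).port


def sketch_scheme(url):
    """Return the scheme, e.g. 'https' or 'wss'."""
    return urlparse(url).scheme


url = 'https://my.domain.cloudfront.net/a/path/to/a/file.html?a=1'
print(sketch_stem(url))    # my.domain.cloudfront.net/a/path/to/a/file.html
print(sketch_etld1(url))   # domain.cloudfront.net
print(sketch_port('https://localhost:5000/a/path/to/a/file.html?a=1'))  # 5000
print(sketch_scheme('wss://somedomain.example.com/a/path/to/a/ws'))     # wss
```

The eTLD+1 is `domain.cloudfront.net` rather than `cloudfront.net` because `cloudfront.net` is itself treated as a public suffix, so the registrable domain is one label longer.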

This package was originally extracted from openwpm-utils.

Community Participation Guidelines

This project is governed by Mozilla’s code of conduct and etiquette guidelines.

For more details, please read the Mozilla Community Participation Guidelines.

For more information on how to report violations of the Community Participation Guidelines, please read our How to Report page.

History

0.7.1 (2020-04-10)

Fix building on readthedocs.

0.7.0 (2020-04-10)

Thanks to new contributor @yabirgb for two PRs (#20 and #25) in this release.

API changes: #26 renamed get_stripped_url to stem_url and get_ps_plus_1 to get_etld1; the old method names continue to work. #22 updated the keyword arguments to get_stripped_url; the default behavior is largely the same.

  • API changes (#26 and #22)

  • Support parsing ws/wss urls (#22)

  • Add get_port method (#25)

  • Add get_scheme method (#20)

  • Correct license declaration in setup.py (#24)

0.6.0 (2020-04-06)

  • Use tldextract for parsing domains (#12)

  • Use numpy style docstrings

  • Support case of no scheme and port in URL (#13)

0.5.0 (2020-04-03)

  • Remove support for python 3.5

  • Handle more cases in get_stripped_url and change default behavior:

    • handle a lack of scheme

    • boolean flag controlling whether non-HTTP URLs are returned; the default is to return them, a change from the previous behavior, where they were not returned

    • Use netloc by default instead of hostname with a boolean flag to use hostname.
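
The difference between netloc and hostname, and the need to handle a missing scheme, can be illustrated with the standard library's `urllib.parse`. This is only a sketch of the general idea: `parse_lenient` is a hypothetical helper, not part of domain_utils, and the package's actual flag names are not shown here.

```python
from urllib.parse import urlparse


def parse_lenient(url):
    """Parse a URL, tolerating a missing scheme (hypothetical helper).

    Without a leading '//' urlparse does not recognise the host part,
    so one is prepended when no scheme separator is present.
    """
    if '://' not in url and not url.startswith('//'):
        url = '//' + url
    return urlparse(url)


parts = parse_lenient('user@Example.com:8080/index.html')
print(parts.netloc)    # user@Example.com:8080  (userinfo and port kept, case preserved)
print(parts.hostname)  # example.com            (host only, lowercased)
print(parts.port)      # 8080
print(parts.path)      # /index.html
```

Choosing netloc keeps the port and any credentials in the result, while hostname reduces the URL to the bare, lowercased host.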

0.4.0 (2020-03-25)

  • Remove py27 support

0.3.0 (2020-03-25)

  • Restore py27 support.

  • Last version with py27 support.

  • Remove tox

0.2.0 (2020-03-24)
