Normalize a URL to a standard Unicode encoding

Project description

urlnorm.py

Normalize a URL to a standard Unicode representation

urlnorm normalizes a URL by:

  • lowercasing the scheme and hostname

  • converting the hostname to its Unicode IDN form (e.g., xn--q-bga.com becomes qé.com)

  • removing the default port if present (e.g., http://www.foo.com:80/ becomes http://www.foo.com/)

  • collapsing dot segments in the path (./, ../, etc.)

  • removing a trailing '.' from the hostname

  • unquoting any % escaped characters (where possible)
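
The steps above can be approximated with the Python 3 standard library. The following is a minimal sketch, not urlnorm's own implementation: the names normalize_sketch, DEFAULT_PORTS, _collapse_path and _unquote_unreserved are hypothetical, "where possible" is assumed to mean "only RFC 3986 unreserved characters", and userinfo and IPv6 literals are deliberately ignored.

```python
import posixpath
import re
import string
from urllib.parse import urlsplit, urlunsplit

# Assumption: only these two schemes have a default port worth stripping.
DEFAULT_PORTS = {"http": 80, "https": 443}

# RFC 3986 unreserved characters; escapes of these can be decoded safely.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")


def _unquote_unreserved(text):
    """Decode %XX escapes only when they stand for an unreserved character."""
    def repl(match):
        char = chr(int(match.group(1), 16))
        return char if char in UNRESERVED else match.group(0)
    return re.sub(r"%([0-9A-Fa-f]{2})", repl, text)


def _collapse_path(path):
    """Resolve ./ and ../ segments while keeping a trailing slash."""
    if not path:
        return "/"
    collapsed = posixpath.normpath(path)
    # normpath preserves exactly two leading slashes; reduce them to one.
    if collapsed.startswith("//"):
        collapsed = "/" + collapsed.lstrip("/")
    # normpath drops the trailing slash, so restore it when the input had one.
    if path.endswith(("/", "/.", "/..")) and not collapsed.endswith("/"):
        collapsed += "/"
    return collapsed


def normalize_sketch(url):
    """Hypothetical stand-in that mirrors the listed normalization steps."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    if host.endswith("."):
        host = host[:-1]  # drop a single trailing dot
    try:
        # Decode punycode labels (xn--...) to their Unicode form.
        host = host.encode("ascii").decode("idna")
    except UnicodeError:
        pass  # host is already non-ASCII or not valid IDNA; keep it as-is
    netloc = host  # note: any userinfo (user:password@) is discarded here
    port = parts.port
    if port is not None and DEFAULT_PORTS.get(scheme) != port:
        netloc = f"{host}:{port}"
    path = _unquote_unreserved(_collapse_path(parts.path))
    return urlunsplit((scheme, netloc, path, parts.query, parts.fragment))


print(normalize_sketch("http://xn--q-bga.com./u/u/../%72/l/"))
# http://qé.com/u/r/l/
```

Decoding only unreserved characters is a conservative choice: escapes such as %2F are left untouched because decoding them would change how the path is split.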

Installation

pip install urlnorm

Example

>>> import urlnorm
>>> urlnorm.norm("http://xn--q-bga.com./u/u/../%72/l/")
u'http://q\xe9.com/u/r/l/'

Download files

Source Distribution

Sweepatic-urlnorm-1.1.2.5.tar.gz (4.4 kB)

File details

Details for the file Sweepatic-urlnorm-1.1.2.5.tar.gz.

File metadata

  • Download URL: Sweepatic-urlnorm-1.1.2.5.tar.gz
  • Upload date:
  • Size: 4.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.20.1 setuptools/40.6.3 requests-toolbelt/0.8.0 tqdm/4.28.1 CPython/3.7.1

File hashes

Hashes for Sweepatic-urlnorm-1.1.2.5.tar.gz:

  • SHA256: fe9c589b30f07834071a1f7fc81d166112d2fa472c2fa409b4d9e6559a47fd54
  • MD5: f0c998f7a620a7ed9df55805e78da5f4
  • BLAKE2b-256: 7ad0324df90d9f64d31aea7ff3c0ef06d3fae1b5a23edcc5c6ad49ad0a3fbe25
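
A downloaded archive can be checked against the published SHA256 digest before installing. The following is a minimal sketch using hashlib; the helper names sha256_of_file and matches_published_hash are hypothetical, and the local file path is whatever location the archive was saved to.

```python
import hashlib

# SHA256 digest as listed for Sweepatic-urlnorm-1.1.2.5.tar.gz.
EXPECTED_SHA256 = "fe9c589b30f07834071a1f7fc81d166112d2fa472c2fa409b4d9e6559a47fd54"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """True when the file's digest equals the expected digest (case-insensitive)."""
    return sha256_of_file(path) == expected.lower()
```

Calling matches_published_hash("Sweepatic-urlnorm-1.1.2.5.tar.gz") on the saved archive should return True for an intact download.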
