Normalize a URL to a standard unicode encoding

urlnorm.py

Normalize a URL to a standard unicode representation

urlnorm normalizes a URL by:

  • lowercasing the scheme and hostname
  • converting the hostname to IDN format (e.g., xn--q-bga.com becomes qé.com)
  • removing the default port if present (e.g., the :80 in http://www.foo.com:80/)
  • collapsing relative segments in the path (./, ../, etc.)
  • removing a trailing ‘.’ from the hostname
  • unquoting any %-escaped characters (where possible)
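
The steps above can be sketched with the Python 3 standard library alone. This is an illustrative approximation, not urlnorm's actual implementation: the names normalize_sketch, DEFAULT_PORTS, _collapse_path and _unquote_unreserved are hypothetical, the default-port table is an assumption, and "where possible" is interpreted here as "only unquote RFC 3986 unreserved characters". IPv6 literal hosts are not handled.

```python
import posixpath
import re
import string
from urllib.parse import urlsplit, urlunsplit

# Assumed table of default ports; the real library may know more schemes.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}

# RFC 3986 "unreserved" characters, which are always safe to unquote.
_UNRESERVED = set(string.ascii_letters + string.digits + "-._~")


def _unquote_unreserved(text):
    """Decode %XX escapes only when they encode an unreserved character."""
    def repl(match):
        char = chr(int(match.group(1), 16))
        # Escapes that must stay escaped are kept, with uppercase hex digits.
        return char if char in _UNRESERVED else match.group(0).upper()
    return re.sub(r"%([0-9A-Fa-f]{2})", repl, text)


def _collapse_path(path):
    """Resolve ./ and ../ segments, keeping a trailing slash if present."""
    if not path:
        return "/"
    collapsed = posixpath.normpath(path)
    # normpath drops the trailing slash, so restore it.
    if path.endswith("/") and not collapsed.endswith("/"):
        collapsed += "/"
    return collapsed


def normalize_sketch(url):
    parts = urlsplit(url)
    scheme = parts.scheme.lower()

    # urlsplit already lowercases .hostname; strip one trailing dot.
    host = parts.hostname or ""
    if host.endswith("."):
        host = host[:-1]

    # Decode punycode ("xn--") labels to Unicode; leave other hosts as-is.
    try:
        host = host.encode("ascii").decode("idna")
    except UnicodeError:
        pass

    # Keep any userinfo, and drop the port only if it is the scheme default.
    userinfo, at, _ = parts.netloc.rpartition("@")
    netloc = userinfo + at + host
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(scheme):
        netloc += ":%d" % parts.port

    # Collapse before unquoting so escaped characters cannot add segments.
    path = _unquote_unreserved(_collapse_path(parts.path))
    return urlunsplit((scheme, netloc, path, parts.query, parts.fragment))


print(normalize_sketch("HTTP://WWW.Foo.com:80/a/./b/../c"))
```

Collapsing the path before unquoting is a deliberate choice in this sketch: an escaped dot or slash is then never treated as a path separator or relative segment.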

Installation

pip install urlnorm

Example

>>> import urlnorm
>>> urlnorm.norm("http://xn--q-bga.com./u/u/../%72/l/")
u'http://q\xe9.com/u/r/l/'

Files for Sweepatic-urlnorm, version 1.1.2.5
Filename, size: Sweepatic-urlnorm-1.1.2.5.tar.gz (4.4 kB)
File type: Source
Python version: None