Skip to main content

Normalize a URL to a standard unicode encoding

Project description

urlnorm.py

Normalize a URL to a standard unicode representation

urlnorm normalizes a URL by:

  • lowercasing the scheme and hostname
  • converting the hostname to IDN format
  • taking out default port if present (e.g., http://www.foo.com:80/)
  • collapsing the path (./, ../, etc)
  • removing the last character in the hostname if it is ‘.’
  • unquoting any % escaped characters (where possible)

Installation

pip install urlnorm

Example

>>> import urlnorm
>>> urlnorm.norm("http://xn--q-bga.com./u/u/../%72/l/")
u'http://q\xe9.com/u/r/l/'

Project details


Release history Release notifications

This version
History Node

1.1.4

History Node

1.1.3

History Node

1.1.2

History Node

1.1.1

History Node

1.1

History Node

1.0.1

History Node

1.0

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Filename, size & hash SHA256 hash help File type Python version Upload date
urlnorm-1.1.4.tar.gz (4.3 kB) Copy SHA256 hash SHA256 Source None Aug 5, 2016

Supported by

Elastic Elastic Search Pingdom Pingdom Monitoring Google Google BigQuery Sentry Sentry Error logging CloudAMQP CloudAMQP RabbitMQ AWS AWS Cloud computing Fastly Fastly CDN DigiCert DigiCert EV certificate StatusPage StatusPage Status page