Skip to main content

URL normalization for Python

Project description

url-normalize

Build Status Coverage Status

URI Normalization function:

  • Take care of IDN domains.
  • Always provide the URI scheme in lowercase characters.
  • Always provide the host, if any, in lowercase characters.
  • Only perform percent-encoding where it is essential.
  • Always use uppercase A-through-F characters when percent-encoding.
  • Prevent dot-segments appearing in non-relative URI paths.
  • For schemes that define a default authority, use an empty authority if the default is desired.
  • For schemes that define an empty path to be equivalent to a path of "/", use "/".
  • For schemes that define a port, use an empty port if the default is desired
  • All portions of the URI must be utf-8 encoded NFC from Unicode strings

Inspired by Sam Ruby's urlnorm.py

Example:

$ pip install url-normalize
Collecting url-normalize
...
Successfully installed future-0.16.0 url-normalize-1.3.3
$ python
Python 3.6.1 (default, Jul  8 2017, 05:00:20)
[GCC 4.9.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
> from url_normalize import url_normalize
> print(url_normalize('www.foo.com:80/foo'))
> https://www.foo.com/foo

History:

  • 1.4.3: Added LICENSE file
  • 1.4.2: Added an optional param sort_query_params (True by default)
  • 1.4.1: Added an optional param default_scheme to the url_normalize ('https' by default)
  • 1.4.0: A bit of code refactoring and cleanup
  • 1.3.3: Support empty string and double slash urls (//domain.tld)
  • 1.3.2: Same code support both Python 3 and Python 2.
  • 1.3.1: Python 3 compatibility
  • 1.2.1: PEP8, setup.py
  • 1.1.2: support for shebang (#!) urls
  • 1.1.1: using 'http' schema by default when appropriate
  • 1.1.0: added handling of IDN domains
  • 1.0.0: code pep8
  • 0.1.0: forked from Sam Ruby's urlnorm.py

License: MIT License

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Files for url-normalize, version 1.4.3
Filename, size File type Python version Upload date Hashes
Filename, size url_normalize-1.4.3-py2.py3-none-any.whl (6.8 kB) File type Wheel Python version py2.py3 Upload date Hashes View
Filename, size url-normalize-1.4.3.tar.gz (6.0 kB) File type Source Python version None Upload date Hashes View

Supported by

AWS AWS Cloud computing Datadog Datadog Monitoring DigiCert DigiCert EV certificate Facebook / Instagram Facebook / Instagram PSF Sponsor Fastly Fastly CDN Google Google Object Storage and Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Salesforce Salesforce PSF Sponsor Sentry Sentry Error logging StatusPage StatusPage Status page