Removes clutter from URLs and returns a canonicalized version

Project description

cleanurl

Install

pip install cleanurl

or if you're using poetry:

poetry add cleanurl

Usage

By default, cleanurl returns a cleaned URL without respecting the URL's semantics. For example:

>>> import cleanurl
>>> r = cleanurl.cleanurl('https://www.xojoc.pw/blog/focus.html?utm_content=buffercf3b2&utm_medium=social&utm_source=snapchat.com&utm_campaign=buffe')
>>> r.url
'https://xojoc.pw/blog/focus'
>>> r.parsed_url
ParseResult(scheme='https', netloc='xojoc.pw', path='/blog/focus', params='', query='', fragment='')

The default parameters are useful when you want a canonical URL and do not care whether the resulting URL is still valid.
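
To make the kind of cleanup in the example above concrete, here is a minimal standard-library sketch. It is not cleanurl's implementation: the function name `strip_clutter` and its three rules (drop a leading `www.`, drop a trailing `.html`, drop `utm_*` query parameters) are assumptions inferred only from the example output.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_clutter(url: str) -> str:
    """Illustrative only: mimics the default-mode example, not cleanurl's real rules."""
    parts = urlsplit(url)

    # Hypothetical rule: drop a leading "www." from the host.
    host = parts.netloc
    if host.startswith("www."):
        host = host[len("www."):]

    # Hypothetical rule: drop a trailing ".html". The shortened path may not
    # resolve on the server, which is why this step does not respect semantics.
    path = parts.path
    if path.endswith(".html"):
        path = path[: -len(".html")]

    # Keep only the query parameters that are not utm_* tracking tags.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.startswith("utm_")
    ]

    return urlunsplit((parts.scheme, host, path, urlencode(kept), parts.fragment))


print(strip_clutter("https://www.xojoc.pw/blog/focus.html?utm_medium=social&utm_source=snapchat.com"))
```

Removing the extension is what makes the result canonical but possibly invalid: two spellings of the same page collapse to one string, yet the server may not serve the shortened path.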

If you want a clean URL that is still valid, pass respect_semantics=True:

>>> r = cleanurl.cleanurl('https://www.xojoc.pw/blog/////focus.html', respect_semantics=True)
>>> r.url
'https://xojoc.pw/blog/focus.html'
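
As a rough illustration of a more conservative cleanup, the sketch below collapses repeated slashes and drops a leading `www.` but leaves the file extension alone, matching the example output above. The function name `collapse_slashes` is hypothetical, and treating the `www.` removal as safe is an assumption taken from the example, not a documented guarantee of cleanurl.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def collapse_slashes(url: str) -> str:
    """Illustrative only: a conservative cleanup that keeps the path's extension."""
    parts = urlsplit(url)

    # Assumption taken from the example: the host without "www." serves the same page.
    host = parts.netloc
    if host.startswith("www."):
        host = host[len("www."):]

    # Collapse runs of two or more slashes in the path into a single slash.
    path = re.sub(r"/{2,}", "/", parts.path)

    # Query and fragment are passed through unchanged.
    return urlunsplit((parts.scheme, host, path, parts.query, parts.fragment))


print(collapse_slashes("https://www.xojoc.pw/blog/////focus.html"))
```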

For more examples, see the unit tests.

Why?

While there are some libraries that handle the general cases, this library has website-specific rules that normalize URLs more aggressively.

Users

Initially used for discu.eu.

Who?

cleanurl was written by Alexandru Cojocaru.

License

cleanurl is Free Software, released under the AGPLv3.

Download files

Source Distribution

cleanurl-0.1.3.tar.gz (17.1 kB)

Uploaded Source

Built Distribution

cleanurl-0.1.3-py3-none-any.whl (17.1 kB)

Uploaded Python 3

File details

Details for the file cleanurl-0.1.3.tar.gz.

File metadata

  • Download URL: cleanurl-0.1.3.tar.gz
  • Upload date:
  • Size: 17.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.12 CPython/3.9.2 Linux/5.10.0-10-amd64

File hashes

Hashes for cleanurl-0.1.3.tar.gz
Algorithm Hash digest
SHA256 fd372d15c615e054a45f58b68b003e36a03d7d45be91a1f2cc5cdfb61ddf7477
MD5 db731f24fb8e6cc06c81c522b2a68303
BLAKE2b-256 0f4e7db5ce76b2437509cd407537a2e538828ffdf27e74b99e5ddebc6214ebd3

File details

Details for the file cleanurl-0.1.3-py3-none-any.whl.

File metadata

  • Download URL: cleanurl-0.1.3-py3-none-any.whl
  • Upload date:
  • Size: 17.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.12 CPython/3.9.2 Linux/5.10.0-10-amd64

File hashes

Hashes for cleanurl-0.1.3-py3-none-any.whl
Algorithm Hash digest
SHA256 8aa3d051eb66cbd315712d9c52e8e921ab5b9b87a0ce262ca58cfe0797210948
MD5 0fbd5c6eefebbdeb8a1ae7fa66a1c48d
BLAKE2b-256 3b3bbb6614ff3ad2dca92443ae967c375cb0624c52d254bf0c38b188ee887d3a
