cleanurl

Removes clutter from URLs and returns a canonicalized version

Install

pip install cleanurl

or if you're using poetry:

poetry add cleanurl

Usage

By default, cleanurl returns a cleaned URL without respecting the original URL's semantics. For example:

>>> import cleanurl
>>> r = cleanurl.cleanurl('https://www.xojoc.pw/blog/focus.html?utm_content=buffercf3b2&utm_medium=social&utm_source=snapchat.com&utm_campaign=buffe')
>>> r.url
'https://xojoc.pw/blog/focus'
>>> r.parsed_url
ParseResult(scheme='https', netloc='xojoc.pw', path='/blog/focus', params='', query='', fragment='')

The default parameters are useful if you want a canonical URL and do not care whether the resulting URL is still valid.

If you want a clean URL that is still valid, call it like this:
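The parsed_url value printed above has the shape of the standard library's urllib.parse.ParseResult. Assuming it is that type, its fields can be read as in this minimal sketch, which uses only the standard library and does not call cleanurl:

```python
from urllib.parse import urlparse

# Build a ParseResult from the cleaned URL shown in the example above.
parsed = urlparse("https://xojoc.pw/blog/focus")

host = parsed.netloc       # the host part of the URL
path = parsed.path         # the path part of the URL
rebuilt = parsed.geturl()  # reassemble the URL string from its parts

print(host, path, rebuilt)
```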

>>> r = cleanurl.cleanurl('https://www.xojoc.pw/blog/////focus.html', respect_semantics=True)
>>> r.url
'https://xojoc.pw/blog/focus.html'

For more examples, see the unit tests.
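To make the difference between the two modes concrete, here is a rough standard-library sketch. It is not cleanurl's implementation: the function name sketch_clean and its three rules (drop "www.", drop utm_* parameters, drop ".html") are assumptions chosen only to reproduce the two outputs shown above.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def sketch_clean(url: str, respect_semantics: bool = False) -> str:
    """Hypothetical stand-in for illustration; not cleanurl's real code."""
    parts = urlsplit(url)

    # Lowercase the host and drop a leading "www." (applied in both modes,
    # mirroring the outputs shown in the examples above).
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[len("www."):]

    # Collapse runs of slashes in the path into a single slash.
    path = re.sub(r"/{2,}", "/", parts.path)

    # Drop utm_* tracking parameters and keep the rest in their original order.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith("utm_")
    ]
    query = urlencode(kept)

    if not respect_semantics and path.endswith(".html"):
        # Aggressive step: the shortened URL may no longer resolve.
        path = path[: -len(".html")]

    return urlunsplit((parts.scheme, host, path, query, parts.fragment))


print(sketch_clean("https://www.xojoc.pw/blog/focus.html?utm_medium=social"))
print(sketch_clean("https://www.xojoc.pw/blog/////focus.html", respect_semantics=True))
```

The real library applies many more rules, including website-specific ones, so treat this only as an illustration of the trade-off between the two modes.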

Why?

While some libraries handle the general cases, this library has website-specific rules that normalize URLs more aggressively.

Users

Initially used for discu.eu.

Who?

cleanurl was written by Alexandru Cojocaru.

License

cleanurl is Free Software, released under the AGPLv3.
