Python implementation of the WHATWG URL Living Standard

whatwg-url

Python implementation of the WHATWG URL Living Standard.

The latest revision of the standard that this package implements is dated August 7th, 2018 (commit 49060c7).

Getting Started

Install the whatwg-url package using pip.

python -m pip install whatwg-url

And use the module like so:

import whatwg_url

url = whatwg_url.parse_url("https://www.google.com")
print(url)
# Url(scheme='https', hostname='www.google.com', port=None, path='', query='', fragment='')

Features

Compatibility with urllib.parse.urlparse()

import whatwg_url

parseresult = whatwg_url.urlparse("https://seth:larson@www.google.com:1234/maps?query=string#fragment")

print(parseresult.scheme)  # 'https'
print(parseresult.netloc)  # 'www.google.com:1234'
print(parseresult.userinfo)  # 'seth:larson'
print(parseresult.path)  # '/maps'
print(parseresult.params)  # ''
print(parseresult.query)  # 'query=string'
print(parseresult.fragment)  # 'fragment'
print(parseresult.username)  # 'seth'
print(parseresult.password)  # 'larson'
print(parseresult.hostname)  # 'www.google.com'
print(parseresult.port)  # 1234
print(parseresult.geturl())  # 'https://seth:larson@www.google.com:1234/maps?query=string#fragment'
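
Because the attribute names match those of the standard library's result object, one way to adopt the module gradually is an import fallback. This is a minimal sketch of that idea, not a pattern the project documents. It reads only attributes whose values are the same in both implementations for this URL.

```python
# Prefer the WHATWG-based parser when it is installed, otherwise fall back
# to the standard library. The fallback is an assumption for illustration.
try:
    from whatwg_url import urlparse
except ImportError:
    from urllib.parse import urlparse

result = urlparse("https://seth:larson@www.google.com:1234/maps?query=string#fragment")

# These attributes are available on both result objects.
summary = (result.scheme, result.username, result.hostname, result.port, result.path)
print(summary)
```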

URL Normalization

The WHATWG URL specification describes methods of normalizing URL inputs to usable URLs. It handles percent-encodings, default ports, paths, IPv4 and IPv6 addresses, IDNA (2008 and 2003), multiple slashes after scheme, etc.

import whatwg_url

print(whatwg_url.normalize_url("https://////www.google.com"))  # https://www.google.com
print(whatwg_url.normalize_url("https://www.google.com/dir1/../dir2"))  # https://www.google.com/dir2
print(whatwg_url.normalize_url("https://你好你好"))  # https://xn--6qqa088eba/
print(whatwg_url.normalize_url("https://0Xc0.0250.01"))  # https://192.168.0.1/
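
The last example above works because the standard lets each IPv4 component be written in decimal, octal (leading 0), or hexadecimal (leading 0x), and lets the final component fill all remaining bytes. The following is a simplified standalone sketch of that rule using only the standard library. The function names are hypothetical and this is not the module's own implementation. Unlike the standard, it raises an error for non-numeric input instead of falling back to treating it as a domain name.

```python
def parse_ipv4_number(part):
    """Parse one component as decimal, octal (leading 0) or hex (leading 0x)."""
    if part == "":
        raise ValueError("empty IPv4 component")
    radix = 10
    if part[:2].lower() == "0x":
        part, radix = part[2:], 16
    elif len(part) >= 2 and part[0] == "0":
        part, radix = part[1:], 8
    if part == "":
        return 0  # a bare "0x" prefix counts as zero
    allowed = "0123456789abcdef"[:radix]
    if any(ch not in allowed for ch in part.lower()):
        raise ValueError("invalid digit in IPv4 component")
    return int(part, radix)


def normalize_ipv4(host):
    """Return the dotted-decimal form of an IPv4 host written in mixed radixes."""
    parts = host.split(".")
    if len(parts) > 1 and parts[-1] == "":
        parts.pop()  # a single trailing dot is tolerated
    if not 1 <= len(parts) <= 4:
        raise ValueError("expected between one and four components")
    numbers = [parse_ipv4_number(p) for p in parts]
    # Every component except the last must fit in one byte.
    if any(n > 255 for n in numbers[:-1]):
        raise ValueError("IPv4 component out of range")
    # The last component fills all the bytes that remain.
    if numbers[-1] >= 256 ** (5 - len(numbers)):
        raise ValueError("last IPv4 component out of range")
    value = numbers[-1]
    for index, number in enumerate(numbers[:-1]):
        value += number * 256 ** (3 - index)
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


print(normalize_ipv4("0Xc0.0250.01"))  # 192.168.0.1
```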

URL Validation

print(whatwg_url.is_valid_url("https://www.google.com"))  # True
print(whatwg_url.is_valid_url("https://www .google.com"))  # False

Relative URLs

HTTP redirects often contain relative URLs (via the Location header) that must be resolved against the current URL. Passing the base parameter lets you give a relative URL as input; it is resolved against the base and returned as a new URL object.

import whatwg_url

url = whatwg_url.parse_url("../dev?a=1#f", base="https://www.google.com/maps")
print(url.href)  # https://www.google.com/dev?a=1#f
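
For comparison, the standard library's urllib.parse.urljoin resolves the same reference to the same result. The sketch below applies it to a redirect target. The helper name resolve_redirect is hypothetical and not part of this module.

```python
from urllib.parse import urljoin


def resolve_redirect(current_url, location):
    """Resolve a Location header value against the URL that returned it."""
    return urljoin(current_url, location)


resolved = resolve_redirect("https://www.google.com/maps", "../dev?a=1#f")
print(resolved)  # https://www.google.com/dev?a=1#f
```

An absolute Location value is returned unchanged, so the same helper covers both kinds of redirect.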

URL Property Mutators

Modifying a property on a URL object runs the parser with a "state override" so that the URL object is mutated according to the standard.

url = whatwg_url.parse_url("http://www.google.com:443")

print(url.scheme)  # 'http'
print(url.port)  # 443

url.scheme = 'https'

print(url.scheme)  # 'https'
print(url.port)  # None
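
The port becomes None in the example above because the standard stores a port as null when it equals the default port of the URL's scheme. The sketch below illustrates that rule on its own. The table covers only some of the special schemes, the function name is hypothetical, and this is not the module's internal code.

```python
# Default ports for some of the special schemes defined by the standard.
DEFAULT_PORTS = {"ftp": 21, "http": 80, "https": 443, "ws": 80, "wss": 443}


def port_after_scheme_change(new_scheme, port):
    """Return the port to keep once the scheme has been changed."""
    if port is not None and DEFAULT_PORTS.get(new_scheme) == port:
        return None  # the default port is implied, so it is dropped
    return port


print(port_after_scheme_change("https", 443))  # None
print(port_after_scheme_change("http", 443))   # 443
```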

"Splatable"

The module is a single file which allows for easy vendoring into projects.

License

Apache-2.0

Changelog

2018.8.26

Added

  • Added UrlParser and Url
  • Added UrlParser.parse_host()
  • Added UrlParser.parse_ipv4_host()
  • Added Url.origin
  • Added Url.authority
  • Added urlparse and urljoin to be compatible with urllib.parse.urlparse and urllib.parse.urljoin
  • Added support for Python 2.7, 3.4, and 3.5
