pywhatwgurl - Python WHATWG URL Parser
Pure Python implementation of the WHATWG URL Standard. The goal is a small, spec-faithful library for parsing, serializing, and manipulating URLs.
pywhatwgurl is listed as a complete Python implementation in the official WHATWG URL repository.
Status
- 100% WHATWG URL Standard conformance for core URL parsing
- Full URL and URLSearchParams API implementation
- See Conformance for IDNA limitations
Installation
Requires Python 3.10+.
pip install pywhatwgurl
Quickstart
from pywhatwgurl import URL
url = URL("https://user:pass@example.com:8080/path?q=1#frag")
url.hostname # 'example.com'
url.port # '8080'
url.pathname # '/path'
url.search # '?q=1'
url.hash # '#frag'
str(url) # 'https://user:pass@example.com:8080/path?q=1#frag'
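For readers coming from the standard library, the sketch below shows roughly how the getter values above relate to the fields of urllib.parse.urlsplit. It is only an illustration of the string shapes for this one simple input: urllib.parse does not implement the WHATWG parsing algorithm, and the whatwg_like dictionary is a hypothetical name introduced here, not part of pywhatwgurl. It also ignores rules such as dropping a scheme's default port.

```python
from urllib.parse import urlsplit

parts = urlsplit("https://user:pass@example.com:8080/path?q=1#frag")

# Map standard-library fields onto WHATWG-style getter strings.
# Assumption: a simple, already-normalized URL; no default-port handling.
whatwg_like = {
    "hostname": parts.hostname or "",
    # urlsplit exposes the port as an int; WHATWG getters return a string.
    "port": "" if parts.port is None else str(parts.port),
    "pathname": parts.path,
    # WHATWG getters keep the leading '?' and '#' when the part is non-empty.
    "search": f"?{parts.query}" if parts.query else "",
    "hash": f"#{parts.fragment}" if parts.fragment else "",
}

print(whatwg_like)
```

The main differences to keep in mind are that the port is a string rather than an integer, and that search and hash carry their delimiters.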
URLSearchParams works just like the browser API:
from pywhatwgurl import URLSearchParams
params = URLSearchParams("a=1&b=2&a=3")
params.get("a") # '1'
params.get_all("a") # ('1', '3')
params.set("b", "42")
str(params) # 'a=1&b=42&a=3'
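The get, get_all, and set behavior shown above follows the browser semantics for repeated keys: get returns the first match, and set overwrites the first match while dropping later ones. A minimal sketch of those semantics using only the standard library might look like the following. The helper names (get, get_all, set_value) are hypothetical and this is not how pywhatwgurl is implemented.

```python
from urllib.parse import parse_qsl, urlencode


def get(pairs, name):
    """Return the first value for name, or None if it is absent."""
    for key, value in pairs:
        if key == name:
            return value
    return None


def get_all(pairs, name):
    """Return every value for name, in order, as a tuple."""
    return tuple(value for key, value in pairs if key == name)


def set_value(pairs, name, value):
    """Overwrite the first pair named name, drop later ones, or append."""
    result = []
    replaced = False
    for key, old in pairs:
        if key != name:
            result.append((key, old))
        elif not replaced:
            result.append((key, value))
            replaced = True
    if not replaced:
        result.append((name, value))
    return result


pairs = parse_qsl("a=1&b=2&a=3", keep_blank_values=True)
first_a = get(pairs, "a")
all_a = get_all(pairs, "a")
serialized = urlencode(set_value(pairs, "b", "42"))
```

Keeping the pairs as an ordered list rather than a dict is what preserves duplicate keys and their relative order through serialization.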
For full API details, see the documentation.
Development
To set up a local development environment, use uv (recommended) or pip:
# Clone and install dev dependencies
uv sync --dev
# Run the test suite
uv run pytest
If you prefer pip:
python -m venv .venv && source .venv/bin/activate
pip install -e .
pip install pytest ruff mypy pre-commit interrogate pip-audit cyclonedx-bom
pytest
Conformance
This library targets 100% conformance with the WHATWG URL Standard for core URL parsing, validated against the official Web Platform Tests.
| Test Suite | Status |
|---|---|
| URL parsing (urltestdata) | ✅ 873/873 (100%) |
| URL setters | ✅ 274/274 (100%) |
| URL setters stripping | ✅ 144/144 (100%) |
| URLSearchParams | ✅ 139/139 (100%) |
| Percent-encoding | ✅ 7/7 (100%) |
| IDNA/ToASCII | ⚠️ xfail (see below) |
IDNA Limitations
IDNA tests are marked as expected failures (xfail) because the Python idna library follows stricter RFC 5891/5892 rules than the WHATWG URL Standard's lenient UTS46 processing.
Why not fix these?
- No WHATWG-compliant IDNA implementation exists for Python
- Even non-Python implementations with custom IDNA handling still fall short of full compliance
- Real-world domains work correctly — failures are obscure edge cases
For details, see tests/conformance/README.md.
Supply Chain Security
- SBOM: A Software Bill of Materials (CycloneDX JSON) is automatically generated for every release and attached to the GitHub Release.
- Audit: All runtime dependencies are scanned for known vulnerabilities using pip-audit in our CI pipeline.
Roadmap
- ✅ Implement URL parsing/serialization per WHATWG URL Standard
- ✅ Validate against the official URL test suite (100% conformance)
- Ship a minimal, typed API suitable for frameworks and tooling
WPT URL test data
To pull down the pinned WPT URL resources, use util/wpt_url_test_data.py:
python util/wpt_url_test_data.py download \
--dest_dir tests/conformance/data \
--commit <WPT_URL_COMMIT>
The script downloads the WPT URL JSON resources (including urltestdata, setters, percent-encoding, toascii, and IDNA fixtures), preserves comments, validates schemas, and writes a metadata file with the pinned commit. The scheduled workflow .github/workflows/fetch_test_data.yml runs the same script via the composite action and is keyed on the pinned commit; adjust WPT_URL_COMMIT/WPT_TEST_DATA_PATH in the workflow env if you need a different pin.
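To make the idea of a pinned download concrete, here is a rough sketch of two pieces such a script needs: building a commit-pinned URL for a WPT resource, and separating comment entries from test cases. It is not the code in util/wpt_url_test_data.py. The names resource_url and split_entries are hypothetical, and the sketch assumes the raw.githubusercontent.com layout and the urltestdata convention of mixing plain comment strings with test-case objects in one JSON array.

```python
import json

# Assumption: raw file access via raw.githubusercontent.com for the WPT repo.
RAW_BASE = "https://raw.githubusercontent.com/web-platform-tests/wpt"


def resource_url(commit: str, filename: str) -> str:
    """Build a URL for a file under url/resources/ at a pinned commit."""
    return f"{RAW_BASE}/{commit}/url/resources/{filename}"


def split_entries(entries):
    """Separate comment strings from test-case objects, keeping order."""
    comments = [entry for entry in entries if isinstance(entry, str)]
    cases = [entry for entry in entries if isinstance(entry, dict)]
    return comments, cases


# Tiny inline sample standing in for a downloaded fixture (not real WPT data).
sample = json.loads("""[
  "Comment: basic parsing",
  {"input": "http://example.com/", "base": null, "href": "http://example.com/"},
  {"input": "http://[", "base": null, "failure": true}
]""")

comments, cases = split_entries(sample)
```

Pinning the commit in the URL means the fixtures cannot change underneath the test suite until the pin is bumped deliberately.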
Updating the pinned WPT commit
- A helper workflow, .github/workflows/update_wpt_url.yml, can be triggered manually (or left to run on its weekly schedule) to fetch the latest url/ commit from WPT, bump all pins, refresh the fixtures, and open a PR.
- To bump manually without the bot:
  - Get the latest url/ commit (the URL is quoted so the shell does not interpret the &): NEW_COMMIT=$(curl -s "https://api.github.com/repos/web-platform-tests/wpt/commits?path=url&per_page=1" | jq -r '.[0].sha')
  - Set WPT_URL_COMMIT to that value in .github/workflows/main.yml and .github/workflows/fetch_test_data.yml, and update the defaults in .github/workflows/actions/fetch_wpt_url_test_data/action.yml and util/wpt_url_test_data.py to match.
  - Refresh fixtures: python util/wpt_url_test_data.py download --dest_dir tests/conformance/data --commit "$NEW_COMMIT"
  - Commit the workflow and data changes together.
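If curl and jq are not available, the commit lookup in the manual steps above can be approximated in Python. The sketch below covers only URL construction and response parsing, and runs against a made-up payload so that it works offline. The function names commits_url and latest_sha are hypothetical, and the SHA in the sample is a placeholder, not a real WPT commit.

```python
import json
from urllib.parse import urlencode

API = "https://api.github.com/repos/web-platform-tests/wpt/commits"


def commits_url(path: str = "url", per_page: int = 1) -> str:
    """Build the commits-listing URL, restricted to one repository path."""
    return f"{API}?{urlencode({'path': path, 'per_page': per_page})}"


def latest_sha(payload: str) -> str:
    """Return the SHA of the first (most recent) commit in the response."""
    commits = json.loads(payload)
    if not commits:
        raise ValueError("no commits returned for that path")
    return commits[0]["sha"]


# Placeholder payload mimicking the shape of the API response.
sample_payload = '[{"sha": "0123abcd", "commit": {"message": "example"}}]'
new_commit = latest_sha(sample_payload)
```

In real use, the payload would come from something like urllib.request.urlopen(commits_url()).read(), and the resulting value would be passed to the --commit option of the download script.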
Building
To build the package distribution (wheel and sdist):
uv build
The artifacts will be generated in the dist/ directory.
Versioning is dynamic and derived from Git tags (e.g., 0.1.0 or 0.1.dev1+...).
Documentation
To build and serve the documentation locally:
# Install docs dependencies
uv sync --group docs
# Serve locally (with live reload)
uv run mkdocs serve
# Build static site
uv run mkdocs build
Documentation is automatically deployed to GitHub Pages when changes are pushed to master.