Comprehensive license normalization with a three-level hierarchy.
license-normaliser is a comprehensive license normalization library that maps any license representation (SPDX tokens, URLs, prose descriptions) to a canonical three-level hierarchy.
Features
Three-level hierarchy - LicenseFamily → LicenseName → LicenseVersion
Wide format support - SPDX tokens, URLs, prose descriptions
Creative Commons support - Full CC family with versions and IGO variants
Publisher-specific licenses - Elsevier, Wiley, Springer, ACS, and more
Caching - LRU caching for performance
CLI - Command-line interface for quick normalization
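As a rough illustration of the caching feature, the sketch below wraps a stand-in normaliser with the standard library's functools.lru_cache. The function cached_normalise and its body are hypothetical and only mimic the shape of the idea; the library's internal cache may be set up differently.

```python
from functools import lru_cache


@lru_cache(maxsize=1024)
def cached_normalise(raw: str) -> str:
    # Hypothetical stand-in for an expensive lookup: lowercase the input
    # and join its whitespace-separated parts with hyphens.
    return "-".join(raw.strip().lower().split())


first = cached_normalise("CC BY 4.0")   # computed and stored
second = cached_normalise("CC BY 4.0")  # served from the cache
info = cached_normalise.cache_info()
print(first, info.hits, info.misses)
```

Repeated inputs are common when normalising large metadata dumps, which is where an LRU cache tends to pay off.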
Hierarchy
The library uses a three-level hierarchy:
LicenseFamily - broad bucket: "cc", "osi", "copyleft", "publisher-tdm", …
LicenseName - version-free: "cc-by", "cc-by-nc-nd", "mit", "wiley-tdm"
LicenseVersion - fully resolved: "cc-by-3.0", "cc-by-nc-nd-4.0"
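One way to picture the three levels is as nested immutable records, each pointing at its parent. This is a minimal sketch, not the library's actual classes: the names Family, Name and Version and the key field are assumptions chosen for illustration, while the license and family attribute names follow the Quick start.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Family:
    key: str  # broad bucket, e.g. "cc"

    def __str__(self) -> str:
        return self.key


@dataclass(frozen=True)
class Name:
    key: str  # version-free, e.g. "cc-by-nc-nd"
    family: Family

    def __str__(self) -> str:
        return self.key


@dataclass(frozen=True)
class Version:
    key: str  # fully resolved, e.g. "cc-by-nc-nd-4.0"
    license: Name

    def __str__(self) -> str:
        return self.key


cc = Family("cc")
by_nc_nd = Name("cc-by-nc-nd", cc)
v4 = Version("cc-by-nc-nd-4.0", by_nc_nd)

# Walking upwards from the most specific level to the broadest one.
print(v4, v4.license, v4.license.family)
```

Keeping the parent reference on each level means a caller holding a version can always reach the name and family without a second lookup.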
Installation
With uv:
uv pip install license-normaliser
Or with pip:
pip install license-normaliser
Quick start
from license_normaliser import normalise_license
v = normalise_license("CC BY-NC-ND 4.0")
str(v) # "cc-by-nc-nd-4.0" ← LicenseVersion
str(v.license) # "cc-by-nc-nd" ← LicenseName
str(v.license.family) # "cc" ← LicenseFamily
Resolution pipeline (first match wins)
Direct registry lookup (cleaned lowercase key)
Alias table (prose variants, SPDX tokens, mixed-case short-forms)
Exact URL map (http/https, trailing-slash normalised, fragment-aware)
Structural CC URL regex (any creativecommons.org URL not in the map)
Prose keyword scan (full sentences from license documents)
Fallback (the key is the cleaned input string; every other field is unknown/None)
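The first-match-wins order above can be sketched as a list of resolver functions tried in sequence. Everything here is an assumption made for illustration: the tiny lookup tables, the helper names (clean, url_key, resolve) and the cleaning rules are hypothetical and much simpler than whatever the library really does.

```python
import re
from typing import Callable, List, Optional
from urllib.parse import urlsplit

# Hypothetical, deliberately tiny tables.
REGISTRY = {"mit": "mit", "cc-by-4.0": "cc-by-4.0"}
ALIASES = {"cc by 4.0": "cc-by-4.0", "the mit license": "mit"}
URL_MAP = {"creativecommons.org/licenses/by/4.0": "cc-by-4.0"}
CC_URL = re.compile(r"creativecommons\.org/licenses/([a-z-]+)/(\d+\.\d+)")


def clean(raw: str) -> str:
    # Lowercase, trim, and collapse internal whitespace.
    return " ".join(raw.strip().lower().split())


def url_key(raw: str) -> str:
    # Drop the scheme (http/https) and any trailing slash.
    parts = urlsplit(raw.strip().lower())
    return (parts.netloc + parts.path).rstrip("/")


def by_registry(raw: str) -> Optional[str]:
    return REGISTRY.get(clean(raw))


def by_alias(raw: str) -> Optional[str]:
    return ALIASES.get(clean(raw))


def by_url(raw: str) -> Optional[str]:
    return URL_MAP.get(url_key(raw))


def by_cc_pattern(raw: str) -> Optional[str]:
    m = CC_URL.search(raw.lower())
    return f"cc-{m.group(1)}-{m.group(2)}" if m else None


def by_prose(raw: str) -> Optional[str]:
    # A single keyword rule stands in for a full prose scan.
    if "creative commons attribution 4.0" in clean(raw):
        return "cc-by-4.0"
    return None


STEPS: List[Callable[[str], Optional[str]]] = [
    by_registry, by_alias, by_url, by_cc_pattern, by_prose,
]


def resolve(raw: str) -> str:
    for step in STEPS:
        key = step(raw)
        if key is not None:
            return key  # first match wins
    return clean(raw)  # fallback: the cleaned string becomes the key


print(resolve("https://creativecommons.org/licenses/by-nc-nd/3.0/"))
```

Ordering the cheap exact lookups before the regex and the prose scan keeps common inputs fast and leaves the fuzzier rules for strings nothing else recognises.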
CLI usage
Normalize a single license:
license-normaliser normalise "MIT"
# Output: mit
license-normaliser normalise --full "CC BY 4.0"
# Output:
# Key: cc-by-4.0
# URL: https://creativecommons.org/licenses/by/4.0/
# License: cc-by
# Family: cc
Batch normalize multiple licenses:
license-normaliser batch MIT "Apache-2.0" "CC BY 4.0"
# Output:
# MIT: mit
# Apache-2.0: apache-2.0
# CC BY 4.0: cc-by-4.0
Testing
All tests run inside Docker to prevent accidental side effects:
make test
To test a specific Python version:
make test-env ENV=py312
License
MIT
Support
For issues, go to GitHub.
File details
Details for the file license_normaliser-0.1.tar.gz.
File metadata
- Download URL: license_normaliser-0.1.tar.gz
- Upload date:
- Size: 20.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 884dcc73a947db50d22136a770e08213f7d0c5d9d14dd1d126946db87b99ae33 |
| MD5 | bf6ce074e6fdf9216ad60d7e877638a1 |
| BLAKE2b-256 | 51f0f11ebf802ef4d029f2191d082dbb5c4e2379b820898db9c8b75d45794baf |
File details
Details for the file license_normaliser-0.1-py3-none-any.whl.
File metadata
- Download URL: license_normaliser-0.1-py3-none-any.whl
- Upload date:
- Size: 20.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 41721b75d0e8542eb3584807f897bc10475f0c8bd596ec908f83472acc9319b1 |
| MD5 | 8312c817c7c6c20573568cd01b38255a |
| BLAKE2b-256 | 263317b591cd4a01e84424ce07aadab6e83165222f5c4989ee3b5699cc8b6159 |
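A downloaded file can be checked against the published SHA256 digest with the standard library's hashlib. The helper names sha256_of and matches are hypothetical, and the sketch assumes the file has already been saved locally.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: str, expected_hex: str) -> bool:
    """Compare the computed digest with a published one, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, calling matches("license_normaliser-0.1-py3-none-any.whl", "41721b75d0e8542eb3584807f897bc10475f0c8bd596ec908f83472acc9319b1") should return True for an intact copy of the wheel, assuming it sits in the current directory.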