Comprehensive license normalisation with a three-level hierarchy.
license-normaliser is a comprehensive license normalisation library that maps any license representation (SPDX tokens, URLs, prose descriptions) to a canonical three-level hierarchy.
Features
Three-level hierarchy - LicenseFamily → LicenseName → LicenseVersion.
Wide format support - SPDX tokens, URLs, prose descriptions.
Creative Commons support - Full CC family with versions and IGO variants.
Publisher-specific licenses - Springer, Nature, Elsevier, Wiley, ACS, and more.
File-driven data - Add aliases, URLs, and patterns by editing JSON files. No Python code changes required for new synonyms.
Pluggable data sources - Drop in a new DataSource class to ingest any external license registry automatically.
Strict mode - Raise LicenseNotFoundError instead of silently returning "unknown".
Caching - LRU caching for performance.
CLI - Command-line interface with --strict support.
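The "file-driven data" idea above can be illustrated with a small, self-contained sketch. The JSON layout, the `load_aliases` and `resolve` helpers, and the whitespace-collapsing rule are all assumptions made for illustration; the library's actual data files and cleaning logic may look quite different.

```python
import json

# Hypothetical alias file contents; the library's real JSON schema may differ.
ALIASES_JSON = """
{
  "cc by 4.0": "cc-by-4.0",
  "creative commons attribution 4.0": "cc-by-4.0",
  "mit license": "mit"
}
"""


def load_aliases(text: str) -> dict[str, str]:
    """Parse an alias mapping and lowercase its keys for case-insensitive lookup."""
    return {alias.lower(): key for alias, key in json.loads(text).items()}


def resolve(raw: str, aliases: dict[str, str]) -> str:
    """Return the canonical key for ``raw``, or "unknown" if no alias matches."""
    cleaned = " ".join(raw.lower().split())  # ignore case, collapse whitespace
    return aliases.get(cleaned, "unknown")


aliases = load_aliases(ALIASES_JSON)
```

In a design like this, supporting a new synonym means adding one line to the JSON data, while `resolve` stays unchanged.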
Hierarchy
The library uses a three-level hierarchy:
LicenseFamily - broad bucket: "cc", "osi", "copyleft", "publisher-tdm", …
LicenseName - version-free: "cc-by", "cc-by-nc-nd", "mit", "wiley-tdm"
LicenseVersion - fully resolved: "cc-by-3.0", "cc-by-nc-nd-4.0"
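As a rough mental model of the three levels, the sketch below links each level to its parent with plain dataclasses. The classes `Family`, `Name`, and `Version` are hypothetical stand-ins, not the library's own `LicenseFamily`, `LicenseName`, and `LicenseVersion` types, whose fields may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Family:
    key: str  # broad bucket, e.g. "cc"

    def __str__(self) -> str:
        return self.key


@dataclass(frozen=True)
class Name:
    key: str  # version-free, e.g. "cc-by-nc-nd"
    family: Family

    def __str__(self) -> str:
        return self.key


@dataclass(frozen=True)
class Version:
    key: str  # fully resolved, e.g. "cc-by-nc-nd-4.0"
    license: Name

    def __str__(self) -> str:
        return self.key


cc = Family("cc")
cc_by_nc_nd = Name("cc-by-nc-nd", cc)
cc_by_nc_nd_4 = Version("cc-by-nc-nd-4.0", cc_by_nc_nd)

# Walking up from the most specific level reaches the broader ones.
chain = [
    str(cc_by_nc_nd_4),
    str(cc_by_nc_nd_4.license),
    str(cc_by_nc_nd_4.license.family),
]
```

Each level holds a reference to the one above it, so the most specific result is enough to answer questions at any granularity.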
Installation
With uv:
uv pip install license-normaliser
Or with pip:
pip install license-normaliser
Quick start
from license_normaliser import normalise_license
v = normalise_license("CC BY-NC-ND 4.0")
str(v) # "cc-by-nc-nd-4.0" ← LicenseVersion
str(v.license) # "cc-by-nc-nd" ← LicenseName
str(v.license.family) # "cc" ← LicenseFamily
Strict mode
By default, unresolvable inputs return an "unknown" result. Pass strict=True to raise LicenseNotFoundError instead:
from license_normaliser import normalise_license
from license_normaliser.exceptions import LicenseNotFoundError
# Silent fallback (default)
v = normalise_license("some-unknown-string")
v.family.key # "unknown"
# Strict: raises on unresolvable input
try:
    v = normalise_license("some-unknown-string", strict=True)
except LicenseNotFoundError as exc:
    print(exc.raw)      # original input
    print(exc.cleaned)  # cleaned form that failed lookup
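To make the difference between the two modes concrete, here is a toy model of a lookup with a silent fallback and a strict variant. `NotFoundError`, `KNOWN`, and `normalise` are hypothetical names for this sketch only; the one detail taken from the text above is that the error carries both the raw input and its cleaned form.

```python
class NotFoundError(LookupError):
    """Stand-in for a strict-mode error carrying both forms of the input."""

    def __init__(self, raw: str, cleaned: str) -> None:
        super().__init__(f"no license matches {raw!r} (cleaned: {cleaned!r})")
        self.raw = raw
        self.cleaned = cleaned


# Toy lookup table; a real registry would be far larger.
KNOWN = {"mit": "mit", "apache-2.0": "apache-2.0"}


def normalise(raw: str, strict: bool = False) -> str:
    """Resolve ``raw`` to a key, falling back to "unknown" unless strict."""
    cleaned = raw.strip().lower()
    if cleaned in KNOWN:
        return KNOWN[cleaned]
    if strict:
        raise NotFoundError(raw, cleaned)
    return "unknown"
```

Keeping both `raw` and `cleaned` on the exception lets a caller log exactly what was received and what the lookup actually tried.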
Batch normalisation
from license_normaliser import normalise_licenses
results = normalise_licenses(["MIT", "Apache-2.0", "CC BY 4.0"])
for r in results:
    print(r.key)
# Strict batch - raises on first unresolvable
results = normalise_licenses(["MIT", "Apache-2.0"], strict=True)
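Because a strict batch stops at the first unresolvable input, it can be useful to process items one at a time and collect every failure instead. The helper below is a sketch of that pattern; `normalise_collecting` is a hypothetical name, and the single-item function and error type are passed in as parameters so the sketch makes no assumptions about the library's internals.

```python
from typing import Callable, Iterable


def normalise_collecting(
    items: Iterable[str],
    normalise_one: Callable[[str], object],
    error_type: type[Exception],
) -> tuple[dict[str, object], list[str]]:
    """Normalise each item separately, returning (resolved, failed)."""
    resolved: dict[str, object] = {}
    failed: list[str] = []
    for item in items:
        try:
            resolved[item] = normalise_one(item)
        except error_type:
            # Record the failure and keep going instead of aborting the batch.
            failed.append(item)
    return resolved, failed
```

With the library installed, one would presumably pass `lambda s: normalise_license(s, strict=True)` as `normalise_one` and `LicenseNotFoundError` as `error_type`.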
Update data sources (CLI)
license-normaliser update-data --force
# Fetches fresh SPDX + OpenDefinition JSONs into src/license_normaliser/data/
Integration tests (public API only)
All integration tests live in src/license_normaliser/tests/test_integration.py and only import the public API.
CLI usage
Normalise a single license:
license-normaliser normalise "MIT"
# Output: mit
license-normaliser normalise --full "CC BY 4.0"
# Output:
# Key: cc-by-4.0
# URL: https://creativecommons.org/licenses/by/4.0/
# License: cc-by
# Family: cc
license-normaliser normalise --strict "totally-unknown"
# Exits with code 1 and prints an error
Batch normalise:
license-normaliser batch MIT "Apache-2.0" "CC BY 4.0"
license-normaliser batch --strict MIT "Apache-2.0"
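The non-zero exit code of `--strict` makes the CLI usable from scripts. The wrapper below is a minimal sketch of calling it from Python with `subprocess`; `normalise_via_cli` is a hypothetical helper, and it assumes that the `normalise` subcommand prints only the key on standard output and that `--strict` exits non-zero on unknown input, as described above.

```python
import subprocess
import sys
from typing import Optional, Sequence


def normalise_via_cli(
    raw: str,
    command: Sequence[str] = ("license-normaliser", "normalise", "--strict"),
) -> Optional[str]:
    """Run the CLI on ``raw``; return its output, or None on a non-zero exit."""
    proc = subprocess.run(
        [*command, raw], capture_output=True, text=True, check=False
    )
    if proc.returncode != 0:
        print(f"could not normalise {raw!r}", file=sys.stderr)
        return None
    return proc.stdout.strip()
```

If the executable is not on `PATH`, `subprocess.run` raises `FileNotFoundError`, which this sketch deliberately leaves to the caller.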
Exceptions
from license_normaliser.exceptions import (
    LicenseNormaliserError,  # base class
    LicenseNotFoundError,    # raised by strict mode
)
Testing
All tests run inside Docker:
make test
To test a specific Python version:
make test-env ENV=py312
License
MIT