Comprehensive license normalisation with a three-level hierarchy.



license-normaliser is a comprehensive license normalisation library that maps any license representation (SPDX tokens, URLs, prose descriptions) to a canonical three-level hierarchy.

Features

  • Three-level hierarchy - LicenseFamily → LicenseName → LicenseVersion.

  • Wide format support - SPDX tokens, URLs, prose descriptions.

  • Creative Commons support - Full CC family with versions and IGO variants.

  • Publisher-specific licenses - Springer, Nature, Elsevier, Wiley, ACS, and more.

  • Caching - LRU caching for performance.

  • CLI - Command-line interface for quick normalisation.
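
As a rough illustration of the caching feature listed above, the sketch below wraps a stand-in normaliser with the standard library's functools.lru_cache. The function name normalise_cached, the cache size, and the simplified cleaning rule are assumptions made for this example; they are not taken from the library's source.

```python
from functools import lru_cache


@lru_cache(maxsize=1024)  # cache size chosen arbitrarily for the sketch
def normalise_cached(raw: str) -> str:
    """Hypothetical stand-in for an expensive normalisation call."""
    # Lowercase, collapse whitespace, and join the parts with hyphens.
    return "-".join(raw.strip().lower().split())


first = normalise_cached("CC BY 4.0")   # computed, stored in the cache
second = normalise_cached("CC BY 4.0")  # served from the cache

info = normalise_cached.cache_info()
print(first, info.hits, info.misses)  # cc-by-4.0 1 1
```

Repeated inputs are common when normalising license fields across a large corpus, which is where an LRU cache of this kind tends to pay off.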

Hierarchy

The library uses a three-level hierarchy:

  1. LicenseFamily - broad bucket: "cc", "osi", "copyleft", "publisher-tdm", …

  2. LicenseName - version-free: "cc-by", "cc-by-nc-nd", "mit", "wiley-tdm"

  3. LicenseVersion - fully resolved: "cc-by-3.0", "cc-by-nc-nd-4.0"
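
One way to picture the three levels is as nested, immutable records in which each level points to its parent. The sketch below is illustrative only: the class names follow the hierarchy above and the license and family attributes mirror the Quick start, but the key field and the constructors are assumptions, not the library's actual definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LicenseFamily:
    key: str  # hypothetical field, e.g. "cc"

    def __str__(self) -> str:
        return self.key


@dataclass(frozen=True)
class LicenseName:
    key: str  # version-free, e.g. "cc-by"
    family: LicenseFamily

    def __str__(self) -> str:
        return self.key


@dataclass(frozen=True)
class LicenseVersion:
    key: str  # fully resolved, e.g. "cc-by-3.0"
    license: LicenseName

    def __str__(self) -> str:
        return self.key


cc = LicenseFamily("cc")
cc_by = LicenseName("cc-by", cc)
cc_by_3 = LicenseVersion("cc-by-3.0", cc_by)

# Walk from the most specific level up to the broadest one.
print(cc_by_3, cc_by_3.license, cc_by_3.license.family)  # cc-by-3.0 cc-by cc
```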

Installation

With uv:

uv pip install license-normaliser

Or with pip:

pip install license-normaliser

Quick start

from license_normaliser import normalise_license

v = normalise_license("CC BY-NC-ND 4.0")
str(v)                 # "cc-by-nc-nd-4.0"  ← LicenseVersion
str(v.license)         # "cc-by-nc-nd"      ← LicenseName
str(v.license.family)  # "cc"               ← LicenseFamily

Resolution pipeline (first match wins)

  1. Direct registry lookup (cleaned lowercase key)

  2. Alias table (prose variants, SPDX tokens, mixed-case short-forms)

  3. Exact URL map (http/https, trailing-slash normalised, fragment-aware)

  4. Structural CC URL regex (any creativecommons.org URL not in the map)

  5. Prose keyword scan (full sentences from license documents)

  6. Fallback (key = cleaned string, everything else unknown/None)
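
The first-match-wins order above can be sketched as a list of resolver functions tried in sequence. Everything in this example is hypothetical: the tiny lookup tables, the helper names, and the regex are stand-ins chosen for illustration, and the sketch ignores URL fragments, which the real exact URL map is described as handling.

```python
import re
from typing import Callable, List, Optional

# Hypothetical, deliberately tiny tables for illustration.
REGISTRY = {"mit": "mit", "cc-by-4.0": "cc-by-4.0"}
ALIASES = {"cc by 4.0": "cc-by-4.0", "mit license": "mit"}
URL_MAP = {"creativecommons.org/licenses/by/4.0": "cc-by-4.0"}
CC_URL = re.compile(r"creativecommons\.org/licenses/([a-z-]+)/(\d\.\d)")
PROSE = [("creative commons attribution", "cc-by")]


def clean(raw: str) -> str:
    """Lowercase and collapse whitespace."""
    return " ".join(raw.strip().lower().split())


def from_registry(text: str) -> Optional[str]:
    return REGISTRY.get(text)


def from_alias(text: str) -> Optional[str]:
    return ALIASES.get(text)


def from_url_map(text: str) -> Optional[str]:
    # Treat http/https and a trailing slash as equivalent.
    return URL_MAP.get(re.sub(r"^https?://", "", text).rstrip("/"))


def from_cc_regex(text: str) -> Optional[str]:
    match = CC_URL.search(text)
    return f"cc-{match.group(1)}-{match.group(2)}" if match else None


def from_prose(text: str) -> Optional[str]:
    for phrase, key in PROSE:
        if phrase in text:
            return key
    return None


STEPS: List[Callable[[str], Optional[str]]] = [
    from_registry, from_alias, from_url_map, from_cc_regex, from_prose,
]


def resolve(raw: str) -> str:
    text = clean(raw)
    for step in STEPS:
        key = step(text)
        if key is not None:
            return key  # first match wins
    return text  # fallback: the cleaned string becomes the key


print(resolve("http://creativecommons.org/licenses/by-nc/3.0/"))  # cc-by-nc-3.0
```

Ordering the cheap dictionary lookups before the regex and the prose scan means the common inputs resolve without touching the slower steps.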

CLI usage

Normalise a single license:

license-normaliser normalise "MIT"
# Output: mit

license-normaliser normalise --full "CC BY 4.0"
# Output:
# Key: cc-by-4.0
# URL: https://creativecommons.org/licenses/by/4.0/
# License: cc-by
# Family: cc

Batch normalise multiple licenses:

license-normaliser batch MIT "Apache-2.0" "CC BY 4.0"
# Output:
# MIT: mit
# Apache-2.0: apache-2.0
# CC BY 4.0: cc-by-4.0

Testing

All tests run inside Docker:

make test

To test a specific Python version:

make test-env ENV=py312

License

MIT

Support

For issues, go to GitHub.

Author

Artur Barseghyan <artur.barseghyan@gmail.com>
