Get the SPDX License ID from license text

LicenseID

A portable SPDX License ID matcher.

licenseid takes license text as input and identifies the closest matched SPDX License ID using a hybrid search strategy (SQLite FTS5 trigram + RapidFuzz ranking + optional Java validation).

Features

  • Hybrid strategy:
    • Tier 1: Broad recall using SQLite FTS5 with trigram tokenization.
    • Tier 2: Precision ranking using RapidFuzz (token set ratio) plus popularity weighting.
    • Tier 3: Optional final validation via tools-java if available.
  • Unix philosophy: Parseable CLI output.
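
The first two tiers can be pictured as a recall step followed by a ranking step. The sketch below is only an illustration of that shape, not licenseid's implementation: it swaps in a pure-Python trigram overlap for the SQLite FTS5 trigram index and the standard library's difflib.SequenceMatcher for RapidFuzz's token set ratio. The function names (trigrams, recall, rank) and the toy corpus are hypothetical.

```python
from difflib import SequenceMatcher


def trigrams(text: str) -> set[str]:
    """Return the character trigrams of lowercased, whitespace-normalized text."""
    norm = " ".join(text.lower().split())
    return {norm[i:i + 3] for i in range(len(norm) - 2)}


def recall(query: str, corpus: dict[str, str], limit: int = 3) -> list[str]:
    """Tier 1 stand-in: keep the IDs sharing the most trigrams with the query."""
    q = trigrams(query)
    overlap = {lid: len(q & trigrams(body)) for lid, body in corpus.items()}
    ranked = sorted(overlap, key=lambda lid: overlap[lid], reverse=True)
    return [lid for lid in ranked[:limit] if overlap[lid] > 0]


def rank(query: str, candidates: list[str], corpus: dict[str, str]) -> list[tuple[str, float]]:
    """Tier 2 stand-in: score each candidate with a similarity ratio in [0, 1]."""
    scored = [
        (lid, SequenceMatcher(None, query.lower(), corpus[lid].lower()).ratio())
        for lid in candidates
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy corpus: invented one-line snippets, not real license texts.
CORPUS = {
    "MIT": "permission is hereby granted free of charge to any person obtaining a copy",
    "Apache-2.0": "licensed under the apache license version 2.0 you may not use this file",
    "BSD-3-Clause": "redistribution and use in source and binary forms with or without modification",
}

query = "Permission is hereby granted, free of charge, to any person"
best_id, best_score = rank(query, recall(query, CORPUS), CORPUS)[0]
print(f"LICENSE_ID={best_id} SCORE={best_score:.4f}")
```

The recall step is deliberately cheap and permissive so that the more expensive similarity scoring only runs on a handful of candidates.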

Installation

Install with pipx:

pipx install licenseid

Or using uv:

uv tool install licenseid

Usage

1. Update the license database

Before matching, build the local license index:

licenseid update

2. Match a license

Match text from a file:

licenseid match LICENSE.txt

Match with Java validation enabled:

licenseid match LICENSE.txt --java

Match with popularity tie-breaker enabled:

licenseid match LICENSE.txt --pop

The tie-breaker is triggered only when candidate similarity scores differ by less than 0.02%.
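
A minimal sketch of how such a tie-breaker could work, under stated assumptions: scores are on a 0 to 1 scale, "0.02%" is read as an absolute gap of 0.0002, and popularity is a plain number per license ID. The function name pick_best, the threshold interpretation, and the popularity values are hypothetical and not taken from licenseid.

```python
def pick_best(scored, popularity, threshold=0.0002):
    """Return the best license ID, using popularity only for near-ties.

    scored:     list of (license_id, similarity_score) pairs
    popularity: dict mapping license_id to a popularity weight (assumed)
    threshold:  score gap below which candidates count as tied (assumed)
    """
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    top_id, top_score = ranked[0]
    # Candidates whose score is within `threshold` of the top score are tied.
    tied = [lid for lid, score in ranked if top_score - score < threshold]
    if len(tied) == 1:
        return top_id
    # Among tied candidates, prefer the most popular one.
    return max(tied, key=lambda lid: popularity.get(lid, 0))


# Invented scores and popularity weights, for illustration only.
scores = [("GPL-2.0-only", 0.9850), ("GPL-2.0-or-later", 0.9849), ("MIT", 0.40)]
weights = {"GPL-2.0-only": 5, "GPL-2.0-or-later": 10}
print(pick_best(scores, weights))
```

When the top score is clearly ahead, popularity is never consulted, so the weighting cannot override a confident match.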

3. Output formats

Default (Unix-friendly):

LICENSE_ID=Apache-2.0 SCORE=0.9850
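
Because the default output is a line of KEY=VALUE pairs separated by spaces, it is easy to consume from a script. The helper below is a small sketch that parses the example line shown above; the function name parse_match_line is hypothetical, and it assumes values never contain spaces.

```python
def parse_match_line(line: str) -> dict[str, str]:
    """Split a 'KEY=VALUE KEY=VALUE' line into a dictionary of strings."""
    return dict(pair.split("=", 1) for pair in line.split())


result = parse_match_line("LICENSE_ID=Apache-2.0 SCORE=0.9850")
license_id = result["LICENSE_ID"]
score = float(result["SCORE"])
print(license_id, score)
```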

JSON:

licenseid match LICENSE.txt --json

Configuration

  • SPDX_TOOLS_JAR: Path to the tools-java jar for Tier 3 validation.
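
A wrapper script might only request Java validation when the jar is actually configured. The sketch below builds the command line accordingly; build_match_command is a hypothetical helper, and it assumes that checking for an existing file at SPDX_TOOLS_JAR is a reasonable proxy for "tools-java is available". It only constructs the argument list and does not run licenseid.

```python
import os
from pathlib import Path


def build_match_command(license_file: str, env=None) -> list[str]:
    """Build a `licenseid match` command, adding --java only if the jar exists."""
    env = os.environ if env is None else env
    cmd = ["licenseid", "match", license_file]
    jar = env.get("SPDX_TOOLS_JAR")
    # Assumption: an existing file at this path means Tier 3 can be used.
    if jar and Path(jar).is_file():
        cmd.append("--java")
    return cmd


print(build_match_command("LICENSE.txt", env={}))
```

The resulting list can be passed to subprocess.run once licenseid is installed.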

License

Apache-2.0

