Terminology linter for domain-specific text

termlint

Terminology linter for projects. termlint extracts term candidates from text and checks them against your glossary or ontology.

Alpha Status

termlint is currently alpha.

Implemented now:

  • rule-based extraction (RuleExtractor / spaCy)
  • C-Value extraction (CValueExtractor)
  • verification: exact, fuzzy
  • JSON reports: verification, ontology_update, quality_gate, extraction
  • glossary tooling: glossary from-report, glossary merge

Current support is intentionally narrow:

  • Python >=3.10
  • spaCy-based extraction is currently supported on Python <3.14
  • officially tested with English and Russian spaCy models
  • other spaCy models may work, but should be treated as experimental in this alpha stage

Quick Start

  1. Install:
# Recommended for CLI usage
pipx install "termlint[base]"

# Alternative: install into a project environment
pip install --pre "termlint[base]"

# Install a spaCy model into the same environment
python -m spacy download en_core_web_sm

For pipx, install the model inside the pipx environment:

pipx runpip termlint install en-core-web-sm
# or for Russian
pipx runpip termlint install ru-core-news-sm
  2. Create a glossary (glossary.json):
[
  { "id": "ml:001", "label": "machine learning", "synonyms": ["ML"] },
  { "id": "ml:002", "label": "artificial intelligence", "synonyms": ["AI"] }
]
  3. Create an input file (input.txt):
Artificial intelligence and machine learning are used in data analytics.
  4. Run verification:
termlint verify input.txt --source glossary.json --verifier fuzzy --threshold 85

Example output:

Files     ... 100%
✅ input.txt ... 100%
📊 Coverage: 33.3% (2/6)
⚠️  Quality Gate would FAIL in CI mode

Generated reports:

  • reports/verification.json
  • reports/ontology_update.json
  • reports/quality_gate.json
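
To make the fuzzy verifier and the coverage figure more concrete, here is a minimal sketch of threshold-based matching. It is not termlint's implementation: it assumes similarity is scored on a 0-100 scale (using difflib from the standard library as a stand-in), that synonyms are matched the same way as labels, and that coverage is matched candidates divided by all candidates. The candidate list and the function names are hypothetical.

```python
from difflib import SequenceMatcher

GLOSSARY = [
    {"id": "ml:001", "label": "machine learning", "synonyms": ["ML"]},
    {"id": "ml:002", "label": "artificial intelligence", "synonyms": ["AI"]},
]


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity on a 0-100 scale (difflib stand-in)."""
    return 100.0 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def verify(candidates: list[str], glossary: list[dict], threshold: float = 85.0):
    """Map each candidate to the best-matching glossary id, or None.

    Returns (matches, coverage) where coverage is a percentage.
    """
    matches = {}
    for candidate in candidates:
        best_id, best_score = None, 0.0
        for entry in glossary:
            # Compare against the label and every synonym of the entry.
            for name in [entry["label"], *entry.get("synonyms", [])]:
                score = similarity(candidate, name)
                if score > best_score:
                    best_id, best_score = entry["id"], score
        # Accept the best match only if it reaches the threshold.
        matches[candidate] = best_id if best_score >= threshold else None
    matched = sum(1 for value in matches.values() if value is not None)
    coverage = 100.0 * matched / len(matches) if matches else 0.0
    return matches, coverage


# Hypothetical candidates; a real run extracts them from input.txt.
matches, coverage = verify(
    ["Artificial intelligence", "machine learnings", "data analytics"],
    GLOSSARY,
)
```

In this sketch "machine learnings" still resolves to ml:001 because its similarity to the label is above 85, while "data analytics" stays unmatched and lowers coverage.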

Exit behavior:

  • verify exits 0 on a successful run by default, even if the quality gate would fail in CI mode
  • verify --fail-on-quality-gate exits 1 when quality gates fail
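
For CI scripts, the exit behavior above can be wrapped in a small helper. The sketch below assumes termlint is installed and on PATH; the function names build_verify_command and run_verify are hypothetical, and only flags shown in this README are used.

```python
import subprocess


def build_verify_command(
    input_path: str,
    glossary_path: str,
    threshold: int = 85,
    strict: bool = False,
) -> list[str]:
    """Build the `termlint verify` command line from the Quick Start."""
    command = [
        "termlint", "verify", input_path,
        "--source", glossary_path,
        "--verifier", "fuzzy",
        "--threshold", str(threshold),
    ]
    if strict:
        # Documented behavior: exit 1 when quality gates fail.
        command.append("--fail-on-quality-gate")
    return command


def run_verify(input_path: str, glossary_path: str, strict: bool = True) -> int:
    """Run termlint and return its exit code without raising on failure."""
    command = build_verify_command(input_path, glossary_path, strict=strict)
    completed = subprocess.run(command, check=False)
    return completed.returncode
```

A CI job could call run_verify("input.txt", "glossary.json") and pass the returned code to sys.exit, so that a failing quality gate fails the job.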

Configuration

Project configuration lives in pyproject.toml under [tool.termlint].

Example:

[tool.termlint.logging]
level = "WARNING"
log_file = "reports/termlint.log"

[tool.termlint.extraction]
extractors = ["rule", "cvalue"]
rules = { model = "en_core_web_sm", auto_download_model = false }
cvalue = { threshold = 0.25, min_freq = 1, min_length = 2, max_length = 4, use_ling_filter = true, model = "en_core_web_sm", auto_download_model = false }
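
As background for the cvalue options, here is a simplified sketch of textbook C-value scoring. It is not termlint's CValueExtractor: it treats every n-gram as a candidate (no linguistic filter), and how termlint applies threshold to the scores is not shown here. The function names are hypothetical; min_freq, min_length, and max_length mirror the option names in the config.

```python
import math
from collections import Counter


def ngrams(tokens, min_length, max_length):
    """Yield all contiguous n-grams with min_length <= n <= max_length."""
    for n in range(min_length, max_length + 1):
        for i in range(len(tokens) - n + 1):
            yield tuple(tokens[i:i + n])


def is_nested(short, long):
    """True when `short` occurs as a contiguous run inside a longer n-gram."""
    n = len(short)
    return len(long) > n and any(
        long[i:i + n] == short for i in range(len(long) - n + 1)
    )


def c_values(sentences, min_freq=1, min_length=2, max_length=4):
    """Score candidates: log2(length) * (freq - mean freq of containing terms)."""
    freq = Counter(
        gram for tokens in sentences for gram in ngrams(tokens, min_length, max_length)
    )
    candidates = {gram: count for gram, count in freq.items() if count >= min_freq}
    scores = {}
    for term, count in candidates.items():
        containers = [
            other_count
            for other, other_count in candidates.items()
            if is_nested(term, other)
        ]
        adjusted = float(count)
        if containers:
            # Discount occurrences that are explained by longer candidates.
            adjusted -= sum(containers) / len(containers)
        scores[term] = math.log2(len(term)) * adjusted
    return scores


scores = c_values([
    ["deep", "neural", "network"],
    ["neural", "network"],
    ["neural", "network"],
])
```

Here "neural network" scores 2.0 because it appears twice on its own, while "deep neural" scores 0.0 because it only ever occurs inside "deep neural network". Since log2(1) is 0, single-word candidates always score zero under this formula, which is one reason a min_length of 2 is a natural setting.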

Notes:

  • use termlint -v or termlint -vv for more verbose logs
  • keep auto_download_model = false in CI or reproducible environments
  • glossary sources are JSON arrays of entities with required fields id and label

Minimal valid glossary:

[
  {
    "id": "ml:001",
    "label": "machine learning"
  }
]
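
A glossary file can be checked against this shape before it is passed to termlint. The sketch below only enforces what is stated here (a JSON array of objects with id and label); the function name validate_glossary is hypothetical, and treating empty strings as invalid is an assumption.

```python
import json


def validate_glossary(text: str) -> list[dict]:
    """Parse glossary JSON and check each entry has string `id` and `label`."""
    data = json.loads(text)
    if not isinstance(data, list):
        raise ValueError("glossary must be a JSON array")
    for index, entry in enumerate(data):
        if not isinstance(entry, dict):
            raise ValueError(f"entry {index} is not a JSON object")
        for field in ("id", "label"):
            value = entry.get(field)
            # Assumption: required fields must be non-empty strings.
            if not isinstance(value, str) or not value.strip():
                raise ValueError(f"entry {index} is missing required field {field!r}")
    return data


entries = validate_glossary('[{"id": "ml:001", "label": "machine learning"}]')
```

Optional fields such as synonyms pass through unchanged.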

Config lookup order:

  1. --config <PATH>
  2. nearest pyproject.toml with [tool.termlint]
  3. user config:
    • $XDG_CONFIG_HOME/termlint/config.toml
    • ~/.config/termlint/config.toml
    • %APPDATA%/termlint/config.toml
    • ~/.termlint/config.toml
  4. built-in defaults

User config files may use either [tool.termlint] or [termlint].
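
The lookup order can be sketched as a small resolver. This is an illustration, not termlint's code: the function names are hypothetical, it returns the first user config file that exists without inspecting its tables, and it returns None to stand for built-in defaults.

```python
import os
from pathlib import Path

try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:
    import tomli as tomllib  # backport with the same API for Python 3.10


def has_termlint_table(path: Path) -> bool:
    """True when a pyproject.toml contains a [tool.termlint] table."""
    try:
        with path.open("rb") as handle:
            data = tomllib.load(handle)
    except (OSError, tomllib.TOMLDecodeError):
        return False
    tool = data.get("tool")
    return isinstance(tool, dict) and "termlint" in tool


def user_config_candidates(env, home: Path) -> list[Path]:
    """User config paths in the documented order."""
    candidates = []
    if env.get("XDG_CONFIG_HOME"):
        candidates.append(Path(env["XDG_CONFIG_HOME"]) / "termlint" / "config.toml")
    candidates.append(home / ".config" / "termlint" / "config.toml")
    if env.get("APPDATA"):
        candidates.append(Path(env["APPDATA"]) / "termlint" / "config.toml")
    candidates.append(home / ".termlint" / "config.toml")
    return candidates


def find_config(start: Path, explicit=None, env=None, home=None):
    """Resolve the config path: --config, nearest pyproject.toml, user config."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else home
    if explicit is not None:
        return Path(explicit)  # 1. --config <PATH>
    start = start.resolve()
    for directory in [start, *start.parents]:
        candidate = directory / "pyproject.toml"
        if candidate.is_file() and has_termlint_table(candidate):
            return candidate  # 2. nearest pyproject.toml with [tool.termlint]
    for candidate in user_config_candidates(env, home):
        if candidate.is_file():
            return candidate  # 3. user config
    return None  # 4. built-in defaults
```

Note that a pyproject.toml without a [tool.termlint] table is skipped, so the search continues in parent directories.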

License

This project is licensed under the MIT License. See LICENSE.
