Terminology linter for domain-specific text

termlint

Terminology linter for projects. termlint extracts term candidates from text and checks them against your glossary or ontology.

Alpha Status

termlint is currently alpha.

Implemented now:

  • rule-based extraction (RuleExtractor / spaCy)
  • C-Value extraction (CValueExtractor)
  • verification: exact, fuzzy
  • JSON reports: verification, ontology_update, quality_gate, extraction
  • glossary tooling: glossary from-report, glossary merge

Current support is intentionally narrow:

  • Python >=3.10
  • spaCy-based extraction is currently supported on Python <3.14
  • officially tested with English and Russian spaCy models
  • other spaCy models may work, but should be treated as experimental in this alpha stage

Quick Start

  1. Install:
# Recommended for CLI usage
pipx install "termlint[base]"

# Alternative: install into a project environment
pip install --pre "termlint[base]"

# Install a spaCy model into the same environment
python -m spacy download en_core_web_sm

For pipx, install the model inside the pipx environment:

pipx runpip termlint install en-core-web-sm
# or for Russian
pipx runpip termlint install ru-core-news-sm
  2. Create a glossary (glossary.json):
[
  { "id": "ml:001", "label": "machine learning", "synonyms": ["ML"] },
  { "id": "ml:002", "label": "artificial intelligence", "synonyms": ["AI"] }
]
  3. Create an input file (input.txt):
Artificial intelligence and machine learning are used in data analytics.
  4. Run verification:
termlint verify input.txt --source glossary.json --verifier fuzzy --threshold 85

This command works without any local pyproject.toml. When no config file is found, termlint falls back to built-in defaults for the full verification pipeline, including the English spaCy model settings.
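
To make the --verifier fuzzy --threshold 85 options more concrete, here is a minimal sketch of threshold-based fuzzy matching against a glossary. It is not termlint's implementation: the scoring function (difflib's ratio scaled to 0-100), the helper names, and the candidate list are all assumptions chosen for illustration. The coverage helper reports matched candidates as a percentage of all candidates, which is how a figure such as 33.3% (2/6) arises.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0-100 similarity score (difflib ratio, case-insensitive).

    Assumption: termlint's real scorer is not documented here and may differ.
    """
    return 100.0 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(candidate, glossary, threshold=85.0):
    """Return the id of the best entry scoring at or above threshold, else None."""
    best_id, best_score = None, threshold
    for entry in glossary:
        # Compare against the label and any optional synonyms.
        for name in [entry["label"], *entry.get("synonyms", [])]:
            score = similarity(candidate, name)
            if score >= best_score:
                best_id, best_score = entry["id"], score
    return best_id


def coverage(candidates, glossary, threshold=85.0):
    """Percentage of candidates that match some glossary entry."""
    if not candidates:
        return 0.0
    matched = sum(
        1 for c in candidates if best_match(c, glossary, threshold) is not None
    )
    return 100.0 * matched / len(candidates)


GLOSSARY = [
    {"id": "ml:001", "label": "machine learning", "synonyms": ["ML"]},
    {"id": "ml:002", "label": "artificial intelligence", "synonyms": ["AI"]},
]

# Hypothetical candidate list; the terms termlint actually extracts may differ.
CANDIDATES = [
    "artificial intelligence",
    "machine learning",
    "data analytics",
    "data",
    "analytics",
    "intelligence",
]
```

With these six hypothetical candidates, two match the glossary, so coverage(CANDIDATES, GLOSSARY) is 2/6, or about 33.3.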

Example output:

Files     ... 100%
✅ input.txt ... 100%
📊 Coverage: 33.3% (2/6)
⚠️  Quality Gate would FAIL in CI mode

Generated reports:

  • reports/verification.json
  • reports/ontology_update.json
  • reports/quality_gate.json

Exit behavior:

  • verify exits 0 on a successful run by default, even if the quality gate would fail in CI mode
  • verify --fail-on-quality-gate exits 1 when quality gates fail
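
For CI scripts written in Python, the exit behavior above can be wrapped as in the following sketch. Only the command-line flags come from this README; the helper names build_verify_command and run_verify are hypothetical, and the wrapper assumes termlint is on PATH.

```python
import subprocess


def build_verify_command(
    input_path,
    glossary_path,
    verifier="fuzzy",
    threshold=85,
    fail_on_quality_gate=True,
):
    """Build the argument list for `termlint verify` (flags as shown above)."""
    cmd = [
        "termlint", "verify", input_path,
        "--source", glossary_path,
        "--verifier", verifier,
        "--threshold", str(threshold),
    ]
    if fail_on_quality_gate:
        # Without this flag, verify exits 0 even when the gate would fail.
        cmd.append("--fail-on-quality-gate")
    return cmd


def run_verify(cmd):
    """Run the command and return its exit code (requires termlint on PATH)."""
    return subprocess.run(cmd, check=False).returncode


# Typical CI usage (not executed here):
#   code = run_verify(build_verify_command("input.txt", "glossary.json"))
#   raise SystemExit(code)
```

Keeping command construction separate from execution makes the flag handling easy to test without installing termlint.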

Configuration

Project configuration lives in pyproject.toml under [tool.termlint].

Example:

[tool.termlint.logging]
level = "WARNING"
log_file = "reports/termlint.log"

[tool.termlint.extraction]
extractors = ["rule", "cvalue"]
rules = { model = "en_core_web_sm", auto_download_model = false }
cvalue = { threshold = 0.25, min_freq = 1, min_length = 2, max_length = 4, use_ling_filter = true, model = "en_core_web_sm", auto_download_model = false }

Notes:

  • use termlint -v or termlint -vv for more verbose logs
  • keep auto_download_model = false in CI or reproducible environments
  • glossary sources are JSON arrays of entities with required fields id and label

Minimal valid glossary:

[
  {
    "id": "ml:001",
    "label": "machine learning"
  }
]
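
A glossary file can be checked against these requirements before it is passed to termlint. The sketch below is an assumption-laden illustration, not termlint's own validation: validate_glossary is a hypothetical helper, and the check that synonyms is a list of strings is inferred from the Quick Start example rather than from a documented schema.

```python
import json


def validate_glossary(entries):
    """Return a list of problems; an empty list means the glossary looks valid."""
    if not isinstance(entries, list):
        return ["glossary must be a JSON array"]
    problems = []
    for index, entry in enumerate(entries):
        if not isinstance(entry, dict):
            problems.append(f"entry {index}: expected an object")
            continue
        # id and label are the two required fields.
        for field in ("id", "label"):
            value = entry.get(field)
            if not isinstance(value, str) or not value.strip():
                problems.append(f"entry {index}: missing or empty '{field}'")
        # Assumption: synonyms, when present, is a list of strings.
        synonyms = entry.get("synonyms", [])
        if not isinstance(synonyms, list) or not all(
            isinstance(s, str) for s in synonyms
        ):
            problems.append(f"entry {index}: 'synonyms' must be a list of strings")
    return problems


minimal = json.loads('[{"id": "ml:001", "label": "machine learning"}]')
minimal_problems = validate_glossary(minimal)
```

For the minimal glossary above, minimal_problems is an empty list.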

Config lookup order:

  1. --config <PATH>
  2. nearest pyproject.toml with [tool.termlint]
  3. user config:
    • $XDG_CONFIG_HOME/termlint/config.toml
    • ~/.config/termlint/config.toml
    • %APPDATA%/termlint/config.toml
    • ~/.termlint/config.toml
  4. built-in defaults

User config files may use either [tool.termlint] or [termlint].

License

This project is licensed under the MIT License. See LICENSE.
