Checks grammar using LanguageTool.

language_tool_python: Python wrapper for LanguageTool

language_tool_python is a Python interface/wrapper to LanguageTool, an open-source grammar, style, and spell checker.

It can:

  • run a local LanguageTool Java server,
  • call the public LanguageTool API,
  • call your own remote LanguageTool server,
  • be used from Python code and from a CLI.

Default local download target: LanguageTool 6.8.

Documentation

Requirements

  • Python >=3.9 (tested up to 3.14)
  • Java (to run local LanguageTool server):
    • LanguageTool < 6.6: Java >=9
    • LanguageTool >= 6.6 (default): Java >=17

Installation

pip install --upgrade language_tool_python

Quick Start

Local server

import language_tool_python

with language_tool_python.LanguageTool("en-US") as tool:
    text = "A sentence with a error in the Hitchhiker's Guide tot he Galaxy"
    matches = tool.check(text)
    print(matches)
    print(tool.correct(text))

Public LanguageTool API

import language_tool_python

with language_tool_python.LanguageToolPublicAPI("es") as tool:
    matches = tool.check("Se a hecho un esfuerzo.")
    print(matches)

Your own remote LanguageTool server

import language_tool_python

with language_tool_python.LanguageTool(
    "en-US",
    remote_server="https://your-lt-server.example.com",
) as tool:
    print(tool.check("This are bad."))

Constructor Parameters Worth Knowing

language_tool_download_version (local server only)

Use this parameter to force which LanguageTool package is used when running a local server.

import language_tool_python

with language_tool_python.LanguageTool(
    "en-US",
    language_tool_download_version="6.7",
) as tool:
    print(tool.check("This are bad."))

Accepted formats:

  • latest: latest snapshot available from the snapshot server
  • YYYYMMDD: snapshot by date (example: 20260201)
  • X.Y: release version (default: 6.8; examples: 6.7, 4.0)

Notes:

  • Only relevant when using a local server (no remote_server).
  • Versions below 4.0 are not supported.

proxies (remote server only)

Use this parameter to pass proxy settings to requests when calling a remote LanguageTool server.

import language_tool_python

with language_tool_python.LanguageTool(
    "en-US",
    remote_server="https://your-lt-server.example.com",
    proxies={
        "http": "http://proxy.example.com:8080",
        "https": "http://proxy.example.com:8080",
    },
) as tool:
    print(tool.check("This are bad."))

Notes:

  • proxies works only with remote_server.
  • Passing proxies without remote_server raises ValueError.

Core Python API

Check text

matches = tool.check("This is noot okay.")

Each item is a Match object with these fields:

  • rule_id
  • message
  • replacements
  • offset_in_context, context, offset, error_length
  • category, rule_issue_type
  • sentence
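
To show how offset and error_length relate to the checked text, here is a minimal sketch that runs without a LanguageTool server. `MatchStub` is a hypothetical stand-in carrying a few of the field names listed above, and its values are written by hand for illustration; they are not real LanguageTool output.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MatchStub:
    """Stand-in with a subset of the documented Match fields (not the real class)."""

    rule_id: str
    message: str
    offset: int
    error_length: int
    replacements: List[str] = field(default_factory=list)


def flagged_span(text: str, match: MatchStub) -> str:
    """Return the part of the text that the match points at."""
    return text[match.offset : match.offset + match.error_length]


text = "This is noot okay."
# Hand-written example values, assumed for illustration only.
stub = MatchStub(
    rule_id="EXAMPLE_RULE",
    message="Possible spelling mistake",
    offset=8,
    error_length=4,
    replacements=["not"],
)
print(flagged_span(text, stub))  # noot
```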

Auto-correct

corrected = tool.correct("This is noot okay.")
# Uses first suggestion for each match
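
Conceptually, "first suggestion for each match" amounts to splicing replacements into the text by offset. The sketch below illustrates that idea on plain tuples; `apply_first_suggestions` is a hypothetical helper and is not the library's implementation. It assumes the matches do not overlap.

```python
def apply_first_suggestions(text, matches):
    """Apply the first replacement of each match.

    `matches` is an iterable of (offset, error_length, replacements) tuples.
    Matches are applied from the end of the text backwards so that earlier
    offsets stay valid after each splice.
    """
    result = text
    for offset, length, replacements in sorted(matches, key=lambda m: m[0], reverse=True):
        if not replacements:
            continue  # nothing to apply for this match
        result = result[:offset] + replacements[0] + result[offset + length :]
    return result


print(apply_first_suggestions("This is noot okay.", [(8, 4, ["not", "noon"])]))
```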

Apply only selected matches

text = "There is a bok on the table."
matches = tool.check(text)

# Keep only the suggestion at index 2 for the first match
matches[0].select_replacement(2)

patched = language_tool_python.utils.correct(text, matches)

Check only parts matching a regex

matches = tool.check_matching_regions(
    'He said "I has a problem" but she replied "It are fine".',
    r'"[^"]*"',
)
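
To preview which parts of a text a pattern would select before sending anything to LanguageTool, the standard re module is enough. This sketch only shows what the regex matches; `matching_regions` is a hypothetical helper, and how check_matching_regions processes those regions internally is not shown here.

```python
import re


def matching_regions(text, pattern):
    """Return (start, end, region_text) for every non-overlapping regex match."""
    return [(m.start(), m.end(), m.group(0)) for m in re.finditer(pattern, text)]


text = 'He said "I has a problem" but she replied "It are fine".'
for start, end, region in matching_regions(text, r'"[^"]*"'):
    print(start, end, region)
```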

Classify result quality

from language_tool_python.utils import classify_matches

status = classify_matches(tool.check("This is a cats."))
# TextStatus.CORRECT / TextStatus.FAULTY / TextStatus.GARBAGE

Rule and Language Controls

You can tune checks per instance:

tool.language = "en" # Can also be set from constructor (`LanguageTool("en")`)
tool.mother_tongue = "fr" # Can also be set from constructor (`LanguageTool("en", mother_tongue="fr")`)

tool.disabled_rules.update({"MORFOLOGIK_RULE_EN_US"})
tool.enabled_rules.update({"EN_A_VS_AN"})
tool.enabled_rules_only = False

tool.disabled_categories.update({"CASING"})
tool.enabled_categories.update({"GRAMMAR"})

tool.preferred_variants.update({"en-GB"})
tool.picky = True

Spellchecking control:

tool.disable_spellchecking()
tool.enable_spellchecking()

# Equivalent, respectively, to:
tool.disabled_categories.update({"TYPOS"})             # disable_spellchecking()
tool.disabled_categories.difference_update({"TYPOS"})  # enable_spellchecking()

Custom Spellings

You can register domain-specific words:

with language_tool_python.LanguageTool(
    "en-US",
    new_spellings=["my_product_name", "my_team_term"],
    new_spellings_persist=False,
) as tool:
    print(tool.check("my_product_name is released"))

  • new_spellings_persist=True (default): keeps words in the local LT spelling file.
  • new_spellings_persist=False: session-only, words are removed on close().

Local Server Configuration (config=)

For local servers only, pass a config dictionary. Example:

with language_tool_python.LanguageTool(
    "en-US",
    config={
        "cacheSize": 1000,
        "pipelineCaching": True,
        "maxTextLength": 50000,
    },
) as tool:
    print(tool.check("Text to inspect"))

Supported keys:

  • maxTextLength, maxTextHardLength, maxCheckTimeMillis
  • maxErrorsPerWordRate, maxSpellingSuggestions, maxCheckThreads
  • cacheSize, cacheTTLSeconds
  • requestLimit, requestLimitInBytes, timeoutRequestLimit, requestLimitPeriodInSeconds
  • languageModel, fasttextModel, fasttextBinary
  • maxWorkQueueSize, rulesFile, blockedReferrers
  • premiumOnly, disabledRuleIds
  • pipelineCaching, maxPipelinePoolSize, pipelineExpireTimeInSeconds, pipelinePrewarming
  • trustXForwardForHeader, suggestionsEnabled
  • spellcheck-only language keys:
    • lang-<code>
    • lang-<code>-dictPath

Notes:

  • remote_server and config cannot be used together.
  • proxies can only be used with remote_server.
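
If constructor arguments come from user input or a config file, the two constraints above can be checked before a LanguageTool instance is created. The sketch below is a hypothetical pre-flight helper, not library code. Raising ValueError for proxies without remote_server follows the behavior described earlier; using ValueError for the remote_server plus config combination is an assumption made for this example.

```python
def validate_options(remote_server=None, config=None, proxies=None):
    """Reject argument combinations that the notes above describe as invalid.

    Hypothetical helper; the exception type for remote_server + config is assumed.
    """
    if remote_server is not None and config is not None:
        raise ValueError("config applies to local servers only; do not combine it with remote_server")
    if proxies is not None and remote_server is None:
        raise ValueError("proxies can only be used with remote_server")


# Valid: local server with a config dictionary.
validate_options(config={"cacheSize": 1000})
# Valid: remote server with proxy settings.
validate_options(
    remote_server="https://your-lt-server.example.com",
    proxies={"https": "http://proxy.example.com:8080"},
)
```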

CLI

Entry point:

language_tool_python [OPTIONS] FILE [FILE ...]

Use - as file to read from stdin.

Examples:

# Check a file
language_tool_python -l en-US README.md

# Check stdin
echo "This are bad." | language_tool_python -l en-US -

# Auto-apply suggestions
language_tool_python -l en-US --apply input.txt

# Use only selected rules
language_tool_python -l en-US --enabled-only --enable MORFOLOGIK_RULE_EN_US input.txt

# Use remote LT server
language_tool_python -l en-US --remote-host 127.0.0.1 --remote-port 8081 input.txt

Main options:

  • -l, --language CODE
  • -m, --mother-tongue CODE
  • -d, --disable RULES
  • -e, --enable RULES
  • --enabled-only
  • -p, --picky
  • -a, --apply
  • -s, --spell-check-off
  • --ignore-lines REGEX
  • --remote-host HOST, --remote-port PORT
  • -c, --encoding
  • --verbose
  • --version

Exit codes:

  • 0: no issues
  • 2: issues found
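
When calling the CLI from a script, the exit code is the simplest signal to act on. The sketch below maps the two documented codes to messages and treats anything else as unexpected; `describe_exit_code` and `check_file` are hypothetical helper names, and `check_file` assumes the language_tool_python command is installed and on PATH.

```python
import subprocess

# Meanings taken from the exit code list above.
EXIT_MEANINGS = {0: "no issues", 2: "issues found"}


def describe_exit_code(code):
    """Translate a CLI exit code into a short description."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def check_file(path, language="en-US"):
    """Run the CLI on one file and return (description, captured stdout)."""
    completed = subprocess.run(
        ["language_tool_python", "-l", language, path],
        capture_output=True,
        text=True,
    )
    return describe_exit_code(completed.returncode), completed.stdout


print(describe_exit_code(0))
print(describe_exit_code(2))
```

A CI job could call check_file on each document and fail the build when the description is anything other than "no issues".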

Environment Variables

  • LTP_PATH: directory used to store downloaded LanguageTool packages.
    • default: ~/.cache/language_tool_python/
  • LTP_JAR_DIR_PATH: use an existing local LanguageTool directory (skip download).
  • LTP_DOWNLOAD_HOST_SNAPSHOT: override snapshot download host.
    • default: https://internal1.languagetool.org/snapshots/
  • LTP_DOWNLOAD_HOST_RELEASE: override release download host.
    • default: https://languagetool.org/download/
  • LTP_DOWNLOAD_HOST_ARCHIVE: override archive download host.
    • default: https://languagetool.org/download/archive/
  • LTP_DOWNLOAD_SHA256_<VERSION>: version-specific expected SHA-256 for the downloaded LanguageTool archive, for example LTP_DOWNLOAD_SHA256_6_9_SNAPSHOT.
  • LTP_DOWNLOAD_SHA256: fallback expected SHA-256 for the downloaded LanguageTool archive.
  • LTP_BYPASS_VERIFIED_DOWNLOADS: set to true to skip SHA-256 verification.
  • LTP_MAX_DOWNLOAD_BYTES: maximum downloaded ZIP size in bytes.
    • default: 536870912 (512 MiB)
  • LTP_SAFE_ZIP_MAX_ARCHIVE_BYTES: maximum total compressed member size in bytes.
    • default: 536870912 (512 MiB)
  • LTP_SAFE_ZIP_MAX_EXTRACTED_BYTES: maximum total extracted size in bytes.
    • default: 805306368 (768 MiB)
  • LTP_SAFE_ZIP_MAX_MEMBERS: maximum ZIP member count.
    • default: 5000
  • LTP_SAFE_ZIP_MAX_MEMBER_EXTRACTED_BYTES: maximum extracted size for a single ZIP member in bytes.
    • default: 134217728 (128 MiB)
  • LTP_SAFE_ZIP_MAX_MEMBER_COMPRESSION_RATIO: maximum compression ratio for a single ZIP member.
    • default: 100.0
  • LTP_SAFE_ZIP_MAX_TOTAL_COMPRESSION_RATIO: maximum compression ratio for the whole ZIP archive.
    • default: 10.0
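
To get a feel for what the LTP_SAFE_ZIP_* limits guard against, the following sketch inspects a ZIP archive with the standard zipfile module. It is an illustration only, not the library's checks: `zip_stats` and `within_limits` are hypothetical helpers, and defining the compression ratio as extracted size divided by compressed size is an assumption.

```python
import io
import zipfile


def zip_stats(data: bytes) -> dict:
    """Collect member count, sizes, and overall compression ratio of a ZIP."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        infos = archive.infolist()
    compressed = sum(info.compress_size for info in infos)
    extracted = sum(info.file_size for info in infos)
    return {
        "members": len(infos),
        "compressed_bytes": compressed,
        "extracted_bytes": extracted,
        # Assumed definition: extracted bytes per compressed byte.
        "total_ratio": extracted / compressed if compressed else 0.0,
    }


def within_limits(stats, max_members=5000, max_extracted=805306368, max_total_ratio=10.0):
    """Compare stats against limits; defaults copy the documented default values."""
    return (
        stats["members"] <= max_members
        and stats["extracted_bytes"] <= max_extracted
        and stats["total_ratio"] <= max_total_ratio
    )


# Build a small, highly compressible archive in memory.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("zeros.txt", b"0" * 1_000_000)

stats = zip_stats(buffer.getvalue())
print(stats["members"], stats["extracted_bytes"], within_limits(stats))
```

One megabyte of identical bytes compresses far beyond a 10:1 ratio, so this sample archive is reported as outside the limits.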

Downloaded ZIP archives are verified with SHA-256 when a checksum is available. Checksums are resolved in this order:

  1. LTP_DOWNLOAD_SHA256_<VERSION>, where non-alphanumeric characters in the version are replaced with _ and the name is uppercased.
  2. LTP_DOWNLOAD_SHA256.
  3. The bundled language_tool_python/integrity.toml manifest.

The bundled manifest covers release/archive downloads. Snapshots are not stable, so provide LTP_DOWNLOAD_SHA256_<VERSION> or LTP_DOWNLOAD_SHA256 if you want to verify a snapshot. If no checksum is available, the download proceeds without SHA-256 verification.
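
The naming rule and lookup order above can be sketched in a few lines. This is an illustration of the documented behavior, not the library's code: `checksum_env_name`, `resolve_checksum`, and `verify` are hypothetical helpers, and the bundled manifest is modeled as a plain dictionary because the layout of integrity.toml is not described here.

```python
import hashlib
import re


def checksum_env_name(version: str) -> str:
    """Build the version-specific variable name: non-alphanumerics become _, then uppercase."""
    return "LTP_DOWNLOAD_SHA256_" + re.sub(r"[^A-Za-z0-9]", "_", version).upper()


def resolve_checksum(version, env, manifest):
    """Return the expected SHA-256 using the documented order, or None if unavailable."""
    return (
        env.get(checksum_env_name(version))
        or env.get("LTP_DOWNLOAD_SHA256")
        or manifest.get(version)  # stand-in for the bundled manifest
    )


def verify(data: bytes, expected: str) -> bool:
    """Compare the SHA-256 of the downloaded bytes with the expected hex digest."""
    return hashlib.sha256(data).hexdigest() == expected.lower()


print(checksum_env_name("6.8"))
print(checksum_env_name("6.9-SNAPSHOT"))
```

In real use, `env` would be `os.environ`; a plain dictionary keeps the example easy to run in isolation.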

Example:

export LTP_PATH=/path/to/cache
export LTP_JAR_DIR_PATH=/path/to/LanguageTool-6.8
export LTP_DOWNLOAD_SHA256_6_8=<sha256>
# export LTP_BYPASS_VERIFIED_DOWNLOADS=true

Resource Management

When using a local server, prefer a context manager or explicit close():

with language_tool_python.LanguageTool("en-US") as tool:
    ...

# or
tool = language_tool_python.LanguageTool("en-US")
...
tool.close()

Client/Server Pattern

You can run LT in one process (or on one host) and connect to it from a client in another:

# Server side
server_tool = language_tool_python.LanguageTool("en-US")

# Client side
client_tool = language_tool_python.LanguageTool(
    "en-US",
    remote_server=f"http://127.0.0.1:{server_tool.port}",
)

Error Types

Main exceptions in language_tool_python.exceptions:

  • LanguageToolError
    • ServerError
    • JavaError
    • PathError
    • RateLimitError
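
One way to use these exception types is to retry a call when a specific error is raised, for example RateLimitError from the public API. The sketch below is a generic retry helper that runs without the library; `call_with_retry` is a hypothetical name, and the linear backoff is an arbitrary choice for illustration.

```python
import time


def call_with_retry(func, retry_on, attempts=3, delay_seconds=1.0):
    """Call func(); retry when an exception of type retry_on is raised.

    The exception is re-raised after the last attempt. The wait grows
    linearly with the attempt number.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise
            time.sleep(delay_seconds * attempt)


# Self-contained demonstration with a stand-in exception type.
class FakeRateLimit(Exception):
    pass


calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise FakeRateLimit()
    return "ok"


print(call_with_retry(flaky, FakeRateLimit, attempts=3, delay_seconds=0))
```

With the library installed, the same helper could wrap a check, for example `call_with_retry(lambda: tool.check(text), language_tool_python.exceptions.RateLimitError)`.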

Development

# Install dev dependencies
uv sync --group tests --group docs --group types

# Lint / format / types
uvx ruff@0.15.12 check .
uvx ruff@0.15.12 format .
uvx mypy@2.0.0

# Tests
pytest

License

GPL-3.0-only. See LICENSE.

Acknowledgements

This project is based on the original language-check project: https://github.com/myint/language-check/
