text-validator
pluggable command-line tool for validating the formatting and orthography of text files
You configure your validator plugins with a TOML file like:
["text_validator.plugins.whitespace"]
CHECK_CRLF = true
CHECK_TABS = true
CHECK_TRAILING_WHITESPACE = true
CHECK_NO_EOF_NEWLINE = true
["text_validator.plugins.unicode"]
CONFIRM_UTF_8_NFC = true
["text_validator.plugins.ref_line_format"]
REF_REGEX = "(\\d+|EP|SB)\\.\\d+(\\.\\d+)?$" # example from AF
["text_validator.plugins.characters"]
REPLACE_CHARS = [
# bad character, suggested replacement
["\u02BC", "\u2019"],
["\u1FBF", "\u2019"],
["\u037E", "\u003B"],
["\u0387", "\u00B7"],
["\u0374", "\u02B9"],
["\u03D5", "\u03C6"],
["\u03D1", "\u03B8"],
]
and they'll validate the texts you give them:
tests/test_0001.txt:1:line ends with CRLF
tests/test_0001.txt:2:line ends with CRLF
tests/test_0002.txt:1:no newline at end of file
tests/test_0003.txt:1:line contains a tab
tests/test_0004.txt:1:trailing whitespace
tests/test_0006.txt:1:not NFC
tests/test_0007.txt:2:BLANK LINE
tests/test_0008.txt:1:BAD WHITESPACE
tests/test_0008.txt:2:BAD WHITESPACE
tests/test_0009.txt:4:BAD REFERENCE FORM
tests/test_0009.txt:5:BAD REFERENCE FORM
tests/test_0010.txt:2:29:bad U+02BC; consider replacing with U+2019
tests/test_0010.txt:3:29:bad U+1FBF; consider replacing with U+2019
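To make the messages above more concrete, here is a minimal standalone sketch of checks in the same spirit as the whitespace, unicode and characters plugins. It is not the package's actual implementation: the function name `check_text`, the dictionary form of `REPLACE_CHARS`, the 1-based column numbering and the order of the messages are all assumptions made for illustration.

```python
import unicodedata

# Hypothetical mapping in the spirit of REPLACE_CHARS:
# bad character -> suggested replacement (only two entries, for brevity).
REPLACE_CHARS = {
    "\u02BC": "\u2019",
    "\u1FBF": "\u2019",
}


def check_text(name, text):
    """Return a list of 'name:line:message' strings for one file's contents."""
    problems = []
    ends_with_newline = text.endswith("\n")
    lines = text.split("\n")
    if ends_with_newline:
        lines.pop()  # drop the empty string that follows the final newline

    for number, line in enumerate(lines, start=1):
        if line.endswith("\r"):
            problems.append(f"{name}:{number}:line ends with CRLF")
            line = line[:-1]  # do not report the CR again as trailing whitespace
        if "\t" in line:
            problems.append(f"{name}:{number}:line contains a tab")
        if line != line.rstrip():
            problems.append(f"{name}:{number}:trailing whitespace")
        if unicodedata.normalize("NFC", line) != line:
            problems.append(f"{name}:{number}:not NFC")
        # Columns are assumed to be 1-based in this sketch.
        for column, char in enumerate(line, start=1):
            if char in REPLACE_CHARS:
                good = REPLACE_CHARS[char]
                problems.append(
                    f"{name}:{number}:{column}:bad U+{ord(char):04X}; "
                    f"consider replacing with U+{ord(good):04X}"
                )

    if text and not ends_with_newline:
        problems.append(f"{name}:{len(lines)}:no newline at end of file")
    return problems


if __name__ == "__main__":
    for problem in check_text("example.txt", "first line \r\nsecond\u02BCline"):
        print(problem)
```

Note that some of the characters in the sample `REPLACE_CHARS` list (for example U+037E) are also changed by NFC normalization, so a check like this one would report them twice, once as "not NFC" and once as a bad character.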
To install:
pip install text-validator
Then you can either run from the command line:
validate-text tests/config_004.toml tests/test_0007.txt tests/test_0008.txt tests/test_0009.txt
or programmatically from Python, either with the helper function validate:
from text_validator.main import validate
validate("tests/config_003.toml", ["tests/test_0005.txt", "tests/test_0006.txt"])
or by working directly with a Suite instance:
from text_validator.base import Suite
suite = Suite()
suite.load_toml("tests/config_002.toml")
suite.validate_files(["tests/test_0005.txt", "tests/test_0006.txt"])
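The `REF_REGEX` setting in the sample configuration can be tried out on its own before putting it in a TOML file. The sketch below is only an illustration: it assumes the reference is the first whitespace-delimited token of each line and that the pattern is applied with `re.match`, neither of which is confirmed by this README. The helper name `bad_reference_lines` is hypothetical.

```python
import re

# The TOML string "(\\d+|EP|SB)\\.\\d+(\\.\\d+)?$" decodes to this pattern.
REF_REGEX = re.compile(r"(\d+|EP|SB)\.\d+(\.\d+)?$")


def bad_reference_lines(lines):
    """Return the 1-based numbers of lines whose first token is not a valid reference."""
    bad = []
    for number, line in enumerate(lines, start=1):
        parts = line.split(maxsplit=1)
        # Assumption: the reference is the first token; an empty line has none.
        ref = parts[0] if parts else ""
        if not REF_REGEX.match(ref):
            bad.append(number)
    return bad


if __name__ == "__main__":
    sample = [
        "1.1 some text",
        "EP.2.3 some text",
        "SB.4 some text",
        "1.1.1.1 too many parts",
        "XX.1 unknown prefix",
    ]
    print(bad_reference_lines(sample))
```

Because the pattern ends with `$` and `re.match` anchors at the start, the whole token has to fit the form: a number, `EP` or `SB`, followed by one or two dot-separated numbers.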