Skip to main content

pre-commit fixers and linters for handling text files

Project description

texthooks

A collection of pre-commit hooks for handling text files.

In particular, hooks for handling unicode characters which may be undesirable in a repository.

Usage with pre-commit

To use with pre-commit, include this repo and the desired hooks in .pre-commit-config.yaml:

- repo: https://github.com/sirosen/texthooks
  rev: 0.1.0
  hooks:
    - id: fix-smartquotes
    - id: fix-ligatures

Standalone Usage

Each hook is usable as a CLI script. Simply

pip install texthooks

and then invoke, e.g.

fix-smartquotes FILENAME

Supported Hooks

fix-smartquotes

This fixes copy-paste from some applications which replace double-quotes with curly quotes. It does not convert corner brackets, braile quotation marks, or angle quotation marks. Those characters are not typically the result of copy-paste errors, so they are allowed.

Low quotation marks vary in usage and meaning by language, and some languages use quotation marks which are facing "outwards" (opposite facing from english). For the most part, these and exotic characters (double-prime quotes) are ignored.

In files with the offending marks, they are replaced and the run is marked as failed.

Overriding Quotation Characters

Two options are available for specifying exactly which characters will be replaced. For ease of use, they are specified as hex-encoded unicode codepoints.

Suppose you wanted to avoid replacing the "Heavy single comma quotation mark ornament" (275C) and the "Heavy single turned comma quotation mark ornament" (275B) characters. You could override the single quote codepoints as follows:

- repo: https://github.com/sirosen/texthooks
  rev: 0.1.0
  hooks:
    - id: fix-smartquotes
      # replace default single quote chars with this set:
      # apostrophe, fullwidth apostrophe, left single quote, single high
      # reversed-9 quote, right single quote
      args: ["--single-quote-codepoints", "0027,FF07,2018,201B,2019"]

fix-ligatures

Automatically find and replace ligature characters with their ascii equivalents.

This replaces liguatures which may be created by programs like LaTeX for presentation with their strictly-equivalent ASCII counterparts. For example, fi and ff may be ligature-ized.

This hook converts these back into ASCII so that tools like grep will behave as expected.

forbid-bidi-controls

This is checker which forbids the use of unicode bidirectional text control characters.

These are directional formatting characters which can be used to construct text with unexpected or unclear semantics. For example, in programming languages which allow bidirectional text in statements, True = ייִדיש can be written with right-to-left reversal to mean that the variable ייִדיש is assigned a value of True.

CHANGELOG

0.2.2

  • Fix a bug in CLI argument handling for all hooks

0.2.1

  • Fix a typo in forbid-bidi-controls entrypoint

0.2.0

  • Add the forbid-bidi-controls hook
  • Adjust the handling of file encodings. Files will be read with UTF-8 encoding by default in most cases.

0.1.0

  • Initial release with fix-ligatures and fix-smartquotes hooks

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

texthooks-0.2.2.tar.gz (9.3 kB view details)

Uploaded Source

Built Distribution

texthooks-0.2.2-py3-none-any.whl (10.3 kB view details)

Uploaded Python 3

File details

Details for the file texthooks-0.2.2.tar.gz.

File metadata

  • Download URL: texthooks-0.2.2.tar.gz
  • Upload date:
  • Size: 9.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.6.0 importlib_metadata/4.8.2 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.10

File hashes

Hashes for texthooks-0.2.2.tar.gz
Algorithm Hash digest
SHA256 d3bc0e1e29f32ede9678476e6fd8062d54c47433410eb2ffb6f6a48d90997584
MD5 abeb4a3ff44c3fa544f0129abb4563a4
BLAKE2b-256 a0d927dc85926fa16515b1ccc333a8683226dbb52aa06a3e2326c175ab99cf47

See more details on using hashes here.

File details

Details for the file texthooks-0.2.2-py3-none-any.whl.

File metadata

  • Download URL: texthooks-0.2.2-py3-none-any.whl
  • Upload date:
  • Size: 10.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.6.0 importlib_metadata/4.8.2 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.10

File hashes

Hashes for texthooks-0.2.2-py3-none-any.whl
Algorithm Hash digest
SHA256 99b7ba4f2c37ce92bd1b3f1613da38c329ce79b603d5cd23802f5e0956a6e83c
MD5 28a3a69704bc68e2b11189db7901658c
BLAKE2b-256 db74a64209259ccc13f0fa4487d01f32490388fcd927cd4a217c14f868931240

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page