Tex2txt: a flexible LaTeX filter

Tex2txt extracts plain text from LaTeX documents and can be used as a Python script or as a module. The following characteristics make it suitable for integration with proofreading software:

  • tracking of line numbers or character positions during text manipulations,
  • simple inclusion of user-defined LaTeX macros and environments with tailored treatment,
  • careful conservation of text flows,
  • detection of trailing punctuation in equations,
  • proper handling of nestable elements like {} braces.
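
The position tracking named in the list above can be illustrated with a toy filter. The sketch below is not tex2txt's implementation: the function name `strip_macros` and its simplified rules (drop every macro name and every brace, keep all argument text) are assumptions made for illustration. It only shows how each character of the plain text can carry its offset in the LaTeX source.

```python
def strip_macros(source):
    """Toy filter: drop macro names and braces, keep argument text.

    Returns (plain, positions), where positions[i] is the 0-based
    offset in `source` of the character plain[i].
    """
    plain = []
    positions = []
    i = 0
    while i < len(source):
        ch = source[i]
        if ch == "\\":
            # Skip the backslash and the letters of the macro name.
            i += 1
            while i < len(source) and source[i].isalpha():
                i += 1
        elif ch in "{}":
            # Braces delimit arguments; they carry no text themselves.
            i += 1
        else:
            # Ordinary character: copy it and remember where it came from.
            plain.append(ch)
            positions.append(i)
            i += 1
    return "".join(plain), positions


# Hypothetical input with a nested macro and a misspelt word.
source = r"A \emph{very \textbf{smal}} test."
plain, positions = strip_macros(source)
hit = plain.index("smal")

print(plain)                      # A very smal test.
print(hit, "->", positions[hit])  # 7 -> 21
```

With such a map, a position reported for the plain text can be translated back to the LaTeX source before it is shown to the user.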

A more complete description is available on the GitHub page.


This is\footnote{A footnote may be set
in \textcolor{red}{redx colour.}}
is the main text.

Given the LaTeX input above, the example script extracts the plain text, calls LanguageTool, and corrects the position numbers in its messages so that they refer to the LaTeX source. Here is the output.

1.) Line [1], column [6], Rule ID: ENGLISH_WORD_REPEAT_RULE
Message: Possible typo: you repeated a word
Suggestion: is
This is is the main text.    A footnote may be set in r...

2.) Line [2], column [20], Rule ID: MORFOLOGIK_RULE_EN_GB
Message: Possible spelling mistake found
Suggestion: red; Rex; reds; redo; Red; Rede; redox; red x
...s the main text.    A footnote may be set in redx colour. 
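
The line and column numbers in these messages match the LaTeX input, not the extracted plain text. As a sketch of that last conversion step, the hypothetical helper `offset_to_line_col` below (not part of tex2txt or LanguageTool) turns a 0-based character offset in the source into the 1-based line and column style used in the output above.

```python
def offset_to_line_col(source, offset):
    """Convert a 0-based character offset into 1-based (line, column)."""
    # Each newline before the offset starts a new line.
    line = source.count("\n", 0, offset) + 1
    # rfind returns -1 when there is no earlier newline, which makes
    # the column of the very first character come out as 1.
    last_newline = source.rfind("\n", 0, offset)
    column = offset - last_newline
    return line, column


# The example input from this page.
source = (
    "This is\\footnote{A footnote may be set\n"
    "in \\textcolor{red}{redx colour.}}\n"
    "is the main text.\n"
)

print(offset_to_line_col(source, 5))                     # (1, 6)
print(offset_to_line_col(source, source.index("redx")))  # (2, 20)
```

Both results agree with the positions reported in the two messages: the repeated "is" is flagged at the first occurrence in line 1, and "redx" is located inside the footnote on line 2.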

Files for tex2txt, version 1.6.5

Filename                        Size     File type  Python version
tex2txt-1.6.5-py3-none-any.whl  36.0 kB  Wheel      py3
tex2txt-1.6.5.tar.gz            23.8 kB  Source     None
