A traceable LaTeX flattener.

flachtex

A tool to flatten complex LaTeX documents, i.e., to create a single file out of a multi-file document structure. Its primary feature is that it remembers which file (and which position within that file) each character came from. Additionally, it provides some extra commands that let you tell flachtex explicitly what to do, even in complicated scenarios.

There are many other tools for flattening LaTeX, but I did not find one that could handle my dissertation (which uses subimport and has some logic involved) and could also tell you where each word came from. The second part is important for automated spelling, grammar, and code checkers: in a long document, you do not want to search for where an error was made (in general, no auto-fix is possible). flachtex solves this problem but is slower than other tools because of the additional bookkeeping. There is still room to improve the performance, but it is fast enough for me.

Currently, flachtex supports file inclusions of the following form:

% native includes/inputs
\include{path/file.tex}
\input{path/file.tex}

% subimport
\subimport{path}{file}
\subimport*{path}{file}

% manual import
%%FLACHTEX-EXPLICIT-IMPORT[path/to/file]
%%FLACHTEX-SKIP-START
Complex import logic that cannot be parsed by flachtex.
%%FLACHTEX-SKIP-STOP

Installation

flachtex is available via pip: pip install flachtex.

CLI

The tool comes with a simple CLI:

usage: flachtex [-h] [--to_json] [--remove_comments] [--attach] path

flachtex: Traceable LaTeX flattening.

positional arguments:
  path               Path to main.tex

optional arguments:
  -h, --help         show this help message and exit
  --to_json          Return a json.
  --remove_comments  Remove comments.
  --attach           Attach sources to json.

Python

You can also directly use it in Python.

from flachtex.comments import remove_comments
from flachtex import FileFinder, expand_file

# For this example, we provide the files as a dictionary. You can skip the part with
# the FileFinder if you are working on your file system.
document = {
    "main.tex":
        r"""
% This is a test document. We skip the common preamble of LaTeX.
\section{Main}
Hello!
\include{./modules/part1.tex}
\input{./modules/part2.tex}
%%FLACHTEX-EXPLICIT-IMPORT[./modules/part3.tex]
%%FLACHTEX-SKIP-START
\compleximportlogic{part3.tex}
%%FLACHTEX-SKIP-STOP
    """,
    "modules/part1.tex": "I am part1!\n",
    "modules/part2.tex": "I am part2!\n",
    "modules/part3.tex": "I am part3!\n",
}

file_finder = FileFinder("/", "main.tex", document)
flat_doc = expand_file("main.tex", file_finder=file_finder)
# you can also use flat_doc.get_origin(pos)

print(remove_comments(flat_doc).to_json())

This prints:

{'content': '\n\\section{Main}\nHello!\nI am part1!\n\nI am part2!\n\nI am part3!\n\n\n    ', 
  'origins': [{'begin': 0, 'end': 1, 'origin': 'main.tex', 'offset': 0}, 
    {'begin': 1, 'end': 23, 'origin': 'main.tex', 'offset': 66}, 
    {'begin': 23, 'end': 35, 'origin': 'modules/part1.tex', 'offset': 0},
    {'begin': 35, 'end': 36, 'origin': 'main.tex', 'offset': 117},
    {'begin': 36, 'end': 48, 'origin': 'modules/part2.tex', 'offset': 0},
    {'begin': 48, 'end': 49, 'origin': 'main.tex', 'offset': 145}, 
    {'begin': 49, 'end': 61, 'origin': 'modules/part3.tex', 'offset': 0},
    {'begin': 61, 'end': 62, 'origin': 'main.tex', 'offset': 193}, 
    {'begin': 62, 'end': 67, 'origin': 'main.tex', 'offset': 267}]}

(The superfluous line breaks and spaces are still an issue, but not an urgent one.)
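To illustrate how the origins list can be used, here is a minimal sketch that maps a position in the flattened content back to a file and a position within that file. It works only on the JSON structure shown above and is independent of flachtex's own get_origin; the helper name locate is hypothetical. It assumes that the entries are sorted by 'begin', that each range is half-open (begin inclusive, end exclusive), and that 'offset' is the position in the origin file that corresponds to 'begin'.

```python
from bisect import bisect_right

# The first three entries of the 'origins' list from the output above.
origins = [
    {"begin": 0, "end": 1, "origin": "main.tex", "offset": 0},
    {"begin": 1, "end": 23, "origin": "main.tex", "offset": 66},
    {"begin": 23, "end": 35, "origin": "modules/part1.tex", "offset": 0},
]


def locate(origins, pos):
    """Map a position in the flat content to (file, position in that file).

    Hypothetical helper, not part of flachtex. Assumes sorted, half-open ranges.
    """
    begins = [entry["begin"] for entry in origins]
    # Index of the last range that starts at or before pos.
    i = bisect_right(begins, pos) - 1
    if i < 0 or pos >= origins[i]["end"]:
        raise IndexError(f"position {pos} is not covered by any origin")
    entry = origins[i]
    return entry["origin"], entry["offset"] + (pos - entry["begin"])


print(locate(origins, 1))   # start of \section{Main} in main.tex
print(locate(origins, 30))  # inside "I am part1!"
```

A spell checker that reports an error at position 30 of the flattened text could use such a lookup to point at position 7 of modules/part1.tex instead.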

Path Resolution

flachtex first tries to resolve an inclusion relative to the calling file. If no file is found (it also tries the name with ".tex" appended), it tries the document folder (cwd) and the folder of the root tex-file. After that, it tries the parent directories.

If this is not sufficient, try to use the %%FLACHTEX-EXPLICIT-IMPORT[path/file.tex] option.
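The search order described above can be sketched as a generator of candidate paths. This is only an illustration, not flachtex's implementation: the function name candidate_paths and its parameters are hypothetical, and it assumes that "parent directories" means the parents of the calling file's folder and that ".tex" is only appended when the name does not already end in it.

```python
from pathlib import PurePosixPath


def candidate_paths(include, calling_file, root_file, cwd):
    """Yield candidate locations for an inclusion, in search order.

    Hypothetical helper; the order follows the description in the text.
    """
    include = PurePosixPath(include)
    names = [include]
    if include.suffix != ".tex":
        # Also try the name with an additional ".tex".
        names.append(include.with_name(include.name + ".tex"))

    calling_dir = PurePosixPath(calling_file).parent
    root_dir = PurePosixPath(root_file).parent
    # 1. the calling file's folder, 2. cwd, 3. the root file's folder
    bases = [calling_dir, PurePosixPath(cwd), root_dir]
    # 4. parent directories (assumption: of the calling file's folder)
    bases.extend(calling_dir.parents)

    seen = set()
    for base in bases:
        for name in names:
            path = base / name
            if path not in seen:  # skip locations that were already tried
                seen.add(path)
                yield path


for path in candidate_paths("part1", "/doc/chapters/intro.tex", "/doc/main.tex", "/doc"):
    print(path)
```

A real resolver would return the first candidate that exists on disk (or in the dictionary passed to the FileFinder).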

Extending the tool

flachtex has a modular structure that allows it to receive additional rules or replace existing ones. You can find the current rules in ./flachtex/rules.py. All current rules are implemented as regular expressions.

It is important that the matches do not overlap. For efficiency, flachtex first finds all matches and only then includes the files. Overlapping matches would need a complex resolution and may result in unexpected output. (It would not be too difficult to add some simple resolution rules instead of simply throwing an exception.)
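The following sketch shows the general idea of regex-based rules with an overlap check. The patterns and the names RULES and find_imports are hypothetical and are not taken from ./flachtex/rules.py; the actual rules may look different.

```python
import re

# Hypothetical rules for illustration only; see ./flachtex/rules.py for the real ones.
RULES = [
    re.compile(r"\\(?:input|include)\{(?P<path>[^}]+)\}"),
    re.compile(r"%%FLACHTEX-EXPLICIT-IMPORT\[(?P<path>[^\]]+)\]"),
]


def find_imports(text):
    """Collect (start, end, path) for all rule matches and reject overlaps."""
    spans = []
    for rule in RULES:
        for match in rule.finditer(text):
            spans.append((match.start(), match.end(), match.group("path")))
    spans.sort()
    # After sorting by start, any overlap shows up between two neighbours.
    for (_, prev_end, _), (start, _, _) in zip(spans, spans[1:]):
        if start < prev_end:
            raise ValueError(f"overlapping matches at position {start}")
    return spans


text = "Hello\n\\input{a.tex}\n%%FLACHTEX-EXPLICIT-IMPORT[b.tex]\n"
for start, end, path in find_imports(text):
    print(start, end, path)
```

Finding all matches before touching the text keeps the positions valid, which is what makes the later bookkeeping of origins straightforward.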

This tool is still a work in progress.
