latexport

A workflow for converting LaTeX documents to accessible HTML with PDF output.

Overview

This project converts .tex files into web-ready HTML pages using LaTeXML, while also generating PDF versions via pdflatex. The HTML output is customised with additional CSS, JavaScript, and accessibility enhancements.

Project Structure

latexport/
├── static/            # Shared CSS/JS assets (source of truth)
│   ├── css/
│   │   └── custom.css
│   └── js/
│       ├── custom.js
│       └── mathjax-config.js
├── latexml/           # Custom LaTeXML bindings (.ltxml); all loaded automatically
│   ├── amsmath-compat.ltxml
│   └── emph-in-math.ltxml
├── output/            # Generated output (seeded from static/ on each run)
│   ├── css/           # Copied from static/css/
│   ├── js/            # Copied from static/js/
│   ├── index.html     # Generated by latexport-index
│   └── {document}/    # Per-document output
│       ├── index.html
│       └── {document}.pdf
├── templates/         # HTML templates
├── main.py            # Main processing script
├── create_main_index.py  # Index page generator
├── embed_assets.py    # Self-contained HTML bundler
└── config.py          # Configuration settings

Prerequisites

Python 3.12+ and uv

Install uv, which manages the Python version and dependencies:

curl -LsSf https://astral.sh/uv/install.sh | sh

LaTeXML

LaTeXML converts .tex files to HTML5.

macOS:

brew install latexml

Ubuntu / Debian:

sudo apt install latexml

Other: see the LaTeXML installation docs.

TeX distribution (pdflatex)

A TeX distribution provides pdflatex, used to produce PDF output.

macOS:

brew install --cask mactex-no-gui

Ubuntu / Debian:

sudo apt install texlive-latex-base

Already have TeX Live? Install only pdflatex via tlmgr:

tlmgr install pdftex

bibtex is included with most TeX distributions. For biber (used with biblatex):

tlmgr install biber

Both are optional — latexport auto-detects whether they are needed based on the source file.
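As a rough illustration of what such auto-detection could look like, the sketch below inspects a LaTeX source string and picks a bibliography tool. The function name `detect_bib_tool` and the heuristics (any `\...cite...` command triggers a run; loading biblatex implies biber) are assumptions made for this example, not latexport's actual logic. In particular, biblatex can also be configured with a bibtex backend, which this sketch ignores.

```python
import re
from typing import Optional

# Hypothetical heuristics, for illustration only.
CITE_RE = re.compile(r"\\[a-zA-Z]*cite[a-zA-Z]*\*?\s*[\[{]")
BIBLATEX_RE = re.compile(r"\\usepackage(\[[^\]]*\])?\{biblatex\}")
COMMENT_RE = re.compile(r"(?<!\\)%.*")


def detect_bib_tool(tex_source: str) -> Optional[str]:
    """Return "biber", "bibtex", or None for a LaTeX source string."""
    # Strip comments so a commented-out \cite does not trigger a run.
    uncommented = COMMENT_RE.sub("", tex_source)
    if not CITE_RE.search(uncommented):
        return None
    # Assumption: a document that loads biblatex is processed with biber.
    if BIBLATEX_RE.search(uncommented):
        return "biber"
    return "bibtex"
```

A caller would read the .tex file, pass its text to `detect_bib_tool`, and skip the bibliography pass entirely when the result is `None`.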

Installation

# Clone the repository
git clone <repository-url>
cd latexport

# Install Python dependencies and register CLI commands
uv sync
uv pip install -e .

Usage

1. Process LaTeX Files

Convert one or more .tex files to HTML and PDF:

# Process a single file
uv run latexport tex_src/example.tex

# Process multiple files
uv run latexport tex_src/file1.tex tex_src/file2.tex

# Write output to a custom directory instead of output/
uv run latexport -o ./public tex_src/example.tex

# Override the output subdirectory name (single file only)
uv run latexport --name lecture-notes tex_src/example.tex
# → output goes to output/lecture-notes/ instead of output/example/

# Dry run (preview without changes)
uv run latexport -n tex_src/example.tex

This will:

  • Seed the output directory with shared assets from static/
  • Auto-detect whether bibliography processing (bibtex/biber) is needed
  • If \cite commands are present: run bibtex/biber before LaTeXML so citations resolve in HTML
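  • Generate HTML at {output}/{name}/index.html (via LaTeXML, with all latexml/*.ltxml bindings), where {name} is the source file's stem unless overridden with --name
  • Generate PDF at {output}/{name}/{stem}.pdf (via pdflatex, with bibtex/biber if needed); the PDF keeps the source file's stem even when --name is given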
  • Clean up auxiliary files (.aux, .log, .out, .bbl, .blg, .bcf, .run.xml)
  • Remove empty subdirectories left by pdflatex's \include handling
  • Inject custom CSS and JavaScript references
  • Replace QED symbols with accessible HTML
  • Consolidate local CSS files to the shared css/ folder

2. Generate Main Index Page

Create an index page listing all documents:

# Use the default output directory (from config.py)
uv run latexport-index

# Use a custom output directory
uv run latexport-index -o examples/output

This scans the output directory for index.html files and generates a main index with links to each document (and PDF if available).
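A minimal sketch of that scan, assuming each document lives one level below the output root and that any .pdf next to a document's index.html is its PDF. The names `collect_documents` and `render_links` are hypothetical and only illustrate the idea; the real generator may differ.

```python
from html import escape
from pathlib import Path


def collect_documents(root_dir: Path, pattern: str = "index.html") -> list:
    """Find per-document pages one level below root_dir."""
    docs = []
    # "*/index.html" skips the main index at the root itself.
    for page in sorted(root_dir.glob(f"*/{pattern}")):
        doc_dir = page.parent
        pdfs = sorted(doc_dir.glob("*.pdf"))
        docs.append({
            "name": doc_dir.name,
            "html": page.relative_to(root_dir).as_posix(),
            "pdf": pdfs[0].relative_to(root_dir).as_posix() if pdfs else None,
        })
    return docs


def render_links(docs: list) -> str:
    """Render one <li> per document, with a PDF link where available."""
    items = []
    for doc in docs:
        link = '<a href="{}">{}</a>'.format(escape(doc["html"]), escape(doc["name"]))
        if doc["pdf"]:
            link += ' (<a href="{}">PDF</a>)'.format(escape(doc["pdf"]))
        items.append(f"<li>{link}</li>")
    return "\n".join(items)
```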

3. Clean Up Log Files

Remove latexml.log files left behind by LaTeXML:

uv run latexport-clean

This removes latexml.log from the current directory and recursively from the output directory. During a normal latexport run these are cleaned up automatically; latexport-clean handles any leftovers from previous runs.

4. Bundle a Self-Contained HTML File

Inline all CSS and JS into a single portable file:

# Bundle with all assets inlined (CSS + JS) — default behaviour
uv run embed_assets.py output/example/index.html

# Bundle but skip remote assets (they remain as external references)
uv run embed_assets.py --skip-remote output/example/index.html

# Bundle CSS only — leave <script src> tags untouched
uv run embed_assets.py --skip-js output/example/index.html

# Write the bundled file to a custom path
uv run embed_assets.py output/example/index.html dist/standalone.html
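To make the bundling idea concrete, here is a simplified sketch of CSS inlining only. It roughly corresponds to the --skip-remote, --skip-js case: local stylesheets are embedded and remote ones stay as external references. The function `inline_local_css` and its regex-based parsing are assumptions for illustration; embed_assets.py itself may work quite differently.

```python
import re
from pathlib import Path

LINK_RE = re.compile(r'<link\b[^>]*\brel="stylesheet"[^>]*>', re.IGNORECASE)
HREF_RE = re.compile(r'\bhref="([^"]+)"', re.IGNORECASE)
REMOTE_RE = re.compile(r"(?:[a-z]+:)?//", re.IGNORECASE)


def inline_local_css(html: str, base_dir: Path) -> str:
    """Replace local <link rel="stylesheet"> tags with inline <style> blocks."""

    def replace(match):
        tag = match.group(0)
        href = HREF_RE.search(tag)
        # Leave remote (http://, https://, //cdn...) or malformed links alone.
        if href is None or REMOTE_RE.match(href.group(1)):
            return tag
        css_path = base_dir / href.group(1)
        if not css_path.is_file():
            return tag
        css = css_path.read_text(encoding="utf-8")
        return f"<style>\n{css}\n</style>"

    # A function replacement avoids backslash-escape surprises in the CSS text.
    return LINK_RE.sub(replace, html)
```

Here `base_dir` would be the directory containing the HTML file, so that relative href values resolve the same way a browser would resolve them.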

Configuration

Edit config.py to customise paths and settings:

OUTPUT_DIR = Path("./output")    # Root directory for generated output
STATIC_DIR = Path("./static")    # Shared CSS/JS source; copied into output on each run
LATEXML_DIR = Path(__file__).parent / "latexml"  # LaTeXML binding files (absolute path)
SRC_QED_SYMBOL = "∎"             # QED symbol to replace in HTML
ENCODING = "utf-8"               # File encoding

# Index generator settings
ROOT_DIR = OUTPUT_DIR
PATTERN = "index.html"
TEMPLATE_PATH = Path("templates/main_index_template.html")
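As an example of how a setting such as SRC_QED_SYMBOL might be used, the sketch below wraps each QED symbol in markup that carries a text label for screen readers. The replacement markup (a span with role="img" and an aria-label), the class name, the label text, and the function name `replace_qed` are all assumptions; the README does not specify what accessible HTML latexport actually emits.

```python
from html import escape

SRC_QED_SYMBOL = "∎"  # same value as in config.py


def replace_qed(html: str, symbol: str = SRC_QED_SYMBOL,
                label: str = "End of proof") -> str:
    """Wrap every occurrence of the QED symbol in labelled markup."""
    # Hypothetical markup: the real replacement may look different.
    accessible = (
        f'<span class="qed" role="img" aria-label="{escape(label)}">'
        f"{symbol}</span>"
    )
    return html.replace(symbol, accessible)
```

Note that this naive version is not idempotent: running it twice on the same HTML would nest the wrapper, so it should be applied once per generated file.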

Examples

Live demos are published at https://kalv25.github.io/latexport/.

Single standalone file — testmath.tex

Source: latex3/latex2e — © American Mathematical Society / LaTeX Project, LPPL 1.3c.

A self-contained file with no \include dependencies. The stem is overridden so the output folder has a descriptive name rather than the generic testmath.

uv run latexport \
  -o examples/output \
  --name latex2e-testmath \
  examples/tex_src/testmath.tex

Output:

examples/output/latex2e-testmath/index.html
examples/output/latex2e-testmath/testmath.pdf

Live: https://kalv25.github.io/latexport/latex2e-testmath/


Multi-part document — hermish-proofs-notes/main.tex

Source: hermish/proofs-notes — CS70 lecture notes by Hermish Mehta.

A document split across multiple files via \include. latexport creates the required subdirectories for pdflatex, then removes them once they are empty after aux file cleanup.

uv run latexport \
  -o examples/output \
  --name hermish-proofs-notes \
  examples/tex_src/hermish-proofs-notes/main.tex

Output:

examples/output/hermish-proofs-notes/index.html
examples/output/hermish-proofs-notes/main.pdf

Live: https://kalv25.github.io/latexport/hermish-proofs-notes/
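The subdirectory handling described for this example can be sketched roughly as below: find the directories referenced by \include targets and create them under the build directory before pdflatex runs, presumably so that the per-file .aux output has somewhere to go. The helper names `include_subdirs` and `prepare_build_dirs` are hypothetical, and the regex ignores commented-out lines and nested includes.

```python
import re
from pathlib import Path

# Matches \include{...} but not \includegraphics{...} or \includeonly{...}.
INCLUDE_RE = re.compile(r"\\include\{([^}]+)\}")


def include_subdirs(tex_source: str) -> set:
    """Return the relative directories referenced by \\include targets."""
    dirs = set()
    for target in INCLUDE_RE.findall(tex_source):
        parent = Path(target.strip()).parent
        if parent != Path("."):
            dirs.add(parent.as_posix())
    return dirs


def prepare_build_dirs(tex_source: str, build_dir: Path) -> list:
    """Create each referenced subdirectory under build_dir."""
    created = []
    for rel in sorted(include_subdirs(tex_source)):
        path = build_dir / rel
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```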


Generate the main index

After converting one or more documents, build the navigable index page:

uv run latexport-index -o examples/output

This scans examples/output/ and writes examples/output/index.html with links to each document (and its PDF where available).


Typical Workflow

  1. Write LaTeX — Create/edit .tex files in tex_src/
  2. Convert to HTML/PDF — Run uv run latexport tex_src/yourfile.tex
  3. Regenerate index — Run uv run latexport-index
  4. Deploy — Upload output/ to your web server

Customisation

Custom CSS

Edit static/css/custom.css. This file is automatically copied into the output directory and injected into every processed HTML file.

Custom JavaScript

Edit files in static/js/. The following are automatically injected:

  • custom.js — Page-width slider, MathJax toggle, go-to-top button
  • mathjax-config.js — MathJax configuration

Localising the toolbar (custom.js)

All user-visible strings in the toolbar are read from window.latexportI18n. To override them for another language, add a <script> block before custom.js loads:

<script>
  window.latexportI18n = {
      widthLabel:     'Breite',
      widthAriaLabel: 'Seitenbreite in ch-Einheiten',
      mathOn:         'Formel ✓',
      mathOff:        'Formel ✗',
      mathAriaOn:     'MathJax-Darstellung ein',
      mathAriaOff:    'MathJax-Darstellung aus',
      goToTopAria:    'Zum Seitenanfang',
  };
</script>

Only the keys you want to change need to be provided; omitted keys fall back to the English defaults.

LaTeXML Bindings

Custom LaTeXML behaviour is defined in .ltxml files inside latexml/. These are Perl modules loaded via --preload on every latexmlc invocation. All .ltxml files in latexml/ are loaded automatically (alphabetical order) — no changes to main.py needed when adding new ones.

Currently included:

  • amsmath-compat.ltxml — no-op stubs for amsmath internal commands (e.g. \ctagsplit@true) that would otherwise cause "undefined macro" errors.
  • emph-in-math.ltxml — redefines \emph{…} as \mathit{…} inside math environments, \textit{…} elsewhere.

To add a new binding, simply create a .ltxml file in latexml/.
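The automatic loading can be pictured as turning every .ltxml file into a --preload argument, in alphabetical order. The sketch below shows one way to do that; `preload_args` and `build_command` are hypothetical names, and the extra options passed to latexmlc here (--format=html5 and --destination) are an assumption about a plausible invocation, not a record of what main.py actually runs.

```python
from pathlib import Path


def preload_args(latexml_dir: Path) -> list:
    """Return one --preload flag per .ltxml file, sorted by file name."""
    bindings = sorted(latexml_dir.glob("*.ltxml"), key=lambda p: p.name)
    return [f"--preload={binding}" for binding in bindings]


def build_command(source: Path, destination: Path, latexml_dir: Path) -> list:
    """Assemble an argument list suitable for subprocess.run."""
    return [
        "latexmlc",
        *preload_args(latexml_dir),
        "--format=html5",                 # assumed output format option
        f"--destination={destination}",   # assumed destination option
        str(source),
    ]
```

Because the flags are derived from a directory listing, dropping a new file into latexml/ changes the command without any edit to the calling code.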

Index Template

Edit templates/main_index_template.html. The template uses Python str.format-style placeholders:

Placeholder         Default           Description
{lang}              en                <html lang> attribute
{title}             Documents         <title> and <meta name="description">
{description}       Document index    Meta description content
{heading}           Documents         <h1> text
{contents_label}    Contents          <h3> section label
{links}             (generated)       Rendered <li> elements, filled automatically

To generate the index in another language, pass keyword arguments to create_main_index_page:

create_main_index_page(
    root_dir=Path("output"),
    lang="de",
    title="Dokumente",
    description="Dokumentenindex",
    heading="Dokumente",
    contents_label="Inhalt",
)
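To show how str.format-style placeholders and their defaults fit together, here is a small self-contained sketch. The defaults mirror the placeholder list above; the function `render_index` and the inline template string are hypothetical stand-ins, not the code behind create_main_index_page.

```python
# Defaults taken from the placeholder list above.
DEFAULTS = {
    "lang": "en",
    "title": "Documents",
    "description": "Document index",
    "heading": "Documents",
    "contents_label": "Contents",
}

# Stand-in for templates/main_index_template.html.
TEMPLATE = (
    '<html lang="{lang}"><head><title>{title}</title>'
    '<meta name="description" content="{description}"></head>'
    "<body><h1>{heading}</h1><h3>{contents_label}</h3>"
    "<ul>{links}</ul></body></html>"
)


def render_index(template: str, links: str, **overrides) -> str:
    """Fill the template, falling back to DEFAULTS for omitted keys."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise TypeError(f"unknown placeholder(s): {sorted(unknown)}")
    values = {**DEFAULTS, **overrides, "links": links}
    return template.format(**values)
```

One practical consequence of str.format: any literal braces in the template, for example in inline CSS or JavaScript, have to be doubled ({{ and }}) so they are not read as placeholders.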

Caveats

SVG Dark Mode

SVG images (e.g., diagrams generated by TikZ) use a simple CSS filter to invert colours in dark mode. This works well for simple black-and-white diagrams but may produce unexpected results when multiple colours are used. Always test your documents in dark mode to verify SVG rendering.

LaTeXML Conversion Limitations

LaTeXML does not support all LaTeX packages and document structures. Known cases where HTML conversion fails or produces degraded output:

Multi-part documents — Projects where the root .tex file relies on a custom build system, non-standard \include chaining, or shared preamble files split across multiple directories may not convert correctly. LaTeXML resolves includes relative to --sourcedirectory; files outside that tree are not found.

memoir class — Documents using the memoir document class are not reliably converted. LaTeXML has limited support for memoir's extended sectioning, captioning, and page-layout commands. For example, the UiO Introduction to LaTeX repository uses memoir and fails to produce usable HTML output.

In these cases pdflatex still produces a correct PDF; only the HTML output is affected. Consider restructuring such documents to use a standard class (article, report, book) for full LaTeXML compatibility.

Contributing

Contributions are welcome — see CONTRIBUTING.md for setup instructions, code style, and how to submit a pull request.

License

MIT — see LICENSE.
