Skip to main content

Convert LaTeX documents to Word (.docx) preserving TikZ diagrams, cover page, table of contents, acronyms, and bibliography.

Project description

tex2docx

PyPI version License: MIT Python 3.8+

Convert LaTeX documents to Word (.docx) while preserving TikZ diagrams, cover pages, table of contents, glossaries, and bibliography.

Academic and technical documents written in LaTeX often need to be shared as Word files for collaboration. tex2docx automates this conversion in a single command, rendering TikZ diagrams as high-resolution PNGs and keeping the rest as editable text.

๐Ÿ‡ช๐Ÿ‡ธ ยฟHablas espaรฑol? Lee la guรญa completa en espaรฑol.

โœจ Features

LaTeX Element Word Output
TikZ diagrams High-res PNG images (300 DPI default)
Cover page (\maketitle) Image โ€” pixel-perfect from PDF
Table of contents Image โ€” pixel-perfect from PDF
Glossary (\printacronyms) Image โ€” pixel-perfect from PDF
Bibliography (thebibliography) Editable numbered list [1], [2]โ€ฆ
Sections, lists, text Editable text
Tables Editable tables
Acronyms (\ac{X}, \acp{X}) Resolved to plain text
Bold, italic, URLs Preserved

๐Ÿ“‹ Prerequisites

Tool Purpose Install
Python 3.8+ Run the script python.org
pdflatex (MiKTeX or TeX Live) Compile LaTeX & TikZ miktex.org / tug.org/texlive
Pandoc LaTeX โ†’ Word conversion pandoc.org
PyMuPDF PDF โ†’ PNG rendering pip install PyMuPDF

๐Ÿš€ Quick Start

# Install from PyPI
pip install tex2docx-converter

# Convert your document
tex2docx your_document.tex

This generates your_document_word.docx automatically.

๐Ÿ“– Usage

Basic conversion

tex2docx document.tex

Custom output filename

tex2docx document.tex -o deliverable.docx

Higher resolution diagrams

tex2docx document.tex --dpi 400

Skip cover/TOC/glossary extraction

For documents without a custom cover page or glossary:

tex2docx document.tex --no-pages

Custom page extraction

Extract only specific pages (0-indexed) with custom labels:

# Only cover and TOC (no glossary)
tex2docx document.tex --pages 0,1 --labels portada,indice

# Cover spans 2 pages, glossary on page 4
tex2docx document.tex --pages 0,2,3 --labels portada,indice,glosario

All options

tex2docx --help
usage: tex2docx.py [-h] [--output OUTPUT] [--dpi DPI] [--pages PAGES]
                   [--labels LABELS] [--no-pages] [--workdir WORKDIR] texfile

positional arguments:
  texfile               Input .tex file

options:
  --output, -o          Output .docx filename (default: <input>_word.docx)
  --dpi                 DPI for TikZ rendering (default: 300)
  --pages               Comma-separated 0-indexed page numbers (default: 0,1,2)
  --labels              Comma-separated labels for pages (default: portada,indice,glosario)
  --no-pages            Skip page extraction
  --workdir             Working directory for images (default: tikz_png)

โš™๏ธ How It Works

document.tex
    โ”‚
    โ”œโ”€[1] pdflatex โ”€โ”€โ–บ document.pdf (full compilation)
    โ”‚                     โ”‚
    โ”‚                [2] Extract pages as PNG (cover, TOC, glossary)
    โ”‚
    โ”œโ”€[3] Extract & compile each TikZ diagram as standalone PNG
    โ”‚
    โ”œโ”€[4] Replace \maketitle, \tableofcontents, \printacronyms
    โ”‚     with \includegraphics pointing to extracted PNGs
    โ”‚
    โ”œโ”€[4b] Convert \begin{thebibliography} to editable numbered list
    โ”‚
    โ”œโ”€[5] Resolve \ac{X} โ†’ X, \acp{X} โ†’ Xs
    โ”‚
    โ”œโ”€[6] Write clean intermediate .tex
    โ”‚
    โ””โ”€[7] pandoc -f latex -t docx โ”€โ”€โ–บ document_word.docx โœ…

๐Ÿ“ Generated Files

your_project/
โ”œโ”€โ”€ document.tex                โ† Your LaTeX source
โ”œโ”€โ”€ document.pdf                โ† Compiled PDF (step 1)
โ”œโ”€โ”€ document_word.docx          โ† Generated Word file โœ…
โ”œโ”€โ”€ document_intermediate.tex   โ† Intermediate file (can be deleted)
โ””โ”€โ”€ tikz_png/                   โ† Generated images (can be deleted)
    โ”œโ”€โ”€ page_portada.png
    โ”œโ”€โ”€ page_indice.png
    โ”œโ”€โ”€ page_glosario.png
    โ”œโ”€โ”€ fig_1.png
    โ”œโ”€โ”€ fig_2.png
    โ””โ”€โ”€ ...

๐Ÿ“ LaTeX Document Requirements

For best results, your .tex document should follow these conventions:

Element Convention
Cover \maketitle followed by \newpage
TOC \tableofcontents followed by \newpage
Glossary \printacronyms[...] followed by \newpage
Bibliography \begin{thebibliography}{99}...\end{thebibliography}
Acronyms \DeclareAcronym{X}{short = X, long = ...}
Diagrams \begin{tikzpicture}...\end{tikzpicture} in the document body

Note: TikZ pictures defined inside \newcommand in the preamble (e.g., logo placeholders) are automatically ignored. Only diagrams in the document body are processed.

๐Ÿค Contributing

Contributions are welcome! Feel free to:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Ideas for contributions

  • Support for beamer presentations
  • Custom Word template (.dotx) support
  • Support for pgfplots charts
  • GUI / web interface
  • Cross-references and \ref{} resolution
  • bibtex / biblatex support (in addition to thebibliography)

๐Ÿ“„ License

This project is licensed under the MIT License โ€” see the LICENSE file for details.

๐Ÿ’ก Tips

  • If you need the Word file to be 100% visually identical to the PDF, consider sharing the PDF directly. The Word output is ideal when others need to edit the text.
  • Use --dpi 400 for presentation-quality diagrams.
  • The intermediate .tex and tikz_png/ directory can be safely deleted after conversion.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

tex2docx_converter-0.1.1.tar.gz (6.4 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

tex2docx_converter-0.1.1-py3-none-any.whl (5.2 kB view details)

Uploaded Python 3

File details

Details for the file tex2docx_converter-0.1.1.tar.gz.

File metadata

  • Download URL: tex2docx_converter-0.1.1.tar.gz
  • Upload date:
  • Size: 6.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.13

File hashes

Hashes for tex2docx_converter-0.1.1.tar.gz
Algorithm Hash digest
SHA256 5671340c32ec5bb6fdeb927cfedff8558d75d8c2bf17516b8f8dcc1b46108a74
MD5 74b83b771ac146c6195f1d20ae04cf14
BLAKE2b-256 3b36a304bbc6da8acb15370a54d530df067903af8ae74c78c4d7042c14c0f318

See more details on using hashes here.

File details

Details for the file tex2docx_converter-0.1.1-py3-none-any.whl.

File metadata

File hashes

Hashes for tex2docx_converter-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 3e7ee9ff898be10f1175c0fe557f0e759e7d88ac00ad5be2318dc293ccd1afb5
MD5 a02f50bff2ba0335129f40829acf4c4c
BLAKE2b-256 af976b6a74bfb033579cd42f3cf746bd0e1ce3b049cb509991e916eea6e6f0c1

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page