tex2docx

Convert LaTeX documents to Word (.docx) while preserving TikZ diagrams, cover pages, table of contents, glossaries, and bibliography.

Academic and technical documents written in LaTeX often need to be shared as Word files for collaboration. tex2docx automates this conversion in a single command, rendering TikZ diagrams as high-resolution PNGs and keeping the rest as editable text.

🇪🇸 Do you speak Spanish? Read the full guide in Spanish.

✨ Features

| LaTeX Element | Word Output |
| --- | --- |
| TikZ diagrams | High-res PNG images (300 DPI default) |
| Cover page (\maketitle) | Image, pixel-perfect from PDF |
| Table of contents | Image, pixel-perfect from PDF |
| Glossary (\printacronyms) | Image, pixel-perfect from PDF |
| Bibliography (thebibliography) | Editable numbered list [1], [2]… |
| Sections, lists, text | Editable text |
| Tables | Editable tables |
| Acronyms (\ac{X}, \acp{X}) | Resolved to plain text |
| Bold, italic, URLs | Preserved |

📋 Prerequisites

| Tool | Purpose | Install |
| --- | --- | --- |
| Python 3.8+ | Run the script | python.org |
| pdflatex (MiKTeX or TeX Live) | Compile LaTeX & TikZ | miktex.org / tug.org/texlive |
| Pandoc | LaTeX → Word conversion | pandoc.org |
| PyMuPDF | PDF → PNG rendering | pip install PyMuPDF |
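
Before converting, it can help to confirm that the tools in the table are actually reachable. The following is a minimal sketch, not part of tex2docx: the helper name `missing_tools` is hypothetical, and it assumes the executables are called `pdflatex` and `pandoc` and that PyMuPDF is importable under its traditional module name `fitz`.

```python
import importlib.util
import shutil


def missing_tools(executables=("pdflatex", "pandoc"), modules=("fitz",)):
    """Return the names of prerequisites that cannot be found.

    Executables are looked up on PATH; Python modules are looked up
    without importing them.
    """
    missing = [exe for exe in executables if shutil.which(exe) is None]
    missing += [mod for mod in modules if importlib.util.find_spec(mod) is None]
    return missing


if __name__ == "__main__":
    problems = missing_tools()
    if problems:
        print("Missing: " + ", ".join(problems))
    else:
        print("All prerequisites found")
```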

🚀 Quick Start

# Clone the repository
git clone https://github.com/scavero/tex2docx.git
cd tex2docx

# Install the package and its dependencies
pip install .

# Convert your document
python tex2docx.py your_document.tex

This generates your_document_word.docx automatically.

📖 Usage

Basic conversion

python tex2docx.py document.tex

Custom output filename

python tex2docx.py document.tex -o deliverable.docx

Higher resolution diagrams

python tex2docx.py document.tex --dpi 400
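
PDF coordinates are measured in points (72 per inch), so a DPI value corresponds to a zoom factor of dpi / 72 when a page or figure is rasterized. The sketch below shows that arithmetic and one way it could be handed to PyMuPDF. It is an illustration under that assumption, not the tool's actual code, and `zoom_for_dpi`, `pixel_size`, and `render_page` are hypothetical names.

```python
def zoom_for_dpi(dpi):
    """Scale factor from PDF points (72 per inch) to pixels at `dpi`."""
    return dpi / 72.0


def pixel_size(width_pt, height_pt, dpi=300):
    """Approximate pixel dimensions of a page or figure rendered at `dpi`."""
    zoom = zoom_for_dpi(dpi)
    return round(width_pt * zoom), round(height_pt * zoom)


def render_page(pdf_path, page_index, out_png, dpi=300):
    """Render one 0-indexed PDF page to a PNG file using PyMuPDF."""
    import fitz  # PyMuPDF; imported here so the helpers above work without it

    zoom = zoom_for_dpi(dpi)
    with fitz.open(pdf_path) as doc:
        pix = doc[page_index].get_pixmap(matrix=fitz.Matrix(zoom, zoom))
        pix.save(out_png)
```

For a diagram 360 pt (5 inches) wide, `pixel_size(360, 180, dpi=300)` gives 1500 × 750 pixels and `dpi=400` gives 2000 × 1000, so the higher setting costs roughly 1.8 times as many pixels.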

Skip cover/TOC/glossary extraction

For documents without a custom cover page or glossary:

python tex2docx.py document.tex --no-pages

Custom page extraction

Extract only specific pages (0-indexed) with custom labels:

# Only cover and TOC (no glossary)
python tex2docx.py document.tex --pages 0,1 --labels portada,indice

# Cover spans 2 pages, so the TOC is at index 2 and the glossary at index 3 (page 4 of the PDF)
python tex2docx.py document.tex --pages 0,2,3 --labels portada,indice,glosario
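
The --pages and --labels values are positional pairs: the n-th label names the n-th page. A minimal sketch of that pairing follows, assuming output files use the page_<label>.png pattern listed under Generated Files. `plan_page_images` is a hypothetical helper, and rejecting mismatched counts is a choice made for this sketch rather than documented behavior.

```python
from pathlib import Path


def plan_page_images(pages, labels, workdir="tikz_png"):
    """Map 0-indexed PDF pages to output PNG paths.

    `pages` and `labels` are comma-separated strings as given on the
    command line, e.g. "0,2,3" and "portada,indice,glosario".
    """
    page_numbers = [int(p) for p in pages.split(",") if p.strip()]
    names = [name.strip() for name in labels.split(",") if name.strip()]
    if len(page_numbers) != len(names):
        raise ValueError(
            f"{len(page_numbers)} pages but {len(names)} labels; counts must match"
        )
    if any(n < 0 for n in page_numbers):
        raise ValueError("page numbers are 0-indexed and cannot be negative")
    return {
        n: Path(workdir) / f"page_{name}.png"
        for n, name in zip(page_numbers, names)
    }


if __name__ == "__main__":
    for page, target in plan_page_images("0,2,3", "portada,indice,glosario").items():
        print(page, target)
```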

All options

python tex2docx.py --help
usage: tex2docx.py [-h] [--output OUTPUT] [--dpi DPI] [--pages PAGES]
                   [--labels LABELS] [--no-pages] [--workdir WORKDIR] texfile

positional arguments:
  texfile               Input .tex file

options:
  --output, -o          Output .docx filename (default: <input>_word.docx)
  --dpi                 DPI for TikZ rendering (default: 300)
  --pages               Comma-separated 0-indexed page numbers (default: 0,1,2)
  --labels              Comma-separated labels for pages (default: portada,indice,glosario)
  --no-pages            Skip page extraction
  --workdir             Working directory for images (default: tikz_png)
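
For readers who want to script around the tool, a parser that accepts the same options as the help text above could be declared with argparse roughly as follows. This is a reconstruction from the help output, not the project's source. `build_parser` and `default_output` are hypothetical names, and treating --dpi as an integer is an assumption.

```python
import argparse
from pathlib import Path


def build_parser():
    """Declare the options listed in the help text."""
    parser = argparse.ArgumentParser(prog="tex2docx.py")
    parser.add_argument("texfile", help="Input .tex file")
    parser.add_argument("--output", "-o",
                        help="Output .docx filename (default: <input>_word.docx)")
    parser.add_argument("--dpi", type=int, default=300,
                        help="DPI for TikZ rendering")
    parser.add_argument("--pages", default="0,1,2",
                        help="Comma-separated 0-indexed page numbers")
    parser.add_argument("--labels", default="portada,indice,glosario",
                        help="Comma-separated labels for pages")
    parser.add_argument("--no-pages", action="store_true",
                        help="Skip page extraction")
    parser.add_argument("--workdir", default="tikz_png",
                        help="Working directory for images")
    return parser


def default_output(texfile):
    """Derive <input>_word.docx next to the input file."""
    path = Path(texfile)
    return path.with_name(f"{path.stem}_word.docx")


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.output or default_output(args.texfile))
```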

โš™๏ธ How It Works

document.tex
    โ”‚
    โ”œโ”€[1] pdflatex โ”€โ”€โ–บ document.pdf (full compilation)
    โ”‚                     โ”‚
    โ”‚                [2] Extract pages as PNG (cover, TOC, glossary)
    โ”‚
    โ”œโ”€[3] Extract & compile each TikZ diagram as standalone PNG
    โ”‚
    โ”œโ”€[4] Replace \maketitle, \tableofcontents, \printacronyms
    โ”‚     with \includegraphics pointing to extracted PNGs
    โ”‚
    โ”œโ”€[4b] Convert \begin{thebibliography} to editable numbered list
    โ”‚
    โ”œโ”€[5] Resolve \ac{X} โ†’ X, \acp{X} โ†’ Xs
    โ”‚
    โ”œโ”€[6] Write clean intermediate .tex
    โ”‚
    โ””โ”€[7] pandoc -f latex -t docx โ”€โ”€โ–บ document_word.docx โœ…
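
Steps 4b and 5 are plain text rewrites of the LaTeX source, and could be sketched with regular expressions as below. This is an illustration of the idea, not the tool's implementation: the function names are hypothetical, the "[n] text" paragraph format is an assumption about what the numbered list looks like, and nested braces inside \ac{...} or \bibitem{...} are not handled.

```python
import re

AC_PATTERN = re.compile(r"\\ac(p?)\{([^{}]+)\}")
BIB_PATTERN = re.compile(
    r"\\begin\{thebibliography\}\{[^{}]*\}(.*?)\\end\{thebibliography\}",
    re.DOTALL,
)
BIBITEM_PATTERN = re.compile(r"\\bibitem(?:\[[^\]]*\])?\{[^{}]*\}")


def resolve_acronyms(tex):
    """Step 5: rewrite \\ac{X} as X and \\acp{X} as Xs."""
    return AC_PATTERN.sub(
        lambda m: m.group(2) + ("s" if m.group(1) else ""), tex
    )


def bibliography_to_list(tex):
    """Step 4b: turn each \\bibitem into a plain "[n] text" paragraph."""

    def convert(match):
        # Splitting on \bibitem leaves whatever precedes the first entry
        # at index 0, so that piece is discarded.
        entries = BIBITEM_PATTERN.split(match.group(1))[1:]
        lines = [
            f"[{n}] {' '.join(entry.split())}"
            for n, entry in enumerate(entries, start=1)
        ]
        return "\n\n".join(lines)

    return BIB_PATTERN.sub(convert, tex)


if __name__ == "__main__":
    sample = r"""The \ac{API} and two \acp{SDK}.
\begin{thebibliography}{99}
\bibitem{first} First placeholder entry.
\bibitem{second} Second placeholder entry.
\end{thebibliography}"""
    print(bibliography_to_list(resolve_acronyms(sample)))
```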

๐Ÿ“ Generated Files

your_project/
โ”œโ”€โ”€ document.tex                โ† Your LaTeX source
โ”œโ”€โ”€ document.pdf                โ† Compiled PDF (step 1)
โ”œโ”€โ”€ document_word.docx          โ† Generated Word file โœ…
โ”œโ”€โ”€ document_intermediate.tex   โ† Intermediate file (can be deleted)
โ””โ”€โ”€ tikz_png/                   โ† Generated images (can be deleted)
    โ”œโ”€โ”€ page_portada.png
    โ”œโ”€โ”€ page_indice.png
    โ”œโ”€โ”€ page_glosario.png
    โ”œโ”€โ”€ fig_1.png
    โ”œโ”€โ”€ fig_2.png
    โ””โ”€โ”€ ...

๐Ÿ“ LaTeX Document Requirements

For best results, your .tex document should follow these conventions:

Element Convention
Cover \maketitle followed by \newpage
TOC \tableofcontents followed by \newpage
Glossary \printacronyms[...] followed by \newpage
Bibliography \begin{thebibliography}{99}...\end{thebibliography}
Acronyms \DeclareAcronym{X}{short = X, long = ...}
Diagrams \begin{tikzpicture}...\end{tikzpicture} in the document body

Note: TikZ pictures defined inside \newcommand in the preamble (e.g., logo placeholders) are automatically ignored. Only diagrams in the document body are processed.
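
One way to get the body-only behavior described in the note is to search for tikzpicture environments only after \begin{document}, then wrap each match in a small standalone document for compilation. The sketch below assumes that approach. The function names and the `border=2pt` option are illustrative choices, a real run would reuse the TikZ libraries from the original preamble, and nested tikzpicture environments are not handled.

```python
import re

TIKZ_PATTERN = re.compile(
    r"\\begin\{tikzpicture\}.*?\\end\{tikzpicture\}", re.DOTALL
)


def body_tikz_pictures(tex):
    """Return tikzpicture environments found after \\begin{document}."""
    _, marker, body = tex.partition(r"\begin{document}")
    if not marker:
        return []  # no document body, so nothing to extract
    return TIKZ_PATTERN.findall(body)


def standalone_source(picture, preamble=r"\usepackage{tikz}"):
    """Wrap one picture in a minimal document that can be compiled alone."""
    return "\n".join([
        r"\documentclass[border=2pt]{standalone}",
        preamble,
        r"\begin{document}",
        picture,
        r"\end{document}",
    ])


if __name__ == "__main__":
    sample = r"""\documentclass{article}
\newcommand{\logo}{\begin{tikzpicture}\draw (0,0) circle (1);\end{tikzpicture}}
\begin{document}
\begin{tikzpicture}\draw (0,0) -- (1,1);\end{tikzpicture}
\end{document}"""
    for picture in body_tikz_pictures(sample):
        print(standalone_source(picture))
```

In the sample, the picture inside \newcommand sits before \begin{document}, so only the second one is returned.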

๐Ÿค Contributing

Contributions are welcome! Feel free to:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Ideas for contributions

  • Support for beamer presentations
  • Custom Word template (.dotx) support
  • Support for pgfplots charts
  • GUI / web interface
  • Cross-references and \ref{} resolution
  • bibtex / biblatex support (in addition to thebibliography)

📄 License

This project is licensed under the MIT License. See the LICENSE file for details.

💡 Tips

  • If you need the Word file to be 100% visually identical to the PDF, consider sharing the PDF directly. The Word output is ideal when others need to edit the text.
  • Use --dpi 400 for presentation-quality diagrams.
  • The intermediate .tex and tikz_png/ directory can be safely deleted after conversion.
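
The cleanup mentioned in the last tip can be scripted. Here is a small sketch, assuming the file names shown under Generated Files (`<input>_intermediate.tex` and a `tikz_png/` directory next to the source). `clean_conversion_artifacts` is a hypothetical helper and is not shipped with the tool.

```python
import shutil
from pathlib import Path


def clean_conversion_artifacts(texfile, workdir="tikz_png"):
    """Delete <input>_intermediate.tex and the image directory, if present.

    The source .tex, the compiled PDF, and the generated .docx are left alone.
    Returns the list of paths that were removed.
    """
    tex_path = Path(texfile)
    intermediate = tex_path.with_name(f"{tex_path.stem}_intermediate.tex")
    images = tex_path.parent / workdir
    removed = []
    if intermediate.is_file():
        intermediate.unlink()
        removed.append(intermediate)
    if images.is_dir():
        shutil.rmtree(images)
        removed.append(images)
    return removed
```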
