Command line tool to extract review changes from a docx file as plain text


Command line tool to extract review changes and comments from a docx file as plain text. It is particularly useful after reviewing PDF files in a docx editor (e.g., MS Word, Google Docs).

How to install?

pip install docxreviews2txt

How to use it?

usage: docxreviews2txt [-h] [--version] docx

Command line tool to extract review changes from a docx file as plain text

positional arguments:
  docx        input docx

  -h, --help  show this help message and exit
  --version   show version


$ docxreviews2txt tests/lorem_ipsum.docx
txt reviews at file:///C:/Users/alan/src/docxreviews2txt/tests/lorem_ipsum_review.txt
$ cat c:/Users/alan/src/docxreviews2txt/tests/lorem_ipsum_review.txt
# comments
- This is a comment from docx
# Typos and rewriting suggestions
# Typos suggestions (using HTML tags <ins> and <del>)
- dolor sit amet, consectetur <ins>Lorem ipsum</ins><del>adipiscing</del>
- sit amet, consectetur adipiscing<ins>s</ins> elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim <ins>do</ins>
- Ut enim ad minim <ins>Lorem</ins>veniam<ins>ipsum</ins>
- dolor sit amet, consectetur <del>adipiscing</del>
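
The review file shown above is plain text, so it can be post-processed with a few lines of Python. The following is a minimal sketch, not part of docxreviews2txt: the function name `parse_suggestions` is hypothetical, and it assumes the output keeps the layout of the sample (section headers starting with `#`, one suggestion per `- ` line, changes wrapped in `<ins>` and `<del>` tags).

```python
import re

# Matches <ins>...</ins> or <del>...</del> spans as they appear in the sample output.
TAG_RE = re.compile(r"<(ins|del)>(.*?)</\1>")


def parse_suggestions(review_text):
    """Return a list of (kind, text) pairs for every tagged span in bullet lines.

    Hypothetical helper: assumes the layout of the sample review file.
    """
    changes = []
    for line in review_text.splitlines():
        if not line.startswith("- "):
            continue  # skip "# ..." section headers and blank lines
        for kind, text in TAG_RE.findall(line):
            changes.append((kind, text))
    return changes


sample = """# comments
- This is a comment from docx
# Typos suggestions (using HTML tags <ins> and <del>)
- dolor sit amet, consectetur <ins>Lorem ipsum</ins><del>adipiscing</del>
- Ut enim ad minim <ins>Lorem</ins>veniam<ins>ipsum</ins>
"""

for kind, text in parse_suggestions(sample):
    print(kind, repr(text))
```

Comment lines carry no tags, so they yield no pairs; a real script would probably want to collect them separately.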

Known issues

The tool fails to capture changes in docx files whose text is organized in tables (e.g., pdf2docx converts columns to tables).
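
For files affected by this issue, one possible workaround is to read the tracked changes straight from the docx XML. The sketch below is independent of docxreviews2txt and does not reflect how the tool is implemented; the function names `changes_from_xml` and `changes_from_docx` are hypothetical. It relies on the WordprocessingML convention that insertions are wrapped in `w:ins` (text in `w:t`) and deletions in `w:del` (text in `w:delText`), and it walks the whole tree so that paragraphs nested in table cells are visited too.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by word/document.xml.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def changes_from_xml(xml_bytes):
    """Return (kind, text) pairs for tracked insertions and deletions."""
    root = ET.fromstring(xml_bytes)
    changes = []
    # root.iter() visits every element, including those inside w:tbl cells.
    for el in root.iter():
        if el.tag == f"{{{W}}}ins":
            text = "".join(t.text or "" for t in el.iter(f"{{{W}}}t"))
            if text:
                changes.append(("ins", text))
        elif el.tag == f"{{{W}}}del":
            text = "".join(t.text or "" for t in el.iter(f"{{{W}}}delText"))
            if text:
                changes.append(("del", text))
    return changes


def changes_from_docx(path):
    """Open a docx (a zip archive) and scan its main document part."""
    with zipfile.ZipFile(path) as archive:
        return changes_from_xml(archive.read("word/document.xml"))


# Tiny hand-written document with tracked changes inside a table cell.
SAMPLE_XML = (
    f'<w:document xmlns:w="{W}"><w:body><w:tbl><w:tr><w:tc><w:p>'
    '<w:r><w:t>consectetur adipiscing</w:t></w:r>'
    '<w:ins w:id="1" w:author="a"><w:r><w:t>s</w:t></w:r></w:ins>'
    '<w:del w:id="2" w:author="a"><w:r><w:delText>elit</w:delText></w:r></w:del>'
    '</w:p></w:tc></w:tr></w:tbl></w:body></w:document>'
).encode("utf-8")

print(changes_from_xml(SAMPLE_XML))
```

This only lists the changed text; unlike the tool's output, it gives no surrounding context and ignores comments, which are stored in a separate part of the archive.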


This project takes inspiration from:
