Command line tool to extract review changes and comments from a docx file as plain text.
docxreviews2txt
Command line tool to extract review changes and comments from a docx file as plain text. It is particularly useful after making review changes to PDF files in a docx editor (e.g., MS Word, Google Docs).
How to install?
pip install docxreviews2txt
How to use it?
usage: docxreviews2txt [-h] [--save_p_xml] [--version] docx
Extract review changes and comments from a docx file as plain text.
positional arguments:
  docx          input docx
optional arguments:
  -h, --help    show this help message and exit
  --save_p_xml  also save extracted Docx paragraphs as xml for debugging
  --version     show version
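For readers who want to see how an interface like the one above can be declared, here is a minimal argparse sketch. It only mirrors the documented usage text; the function name build_parser and the version placeholder are assumptions for illustration, not the package's actual code.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented usage (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="docxreviews2txt",
        description="Extract review changes and comments from a docx file as plain text.",
    )
    # Positional argument: path to the input docx file.
    parser.add_argument("docx", help="input docx")
    # Boolean flag, False unless given on the command line.
    parser.add_argument(
        "--save_p_xml",
        action="store_true",
        help="also save extracted Docx paragraphs as xml for debugging",
    )
    # Placeholder version string; the real tool reports its own version.
    parser.add_argument(
        "--version",
        action="version",
        version="docxreviews2txt (version placeholder)",
        help="show version",
    )
    return parser
```

Calling build_parser().parse_args(["tests/lorem_ipsum.docx"]) returns a namespace with docx set to the given path and save_p_xml set to False.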
Example:
$ docxreviews2txt tests/lorem_ipsum.docx
txt reviews at file:///C:/Users/alan/src/docxreviews2txt/tests/lorem_ipsum_review.txt
$ cat c:/Users/alan/src/docxreviews2txt/tests/lorem_ipsum_review.txt
# comments
- This is a comment from docx
# Typos and rewriting suggestions
- sit amet, consectetur -> sit amet, consectetur Lorem ipsum
- sit amet, consectetur adipiscing elit, sed do -> sit amet, consectetur elit, sed do
- sit amet, consectetur adipiscing elit, sed -> sit amet, consectetur adipiscings elit, sed
- enim ad minim veniam, quis nostrud -> enim ad minim do veniam, quis nostrud
- enim ad minim veniam -> enim ad minim Lorem veniam
- veniam, quis nostrud -> veniam ipsum, quis nostrud
- sit amet, consectetur adipiscing elit, sed do -> sit amet, consectetur elit, sed do
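The output above separates comments from tracked changes. As a rough illustration of where that information lives inside a docx file, the sketch below reads it with the Python standard library only. It assumes the usual WordprocessingML layout (comments in word/comments.xml, tracked insertions and deletions as w:ins and w:del elements in word/document.xml). The function names are hypothetical, and this is not necessarily how docxreviews2txt does it; in particular, it does not add the surrounding context words shown in the example.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, used by both parts read below.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def read_comments(docx):
    """Return the text of each comment stored in word/comments.xml."""
    with zipfile.ZipFile(docx) as zf:
        if "word/comments.xml" not in zf.namelist():
            return []  # the part is absent when the document has no comments
        root = ET.fromstring(zf.read("word/comments.xml"))
    return [
        "".join(t.text or "" for t in comment.iter(f"{{{W}}}t"))
        for comment in root.findall(f"{{{W}}}comment")
    ]


def read_tracked_changes(docx):
    """Return (kind, text) pairs for tracked insertions and deletions."""
    with zipfile.ZipFile(docx) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    changes = []
    for elem in root.iter():
        if elem.tag == f"{{{W}}}ins":
            # Inserted runs keep their text in ordinary w:t elements.
            text = "".join(t.text or "" for t in elem.iter(f"{{{W}}}t"))
            kind = "insert"
        elif elem.tag == f"{{{W}}}del":
            # Deleted runs keep their text in w:delText elements.
            text = "".join(t.text or "" for t in elem.iter(f"{{{W}}}delText"))
            kind = "delete"
        else:
            continue
        if text:  # skip markers that carry no text (e.g. paragraph marks)
            changes.append((kind, text))
    return changes
```

Both functions accept a file path or a binary file-like object, since zipfile.ZipFile handles either.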
TODO
- improve extraction of the N words around review changes and allow N to be passed as a parameter
- organize extracted reviews by the input Docx headings
- save txt as Docx to enable editing
- support drag-and-drop GUI
Known issues
The tool fails to capture changes in Docx files with text organized in tables (e.g., pdf2docx converts columns to tables).
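One way to spot files affected by this issue before running the tool is to check whether the document contains tables at all. The sketch below counts w:tbl elements in word/document.xml using only the standard library. The function name count_tables is hypothetical and this check is not part of docxreviews2txt.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace; tables are w:tbl elements in it.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def count_tables(docx):
    """Count table elements (w:tbl) in the main document part of a docx."""
    with zipfile.ZipFile(docx) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    # iter() walks the whole tree, so nested tables are counted too.
    return sum(1 for _ in root.iter(f"{{{W_NS}}}tbl"))
```

If count_tables returns a value greater than zero, review changes located inside those tables may be missing from the extracted text.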
ChangeLog
- v0.4: add main.py, remove --save_xml_p_elems, -nwords
- v0.3: add --version
- v0.2: add python module and unittests
- v0.1: one-script initial version