docx-comments-to-text
Extract reviewer comments from .docx files and insert them inline with the text they reference, creating a plain text output that keeps feedback in context.
Installation
From PyPI (recommended)
pip install docx-comments-to-text
From source
# Clone the repository
git clone https://github.com/platelminto/docx-comments-to-text
cd docx-comments-to-text
# Install in development mode
uv sync --dev
# or: pip install -e .
Usage
Command Line Interface
# Basic usage - output to stdout
docx-comments-to-text document.docx
# Save to file
docx-comments-to-text document.docx -o output.txt
# Control author display
docx-comments-to-text document.docx --authors never # Hide authors
docx-comments-to-text document.docx --authors always # Always show authors
docx-comments-to-text document.docx --authors auto # Show authors when multiple exist (default)
# Control comment placement
docx-comments-to-text document.docx --placement inline # Inline with text (default)
docx-comments-to-text document.docx --placement end-paragraph # At end of each paragraph
docx-comments-to-text document.docx --placement comments-only # Comments only with context
Development Usage
If working from source:
# Run with uv
uv run docx-comments-to-text document.docx
# Or use module syntax
uv run python -m docx_comments_to_text.cli document.docx
Example Output
Inline placement (default)
Original text with [reviewer feedback] [COMMENT: "This needs clarification"] continues here.
More content [needs examples] [COMMENT John: "Consider adding examples"] and final text.
End-paragraph placement
Original text with reviewer feedback[1] continues here.
More content needs examples[2] and final text.
Comments:
1. This needs clarification
2. John: Consider adding examples
Comments-only placement
"reviewer feedback": This needs clarification
"needs examples": John: Consider adding examples
Features
- Accurate comment positioning and text preservation
- Handles overlapping comments and multiple comment types
- Configurable author display
- Multiple comment placement styles (inline, end-of-paragraph, comments-only)
Technical Details
DOCX Structure
- DOCX files are ZIP archives containing XML files
- word/document.xml - main document content
- word/comments.xml - comment definitions
- Comment ranges marked with <w:commentRangeStart> and <w:commentRangeEnd>
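The layout above can be explored with the standard library alone. The following is a minimal sketch, not the package's actual implementation: the helper names `read_comments` and `read_ranges` are hypothetical, and it ignores paragraph breaks, tabs, and other non-text elements for brevity.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by document.xml and comments.xml
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def read_comments(docx):
    """Hypothetical helper: return {comment_id: (author, text)} from word/comments.xml."""
    with zipfile.ZipFile(docx) as zf:
        root = ET.fromstring(zf.read("word/comments.xml"))
    comments = {}
    for comment in root.findall(f"{{{W}}}comment"):
        # A comment body is made of paragraphs and runs; join all text nodes.
        text = "".join(t.text or "" for t in comment.iter(f"{{{W}}}t"))
        comments[comment.get(f"{{{W}}}id")] = (comment.get(f"{{{W}}}author"), text)
    return comments


def read_ranges(docx):
    """Hypothetical helper: return (text, {comment_id: [start, end]}) from word/document.xml."""
    with zipfile.ZipFile(docx) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    parts, pos, ranges = [], 0, {}
    # iter() walks elements in document order, so pos is the running text offset.
    for el in root.iter():
        if el.tag == f"{{{W}}}t":
            parts.append(el.text or "")
            pos += len(el.text or "")
        elif el.tag == f"{{{W}}}commentRangeStart":
            ranges[el.get(f"{{{W}}}id")] = [pos, pos]
        elif el.tag == f"{{{W}}}commentRangeEnd":
            ranges.setdefault(el.get(f"{{{W}}}id"), [pos, pos])[1] = pos
    return "".join(parts), ranges
```

Both helpers accept a path or a file-like object, since `zipfile.ZipFile` handles either. The comment IDs are kept as strings because that is how they appear in the `w:id` attribute.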
Comment Insertion Strategy
- Parse document XML to extract text and track character positions
- Map comment ranges to their start/end positions in the text
- Sort comments by position and insert them in reverse order (last comment first), so earlier character offsets stay valid while text is added
- Wrap commented text in brackets: [commented text]
- Insert comment content after bracketed text: [COMMENT: "feedback"]
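The steps above can be sketched in a few lines. This is an illustrative sketch only, assuming non-overlapping ranges and already-computed character offsets; the function name `insert_comments` and its `show_authors` parameter are hypothetical and not part of the package's API.

```python
def insert_comments(text, comments, show_authors=False):
    """Insert inline markers for comments given as (start, end, author, body) tuples.

    Assumes the ranges do not overlap; overlapping comments need extra handling.
    """
    # Process the last range first so offsets of earlier ranges are not shifted.
    for start, end, author, body in sorted(comments, key=lambda c: c[0], reverse=True):
        label = f"COMMENT {author}" if show_authors and author else "COMMENT"
        marker = f'[{text[start:end]}] [{label}: "{body}"]'
        text = text[:start] + marker + text[end:]
    return text
```

Inserting from the end of the text backwards is what makes the single pass safe: each insertion only lengthens text that comes after every range still waiting to be processed.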
Dependencies
- python-docx - DOCX file handling
- lxml - XML parsing
- click - Command line interface