Edit docx with Python
docx-editor
Pure Python library for Word document track changes and comments, without requiring Microsoft Word.
Note: The PyPI package is named docx-editor because docx-edit was too similar to an existing package.
- GitHub repository: https://github.com/pablospe/docx-editor/
- Documentation: https://pablospe.github.io/docx-editor/
Features
- Hash-Anchored Paragraph References: list_paragraphs() returns stable, hash-based paragraph IDs for safe, unambiguous targeting
- Batch Editing: Atomic batch_edit() with upfront hash validation across all operations
- Paragraph Rewrite: rewrite_paragraph() with automatic word-level diffing; specify the desired text and get fine-grained tracked changes
- Track Changes: Replace, delete, and insert text with revision tracking
- Cross-Boundary Editing: Find and replace text spanning multiple XML elements and revision boundaries
- Mixed-State Editing: Atomic decomposition for text spanning <w:ins>/<w:del> boundaries
- Comments: Add, reply, resolve, and delete comments
- Revision Management: List, accept, and reject tracked changes
- Cross-Platform: Works on Linux, macOS, and Windows
- Minimal Dependencies: Only requires defusedxml for secure XML parsing
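To illustrate why hash-anchored references make targeting safe, here is a minimal sketch. The ref shape (P, a 1-based index, #, a short hash) follows the Quick Start output. The hashing scheme (first four hex characters of a SHA-256 digest of the paragraph text) and the helper names make_ref and resolve_ref are assumptions made for illustration, not docx-editor's actual implementation.

```python
import hashlib


def make_ref(index: int, text: str) -> str:
    """Build a ref such as 'P2#f3c1' from a 1-based index and the paragraph text."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()[:4]  # assumed scheme
    return f"P{index}#{digest}"


def resolve_ref(ref: str, paragraphs: list[str]) -> int:
    """Return the 0-based position for ref, or raise if the paragraph has changed."""
    label, _, expected = ref.partition("#")
    position = int(label[1:]) - 1
    if not 0 <= position < len(paragraphs):
        raise ValueError(f"{ref}: no such paragraph")
    actual = make_ref(position + 1, paragraphs[position]).partition("#")[2]
    if actual != expected:
        raise ValueError(f"{ref}: stale reference, paragraph text has changed")
    return position


paragraphs = ["Introduction to the contract.", "Payment is due within 30 days."]
ref = make_ref(2, paragraphs[1])
print(resolve_ref(ref, paragraphs))  # position of the second paragraph

# Once the paragraph is edited, the old ref no longer validates.
paragraphs[1] = "Payment is due within 60 days."
try:
    resolve_ref(ref, paragraphs)
except ValueError as exc:
    print(exc)
```

Because the hash is derived from the text, an edit made against an outdated view of the document fails loudly instead of silently changing the wrong paragraph.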
Installation
pip install docx-editor
Claude Code Plugin
This repo includes a plugin for Claude Code that enables AI-assisted Word document editing.
This plugin extends the original Anthropic docx skill, which requires Claude to manipulate OOXML manually. Instead, the plugin provides an interface (docx-editor) that handles that complexity: Claude calls simple Python methods such as doc.replace() or doc.add_comment(), which makes document editing significantly faster and less error-prone.
Install as plugin
# Add the marketplace
/plugin marketplace add pablospe/docx-editor
# Install the plugin
/plugin install docx-editor@docx-editor-marketplace
# Install dependencies
pip install docx-editor python-docx
Manual install (alternative)
# Install dependencies
pip install docx-editor python-docx
# Copy skill to Claude Code skills directory
git clone https://github.com/pablospe/docx-editor /tmp/docx-editor
mkdir -p ~/.claude/skills
cp -r /tmp/docx-editor/skills/docx ~/.claude/skills/
rm -rf /tmp/docx-editor
Once installed, Claude Code can help you edit Word documents with track changes, comments, and revisions.
Quick Start
import os

from docx_editor import Document

author = os.environ.get("USER") or "Reviewer"

with Document.open("contract.docx", author=author) as doc:
    # Step 1: List paragraphs with hash-anchored references
    for p in doc.list_paragraphs():
        print(p)
    # Output: P1#a7b2| Introduction to the contract...
    #         P2#f3c1| The committee shall review...

    # Step 2: Edit; each method returns the new paragraph ref
    r = doc.replace("30 days", "60 days", paragraph="P2#f3c1")
    r = doc.replace("net", "gross", paragraph=r)  # chain without list_paragraphs()
    doc.delete("obsolete text", paragraph="P5#d4e5")
    doc.insert_after("Section 5", " (as amended)", paragraph="P3#b2c4")

    # Rewrite entire paragraph (automatic word-level diff).
    # P2 was edited above, so pass the latest ref rather than the original hash.
    doc.rewrite_paragraph(r, "The board shall approve the updated proposal.")

    # Comments
    doc.add_comment("Section 5", "Please review")

    # Revision management
    revisions = doc.list_revisions()
    doc.accept_revision(revision_id=1)

    doc.save()
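The word-level diffing behind rewrite_paragraph() can be pictured with a small sketch. docx-editor's actual diffing algorithm is not described here, so this example uses Python's standard difflib as a stand-in; the function name word_diff and the keep/delete/insert labels are hypothetical.

```python
import difflib


def word_diff(old: str, new: str) -> list[tuple[str, str]]:
    """Return (operation, text) pairs: 'keep', 'delete', or 'insert'."""
    old_words, new_words = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            changes.append(("keep", " ".join(old_words[i1:i2])))
        if tag in ("delete", "replace"):
            changes.append(("delete", " ".join(old_words[i1:i2])))
        if tag in ("insert", "replace"):
            changes.append(("insert", " ".join(new_words[j1:j2])))
    return changes


old = "The committee shall review the proposal."
new = "The board shall approve the updated proposal."
for operation, text in word_diff(old, new):
    print(f"{operation:>6}: {text}")
```

Each delete and insert pair would correspond to one small tracked change, so a reviewer sees only the words that differ and not a wholesale replacement of the paragraph.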
Cross-Boundary Text Operations
Text in Word documents with tracked changes can span revision boundaries. docx-editor handles this transparently:
import os

from docx_editor import Document

author = os.environ.get("USER") or "Reviewer"

with Document.open("reviewed.docx", author=author) as doc:
    # Get visible text (inserted text included, deleted excluded)
    text = doc.get_visible_text()

    # List paragraphs to find hash-anchored references
    refs = doc.list_paragraphs()

    # Find text across element boundaries
    match = doc.find_text("Aim: To")
    if match and match.spans_boundary:
        print("Text spans a revision boundary")

    # Replace works even across revision boundaries
    doc.replace("Aim: To", "Goal: To", paragraph="P1#a7b2")

    doc.save()
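To see what "spanning a revision boundary" means at the XML level, the sketch below parses a small hand-written WordprocessingML paragraph, builds the visible text (insertions included, deletions excluded), and checks whether a match touches more than one run. The sample XML and the helpers visible_segments and find_visible are illustrative assumptions, not docx-editor internals. The standard-library parser is used only because the input is a trusted literal; the library itself relies on defusedxml.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
INS, DEL, TEXT = f"{{{W}}}ins", f"{{{W}}}del", f"{{{W}}}t"

PARAGRAPH_XML = f"""
<w:p xmlns:w="{W}">
  <w:r><w:t xml:space="preserve">Aim: </w:t></w:r>
  <w:del w:id="1" w:author="Editor"><w:r><w:delText>We want to</w:delText></w:r></w:del>
  <w:ins w:id="2" w:author="Editor"><w:r><w:t>To</w:t></w:r></w:ins>
  <w:r><w:t xml:space="preserve"> reduce costs.</w:t></w:r>
</w:p>
"""


def visible_segments(paragraph):
    """Return (state, text) for each visible run; deleted runs are skipped."""
    segments = []
    for child in paragraph:
        if child.tag == DEL:
            continue
        state = "inserted" if child.tag == INS else "normal"
        text = "".join(t.text or "" for t in child.iter(TEXT))
        if text:
            segments.append((state, text))
    return segments


def find_visible(segments, needle):
    """Return (start, spans_boundary) for needle in the visible text, or None."""
    visible = "".join(text for _, text in segments)
    start = visible.find(needle)
    if start < 0:
        return None
    end = start + len(needle)
    touched, offset = 0, 0
    for _, text in segments:
        seg_start, seg_end = offset, offset + len(text)
        if seg_start < end and start < seg_end:  # match overlaps this segment
            touched += 1
        offset = seg_end
    return start, touched > 1


paragraph = ET.fromstring(PARAGRAPH_XML)
segments = visible_segments(paragraph)
print("".join(text for _, text in segments))  # Aim: To reduce costs.
print(find_visible(segments, "Aim: To"))      # (0, True)
print(find_visible(segments, "reduce"))       # (8, False)
```

The phrase "Aim: To" looks contiguous to a reader but lives partly in an ordinary run and partly inside an insertion, which is why a naive per-run search would miss it.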
Batch Editing
Apply multiple edits atomically with upfront hash validation:
from docx_editor import Document, EditOperation

with Document.open("contract.docx", author="Editor") as doc:
    refs = doc.list_paragraphs()
    doc.batch_edit([
        EditOperation(action="replace", find="old", replace_with="new", paragraph="P2#f3c1"),
        EditOperation(action="delete", text="remove this", paragraph="P5#d4e5"),
    ])
    doc.save()
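The idea of validating everything upfront and then applying all edits or none can be sketched in plain Python. This is a simplified model under stated assumptions: paragraphs are held in a dict keyed by ref, refs stay fixed after an edit, and the Op class and batch_apply function are hypothetical stand-ins, not docx-editor's EditOperation or batch_edit().

```python
from dataclasses import dataclass


@dataclass
class Op:
    """Hypothetical edit operation: replace the first `find` in paragraph `ref`."""
    ref: str
    find: str
    replace_with: str


def batch_apply(paragraphs: dict[str, str], ops: list[Op]) -> dict[str, str]:
    """Validate every operation first, then apply them all to a copy."""
    errors = []
    for op in ops:
        if op.ref not in paragraphs:
            errors.append(f"{op.ref}: unknown or stale reference")
        elif op.find not in paragraphs[op.ref]:
            errors.append(f"{op.ref}: text {op.find!r} not found")
    if errors:
        # Nothing has been modified yet, so the caller's data stays intact.
        raise ValueError("; ".join(errors))
    updated = dict(paragraphs)
    for op in ops:
        updated[op.ref] = updated[op.ref].replace(op.find, op.replace_with, 1)
    return updated


paragraphs = {
    "P2#f3c1": "Payment is due within 30 days.",
    "P5#d4e5": "Remove this. Keep this.",
}
result = batch_apply(paragraphs, [
    Op("P2#f3c1", "30 days", "60 days"),
    Op("P5#d4e5", "Remove this. ", ""),
])
print(result)

try:
    batch_apply(paragraphs, [
        Op("P2#f3c1", "30 days", "90 days"),
        Op("P9#0000", "anything", "x"),  # stale ref: the whole batch is rejected
    ])
except ValueError as exc:
    print("rejected:", exc)
```

Checking every reference before touching the document means a single stale hash rejects the batch as a whole, so the file is never left half-edited.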
Download files
File details
Details for the file docx_editor-0.2.2.tar.gz.
File metadata
- Download URL: docx_editor-0.2.2.tar.gz
- Upload date:
- Size: 434.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: uv/0.9.21
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d5d03a82db3b7ed674da7dcf12b4d5e103f772778ead7e9c8d5ff649fb14dc2f |
| MD5 | 5dd0c3921c0cdc1fb9e4173032a42ad4 |
| BLAKE2b-256 | d8deb9f31aa5077a824fbde24b57d852f835fc9a5c0199d236f779a38e238f16 |
File details
Details for the file docx_editor-0.2.2-py3-none-any.whl.
File metadata
- Download URL: docx_editor-0.2.2-py3-none-any.whl
- Upload date:
- Size: 47.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: uv/0.9.21
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 75f32039555bba14232448f576e9d164e7700836920bae28a1fcb2d66c97f7ea |
| MD5 | bbe01df5c1662a03ce167ce2ad5e274b |
| BLAKE2b-256 | e738f633986042df0e6eaa0be1fe6d4d4bf7386d33e26494ab2d409819550160 |
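A downloaded file can be checked against the published SHA256 digest with Python's standard hashlib. The sketch below is one way to do it; the helper names sha256_of and matches_published are illustrative, and the local file path is an assumption about where the wheel was saved.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Hash the file in chunks so large archives are not loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: Path, expected_hex: str) -> bool:
    """Compare the file's digest with a published hex digest, ignoring case and spaces."""
    return sha256_of(path) == expected_hex.strip().lower()


# SHA256 listed for docx_editor-0.2.2-py3-none-any.whl
EXPECTED = "75f32039555bba14232448f576e9d164e7700836920bae28a1fcb2d66c97f7ea"
wheel = Path("docx_editor-0.2.2-py3-none-any.whl")  # assumed download location
if wheel.exists():
    print("OK" if matches_published(wheel, EXPECTED) else "MISMATCH")
else:
    print(f"{wheel} not found; download it first")
```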