Complete Word document comment manipulation with threading and Word Online compatibility

docx-comments

A Python module for complete Word document comment manipulation: adding, replying to, and resolving comments, with full Word Online compatibility.

Problem

python-docx can read Word comments but cannot properly create or reply to them:

  • Creates comments.xml but no anchors in document.xml
  • Missing commentsExtended.xml (threading)
  • Missing commentsIds.xml (durable IDs)

The Microsoft Graph API does not support Word comments (only Excel comments).

Solution

This module provides complete OOXML comment manipulation based on ECMA-376 / ISO/IEC 29500:

  • Add anchored comments to specific text ranges
  • Reply to existing comments (threaded)
  • Mark comments as resolved
  • Unresolve comments and toggle done status
  • Delete comments or entire threads
  • Move comment anchors to new locations
  • Full Word Online compatibility
  • Optional people.xml identity linkage (Word account presence)
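
What "anchored" means at the XML level can be sketched as a consistency check: every `w:comment` id in comments.xml should have a matching `w:commentReference` in document.xml. The snippet below is an illustrative check written with the standard library only. It is not how docx-comments works internally, and `unanchored_comment_ids` and the sample XML are hypothetical.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = f"{{{W_NS}}}"

# Hypothetical sample data: two comments are defined, only id 0 is anchored.
COMMENTS_XML = (
    f'<w:comments xmlns:w="{W_NS}">'
    '<w:comment w:id="0" w:author="Reviewer Name"/>'
    '<w:comment w:id="1" w:author="Author Name"/>'
    "</w:comments>"
)
DOCUMENT_XML = (
    f'<w:document xmlns:w="{W_NS}"><w:body><w:p>'
    '<w:commentRangeStart w:id="0"/>'
    "<w:r><w:t>commented text</w:t></w:r>"
    '<w:commentRangeEnd w:id="0"/>'
    '<w:r><w:commentReference w:id="0"/></w:r>'
    "</w:p></w:body></w:document>"
)


def unanchored_comment_ids(comments_xml, document_xml):
    """Return comment ids defined in comments.xml but never referenced in document.xml."""
    comments = ET.fromstring(comments_xml)
    document = ET.fromstring(document_xml)
    defined = {c.get(f"{W}id") for c in comments.iter(f"{W}comment")}
    referenced = {r.get(f"{W}id") for r in document.iter(f"{W}commentReference")}
    return sorted(defined - referenced)


print(unanchored_comment_ids(COMMENTS_XML, DOCUMENT_XML))
```

A non-empty result points at comments that exist in comments.xml but have no anchor in the document body, which is the failure mode described in the Problem section.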

Installation

pip install docx-comments

Usage

from docx import Document
from docx_comments import CommentManager, PersonInfo

doc = Document("document.docx")
mgr = CommentManager(doc)

# Author must be a PersonInfo object, not a raw string.

# Add anchored comment to text range
comment_id = mgr.add_comment(
    paragraph=doc.paragraphs[0],
    start_run=0,
    end_run=2,
    text="Please review this section",
    author=PersonInfo(author="Reviewer Name"),
    initials="RN",
    person=True,  # ensure people.xml entry exists for identity linkage
)

# Reply to existing comment
reply_id = mgr.reply_to_comment(
    parent_id=comment_id,
    text="Addressed in this revision",
    author=PersonInfo(author="Author Name"),
    initials="AN"
)

# Mark comment as resolved
mgr.resolve_comment(comment_id)

# Mark comment as unresolved
mgr.unresolve_comment(comment_id)

# Move a comment to a new paragraph
mgr.move_comment(
    comment_id=comment_id,
    paragraph=doc.paragraphs[1],
)

# Delete a comment thread (root + replies)
mgr.delete_thread(comment_id)

# List all comment threads
for thread in mgr.get_comment_threads():
    print(f"Root: {thread.root.text} by {thread.root.author}")
    for reply in thread.replies:
        print(f"  Reply: {reply.text} by {reply.author}")

doc.save("document_reviewed.docx")

Identity Linkage (people.xml)

Word maps w:comment/@w:author to account identity using word/people.xml. By default, this library does not create or modify people.xml unless you opt in.

# Create a minimal people.xml entry without presence metadata
person = mgr.ensure_person("Reviewer Name")

# Or fetch an existing person entry (raises if missing)
try:
    person = mgr.get_person("Reviewer Name")
except KeyError:
    person = mgr.ensure_person("Reviewer Name")

# Resolve a default author from the system or a DOCX source
person, initials = mgr.get_default_author_person()

# Merge people.xml entries from another document (adds missing authors only)
template_doc = Document("template.docx")
mgr.merge_people_from(template_doc, include_presence=False)

# Or request a people.xml entry when adding a comment (person=True)
mgr.add_comment(
    paragraph=doc.paragraphs[0],
    text="Linked to people.xml",
    author=person,
    person=True,
)

# You can also pass a PersonInfo object from an existing people.xml
person = mgr.get_people()[0]
mgr.add_comment(
    paragraph=doc.paragraphs[0],
    text="Author from PersonInfo",
    author=person,
)

# Optional presence metadata (only if you explicitly supply it)
mgr.ensure_person(
    "Reviewer Name",
    presence={"provider_id": "provider", "user_id": "user"},
)

Note: Word comments are keyed by the author string (w:comment/@w:author). If two people share the same name string, Word does not provide a separate comment author ID to disambiguate them. Using people.xml presence metadata can improve account linkage, but cannot fully resolve same-name conflicts.
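
The collision described in the note can be seen by grouping comments on the author string alone. This is a small standard-library sketch over hypothetical sample XML. `comments_per_author` is an illustrative helper, not part of the docx-comments API.

```python
import xml.etree.ElementTree as ET
from collections import Counter

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = f"{{{W_NS}}}"

# Hypothetical sample: two different people who happen to share one name string.
COMMENTS_XML = (
    f'<w:comments xmlns:w="{W_NS}">'
    '<w:comment w:id="0" w:author="Alex Kim" w:initials="AK"/>'
    '<w:comment w:id="1" w:author="Alex Kim" w:initials="AK"/>'
    '<w:comment w:id="2" w:author="Sam Lee" w:initials="SL"/>'
    "</w:comments>"
)


def comments_per_author(comments_xml):
    """Count comments per w:author string, the only author key a comment carries."""
    root = ET.fromstring(comments_xml)
    return Counter(c.get(f"{W}author") for c in root.iter(f"{W}comment"))


print(comments_per_author(COMMENTS_XML))
```

Both "Alex Kim" comments land under the same key, regardless of who actually wrote them.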

You can also point the default-author resolver at a known DOCX file (one you keep private) using an environment variable:

export DOCX_COMMENTS_AUTHOR_DOCX="/path/to/author-source.docx"

Then call:

person, initials = mgr.get_default_author_person(include_presence=True)

If the DOCX contains more than one w15:person entry, a warning is issued and the resolver falls back to system user info.
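
The fallback rule above can be sketched as follows. This is an illustration of the described behaviour under stated assumptions, not the library's implementation: `single_author_or_fallback` is a hypothetical helper, it takes the people.xml content as a string, and the `fallback` argument stands in for the system user info.

```python
import warnings
import xml.etree.ElementTree as ET

W15_NS = "http://schemas.microsoft.com/office/word/2012/wordml"
W15 = f"{{{W15_NS}}}"

# Hypothetical people.xml contents for the two cases.
ONE_PERSON = (
    f'<w15:people xmlns:w15="{W15_NS}">'
    '<w15:person w15:author="Reviewer Name"/>'
    "</w15:people>"
)
TWO_PEOPLE = (
    f'<w15:people xmlns:w15="{W15_NS}">'
    '<w15:person w15:author="Reviewer Name"/>'
    '<w15:person w15:author="Author Name"/>'
    "</w15:people>"
)


def single_author_or_fallback(people_xml, fallback):
    """Use the sole w15:person author if there is exactly one, else the fallback."""
    root = ET.fromstring(people_xml)
    authors = [p.get(f"{W15}author") for p in root.iter(f"{W15}person")]
    if len(authors) == 1:
        return authors[0]
    if len(authors) > 1:
        # Ambiguous source: warn and defer to the fallback identity.
        warnings.warn("people.xml has several w15:person entries; using fallback")
    return fallback


print(single_author_or_fallback(ONE_PERSON, fallback="system-user"))
```

A source DOCX with exactly one w15:person entry therefore gives an unambiguous default author.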

OOXML Parts Handled

This module manages five XML parts:

  1. comments.xml - Comment content and metadata
  2. document.xml - Anchors (commentRangeStart/End, commentReference)
  3. commentsExtended.xml - Threading (paraId, paraIdParent, done)
  4. commentsIds.xml - Durable IDs for persistence
  5. people.xml - Optional identity linkage (w15:person)
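
As an illustration of how the threading part encodes replies, the sketch below reads the `paraId`, `paraIdParent` and `done` attributes from a commentsExtended.xml string and groups replies under their root. It uses only the standard library. `read_threads` and the sample ids are hypothetical, and this is not the parser docx-comments itself uses.

```python
import xml.etree.ElementTree as ET

W15_NS = "http://schemas.microsoft.com/office/word/2012/wordml"
W15 = f"{{{W15_NS}}}"

# Hypothetical sample: one resolved thread with a reply, one open comment.
COMMENTS_EXTENDED_XML = (
    f'<w15:commentsEx xmlns:w15="{W15_NS}">'
    '<w15:commentEx w15:paraId="1A2B3C4D" w15:done="1"/>'
    '<w15:commentEx w15:paraId="5E6F7A8B" w15:paraIdParent="1A2B3C4D" w15:done="0"/>'
    '<w15:commentEx w15:paraId="0C0D0E0F" w15:done="0"/>'
    "</w15:commentsEx>"
)


def read_threads(comments_extended_xml):
    """Map each root paraId to its done flag and the paraIds of its replies."""
    root = ET.fromstring(comments_extended_xml)
    entries = list(root.iter(f"{W15}commentEx"))
    threads = {}
    # First pass: entries without a parent are thread roots.
    for entry in entries:
        if entry.get(f"{W15}paraIdParent") is None:
            done = entry.get(f"{W15}done") in ("1", "true", "on")
            threads[entry.get(f"{W15}paraId")] = {"done": done, "replies": []}
    # Second pass: attach replies to their root, skipping orphaned replies.
    for entry in entries:
        parent = entry.get(f"{W15}paraIdParent")
        if parent in threads:
            threads[parent]["replies"].append(entry.get(f"{W15}paraId")),
    return threads


for para_id, info in read_threads(COMMENTS_EXTENDED_XML).items():
    print(para_id, info)
```

Each `paraId` here corresponds to a paragraph id inside the matching comment in comments.xml, which is how the threading part is tied back to the comment content.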

Requirements

  • Python 3.9+
  • python-docx >= 1.0.0
  • lxml

References

Threading & Extended Parts

  • CommentEx Class - Office 2013 comment threading (paraId, paraIdParent, done)
  • commentsIds - Durable IDs specification (Office 2016+)

Related Libraries

  • python-docx - Python library for Word documents (foundation for this module)
  • Open XML SDK - Microsoft's .NET SDK for OOXML

Acknowledgements

This project was conceptualised by Ting Sun and implemented with the assistance of Claude Code (Anthropic's AI coding assistant) under his guidance. The collaboration involved iterative development of the OOXML comment handling logic, with Claude Code contributing to code implementation and Ting Sun providing architectural direction and domain expertise.

License

MIT
