Additional tooling for python-docx.

CMI-docx

Stability: stable. License: LGPL-2.1.

cmi-docx is a Python library by the Child Mind Institute that extends python-docx with higher-level tooling for .docx file manipulation. It provides two complementary APIs:

  • Imperative API -- wrapper classes around python-docx objects for find/replace, formatting, insertion, and comments on existing documents.
  • Declarative API -- an async-first, component-based system for constructing documents from scratch with conditional rendering, lazy evaluation, and template support.

Features

  • Find and replace across an entire document (body, headers, footers, and tables), even when the target text is split across multiple runs by Word's internal formatting.
  • Style-aware replacement -- apply bold, italic, underline, font size, color, and more to replacement text.
  • Paragraph insertion -- insert paragraphs by text, by object, or as images at any position in the document body.
  • Run-level formatting -- read and write formatting on individual runs (bold, italic, underline, superscript, subscript, font size, font color).
  • Paragraph formatting -- control alignment, line spacing, space before/after, and font properties for entire paragraphs.
  • Table and cell formatting -- toggle table sections, set cell background colors, and configure cell borders.
  • Word comments -- programmatically add comments to paragraphs, runs, or ranges of runs, with automatic comment preservation during text edits.
  • Declarative document construction -- build documents as a tree of Section, Paragraph, TextRun, Table, and ImageRun components, then render to a python-docx Document with await doc.to_docx().
  • Async and lazy evaluation -- declare children as coroutines or callables; they are resolved concurrently via asyncio.gather.
  • Conditional rendering -- attach a condition callable to any component to include or exclude it at render time without building its subtree.
  • Template support -- open an existing .docx as a template, apply placeholder replacements, and insert new content at a specific paragraph index.

Installation

Install from PyPI:

pip install cmi-docx
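
To confirm that the install succeeded, one option is to ask the standard library which version of the distribution is present. This is a minimal sketch; the helper name installed_version is ours and is not part of cmi-docx.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints a version string when cmi-docx is installed, otherwise None.
    print(installed_version("cmi-docx"))
```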

Quick start

Imperative API

The imperative API wraps python-docx objects with extension classes that add search, replace, formatting, and insertion capabilities.

Find and replace

import docx
from cmi_docx import ExtendDocument, RunStyle

doc = docx.Document()
paragraph = doc.add_paragraph("Hello {{")
paragraph.add_run("FULL_NAME}}")

extend_doc = ExtendDocument(doc)
extend_doc.replace("{{FULL_NAME}}", "Jane Doe", RunStyle(bold=True))

print(doc.paragraphs[0].text)  # "Hello Jane Doe" ("Jane Doe" is rendered in bold)
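
The placeholder above spans two runs, so searching each run on its own would never find it. The sketch below illustrates one way a cross-run match can be handled, using plain strings as stand-ins for runs. It is an illustration of the idea only, not cmi-docx's implementation, and the function name replace_across_runs is hypothetical.

```python
def replace_across_runs(runs: list[str], needle: str, replacement: str) -> list[str]:
    """Replace the first occurrence of needle in the concatenated run texts.

    The replacement lands in the run where the match starts; the matched
    characters are removed from any later runs the match extends into.
    """
    full_text = "".join(runs)
    start = full_text.find(needle)
    if not needle or start == -1:
        return list(runs)
    end = start + len(needle)

    result = []
    offset = 0
    for text in runs:
        run_start, run_end = offset, offset + len(text)
        offset = run_end
        if run_end <= start or run_start >= end:
            result.append(text)  # This run does not overlap the match.
            continue
        before = text[: max(start - run_start, 0)]
        after = text[max(end - run_start, 0):]
        if run_start <= start:
            # The match starts in this run: splice the replacement in here.
            result.append(before + replacement + after)
        else:
            # The match continues into this run: drop the matched part.
            result.append(after)
    return result


runs = ["Hello {{", "FULL_NAME}}"]
print(replace_across_runs(runs, "{{FULL_NAME}}", "Jane Doe"))
# ['Hello Jane Doe', '']
```

Putting the replacement in the run where the match begins is a design choice of this sketch; it keeps the number of runs stable so that formatting on unrelated runs is untouched.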

Paragraph formatting

import docx
from cmi_docx import ExtendParagraph, ParagraphStyle
from docx.enum.text import WD_ALIGN_PARAGRAPH

doc = docx.Document()
paragraph = doc.add_paragraph("Formatted paragraph.")

ExtendParagraph(paragraph).format(
    ParagraphStyle(
        bold=True,
        italic=True,
        font_size=14,
        alignment=WD_ALIGN_PARAGRAPH.CENTER,
    )
)

Insert a run with formatting

import docx
from cmi_docx import ExtendParagraph, RunStyle

doc = docx.Document()
paragraph = doc.add_paragraph("")
paragraph.add_run("Hello ")
paragraph.add_run("world!")

ExtendParagraph(paragraph).insert_run(1, "beautiful ", RunStyle(bold=True))

print(paragraph.text)  # "Hello beautiful world!"

Add a comment

import docx
from cmi_docx import add_comment

document = docx.Document()
paragraph = document.add_paragraph("This needs review.")

add_comment(document, paragraph, "Reviewer", "Please check this section.")

Declarative API

The declarative API lets you build documents as a tree of components. All children are resolved concurrently, and components can be conditionally included or lazily constructed.

Simple document

import asyncio
from cmi_docx.declarative import Document, Section, Paragraph, TextRun

async def main() -> None:
    doc = Document(
        sections=[
            Section(
                children=[
                    Paragraph(text="Main Heading", heading=1),
                    Paragraph(
                        children=[
                            TextRun(text="Bold text", bold=True),
                            TextRun(text=" and "),
                            TextRun(text="italic text", italic=True),
                        ],
                    ),
                ],
            ),
        ],
        title="My Document",
        creator="Author Name",
    )

    docx_doc = await doc.to_docx()
    docx_doc.save("output.docx")

asyncio.run(main())

Async children

Components accept coroutines as children, which are resolved concurrently:

import asyncio
from cmi_docx.declarative import Document, Section, Paragraph

async def fetch_paragraph() -> Paragraph:
    await asyncio.sleep(0.1)  # Simulate an API call
    return Paragraph(text="Content fetched asynchronously")

async def main() -> None:
    doc = Document(
        sections=[
            Section(
                children=[
                    Paragraph(text="Static content"),
                    fetch_paragraph(),
                ],
            ),
        ],
    )

    docx_doc = await doc.to_docx()
    docx_doc.save("output.docx")

asyncio.run(main())
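
To make the resolution step concrete, here is a minimal standard-library sketch of resolving a mixed list of plain values, callables, and coroutines with asyncio.gather. It only illustrates the pattern described above; the names resolve_child, resolve_children, and fetch_text are hypothetical and do not come from cmi-docx.

```python
import asyncio
import inspect
from typing import Any


async def fetch_text(text: str) -> str:
    """Stand-in for slow I/O such as an API call."""
    await asyncio.sleep(0.01)
    return text


async def resolve_child(child: Any) -> Any:
    """Resolve one child: call it if it is callable, await it if awaitable."""
    if callable(child):
        child = child()
    if inspect.isawaitable(child):
        child = await child
    return child


async def resolve_children(children: list[Any]) -> list[Any]:
    """Resolve all children concurrently; gather keeps the input order."""
    return list(await asyncio.gather(*(resolve_child(c) for c in children)))


async def demo() -> list[Any]:
    return await resolve_children(
        [
            "static",                          # plain value
            fetch_text("fetched"),             # coroutine
            lambda: "lazy",                    # callable
            lambda: fetch_text("lazy fetch"),  # callable returning a coroutine
        ]
    )


if __name__ == "__main__":
    print(asyncio.run(demo()))
    # ['static', 'fetched', 'lazy', 'lazy fetch']
```

Because asyncio.gather returns results in the order its awaitables were passed, the resolved children keep their declared position even when they finish at different times.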

Conditional rendering

Attach a condition callable to skip components without building their subtree:

from cmi_docx.declarative import Document, Section, Paragraph

include_details = False

doc = Document(
    sections=[
        Section(
            children=[
                Paragraph(text="Always visible"),
                Paragraph(
                    text="Only shown when details are enabled",
                    condition=lambda: include_details,
                ),
            ],
        ),
    ],
)
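
Note that the lambda reads include_details when it is called, not when the tree is declared. The sketch below shows that behavior with a tiny stand-in component: the condition is checked at render time, before any children are visited. The Node class and render function are hypothetical and only illustrate the pattern, not cmi-docx's internals.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Node:
    text: str
    children: list["Node"] = field(default_factory=list)
    condition: Optional[Callable[[], bool]] = None


def render(node: Node, depth: int = 0) -> list[str]:
    """Render a node and its children as indented lines."""
    # Check the condition first so a skipped node's children are never visited.
    if node.condition is not None and not node.condition():
        return []
    lines = ["  " * depth + node.text]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


flags = {"details": False}
tree = Node(
    "Section",
    children=[
        Node("Always visible"),
        Node("Details", condition=lambda: flags["details"]),
    ],
)

print(render(tree))  # ['Section', '  Always visible']
flags["details"] = True  # Changing the flag later affects the next render.
print(render(tree))  # ['Section', '  Always visible', '  Details']
```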

Template-based documents

Open an existing .docx as a template, replace placeholders, and insert new content:

import asyncio
from pathlib import Path
from cmi_docx.declarative import Document, DocumentTemplate, Section, Paragraph

async def main() -> None:
    doc = Document(
        sections=[
            Section(
                children=[Paragraph(text="Inserted content")],
            ),
        ],
    )

    template = DocumentTemplate(
        path=Path("template.docx"),
        replacements={"{{NAME}}": "Alice", "{{DATE}}": "2025-01-01"},
        paragraph_index=1,  # Insert after the first template paragraph
    )

    docx_doc = await doc.to_docx(template=template)
    docx_doc.save("output.docx")

asyncio.run(main())
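
The two template steps, placeholder replacement and insertion at an index, can be pictured with plain lists of strings. This is a sketch under an assumption: it treats paragraph_index as a list position, so that 1 places new content after the first template paragraph, which matches the comment in the example above. The function apply_template is hypothetical and not part of cmi-docx.

```python
def apply_template(
    template_paragraphs: list[str],
    replacements: dict[str, str],
    new_paragraphs: list[str],
    paragraph_index: int,
) -> list[str]:
    """Replace placeholders in each paragraph, then splice in new paragraphs."""
    replaced = []
    for text in template_paragraphs:
        for placeholder, value in replacements.items():
            text = text.replace(placeholder, value)
        replaced.append(text)
    # Assumed semantics: new content goes before the paragraph at this index.
    return replaced[:paragraph_index] + list(new_paragraphs) + replaced[paragraph_index:]


result = apply_template(
    ["Dear {{NAME}},", "Dated {{DATE}}"],
    {"{{NAME}}": "Alice", "{{DATE}}": "2025-01-01"},
    ["Inserted content"],
    paragraph_index=1,
)
print(result)
# ['Dear Alice,', 'Inserted content', 'Dated 2025-01-01']
```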

Download files

Source distribution: cmi_docx-0.6.10.tar.gz (56.0 kB)
Built distribution: cmi_docx-0.6.10-py3-none-any.whl (40.6 kB, Python 3)
