Additional tooling for Python-docx.

CMI-docx

Status: stable. License: LGPL-2.1.

cmi-docx is a Python library by the Child Mind Institute that extends python-docx with higher-level tooling for .docx file manipulation. It provides two complementary APIs:

  • Imperative API -- wrapper classes around python-docx objects for find/replace, formatting, insertion, and comments on existing documents.
  • Declarative API -- an async-first, component-based system for constructing documents from scratch with conditional rendering, lazy evaluation, and template support.

Features

  • Find and replace across an entire document (body, headers, footers, and tables), even when the target text is split across multiple runs by Word's internal formatting.
  • Style-aware replacement -- apply bold, italic, underline, font size, color, and more to replacement text.
  • Paragraph insertion -- insert paragraphs by text, by object, or as images at any position in the document body.
  • Run-level formatting -- read and write formatting on individual runs (bold, italic, underline, superscript, subscript, font size, font color).
  • Paragraph formatting -- control alignment, line spacing, space before/after, and font properties for entire paragraphs.
  • Table and cell formatting -- toggle table sections, set cell background colors, and configure cell borders.
  • Word comments -- programmatically add comments to paragraphs, runs, or ranges of runs, with automatic comment preservation during text edits.
  • Declarative document construction -- build documents as a tree of Section, Paragraph, TextRun, Table, and ImageRun components, then render to a python-docx Document with await doc.to_docx().
  • Async and lazy evaluation -- declare children as coroutines or callables; they are resolved concurrently via asyncio.gather.
  • Conditional rendering -- attach a condition callable to any component to include or exclude it at render time without building its subtree.
  • Template support -- open an existing .docx as a template, apply placeholder replacements, and insert new content at a specific paragraph index.

Installation

Install from PyPI:

pip install cmi-docx

Quick start

Imperative API

The imperative API wraps python-docx objects with extension classes that add search, replace, formatting, and insertion capabilities.

Find and replace

import docx
from cmi_docx import ExtendDocument, RunStyle

doc = docx.Document()
paragraph = doc.add_paragraph("Hello {{")
paragraph.add_run("FULL_NAME}}")

extend_doc = ExtendDocument(doc)
extend_doc.replace("{{FULL_NAME}}", "Jane Doe", RunStyle(bold=True))

print(doc.paragraphs[0].text)  # "Hello Jane Doe" (the replacement text is bold)
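
Why the run split matters: in the example above, no single run contains the whole placeholder, so a per-run search would miss it. The sketch below models runs as plain strings (no python-docx required) and maps a match in the joined text back to the runs it touches. `find_across_runs` is a hypothetical helper written for illustration; it is not part of cmi-docx, and the library's actual algorithm may differ.

```python
from __future__ import annotations

from itertools import accumulate


def find_across_runs(runs: list[str], needle: str) -> tuple[int, int] | None:
    """Return (first_run, last_run) indices covering the first match of needle.

    Hypothetical helper for illustration; not part of cmi-docx.
    """
    if not needle:
        return None
    joined = "".join(runs)
    start = joined.find(needle)
    if start == -1:
        return None
    end = start + len(needle)  # exclusive end offset in the joined text

    # ends[i] is the offset just past run i in the joined text.
    ends = list(accumulate(len(run) for run in runs))
    first = next(i for i, e in enumerate(ends) if e > start)
    last = next(i for i, e in enumerate(ends) if e >= end)
    return first, last


runs = ["Hello {{", "FULL_NAME}}"]
# No single run contains the whole placeholder...
assert not any("{{FULL_NAME}}" in run for run in runs)
# ...but the joined text does, and the match spans runs 0 and 1.
print(find_across_runs(runs, "{{FULL_NAME}}"))  # (0, 1)
```

Once the covering runs are known, a replacement can be written into the first run and the matched text trimmed from the rest, which is one plausible way to keep the surrounding formatting intact.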

Paragraph formatting

import docx
from cmi_docx import ExtendParagraph, ParagraphStyle
from docx.enum.text import WD_ALIGN_PARAGRAPH

doc = docx.Document()
paragraph = doc.add_paragraph("Formatted paragraph.")

ExtendParagraph(paragraph).format(
    ParagraphStyle(
        bold=True,
        italic=True,
        font_size=14,
        alignment=WD_ALIGN_PARAGRAPH.CENTER,
    )
)

Insert a run with formatting

import docx
from cmi_docx import ExtendParagraph, RunStyle

doc = docx.Document()
paragraph = doc.add_paragraph("")
paragraph.add_run("Hello ")
paragraph.add_run("world!")

ExtendParagraph(paragraph).insert_run(1, "beautiful ", RunStyle(bold=True))

print(paragraph.text)  # "Hello beautiful world!"

Add a comment

import docx
from cmi_docx import add_comment

document = docx.Document()
paragraph = document.add_paragraph("This needs review.")

add_comment(document, paragraph, "Reviewer", "Please check this section.")

Declarative API

The declarative API lets you build documents as a tree of components. All children are resolved concurrently, and components can be conditionally included or lazily constructed.

Simple document

import asyncio
from cmi_docx.declarative import Document, Section, Paragraph, TextRun

async def main() -> None:
    doc = Document(
        sections=[
            Section(
                children=[
                    Paragraph(text="Main Heading", heading=1),
                    Paragraph(
                        children=[
                            TextRun(text="Bold text", bold=True),
                            TextRun(text=" and "),
                            TextRun(text="italic text", italic=True),
                        ],
                    ),
                ],
            ),
        ],
        title="My Document",
        creator="Author Name",
    )

    docx_doc = await doc.to_docx()
    docx_doc.save("output.docx")

asyncio.run(main())

Async children

Components accept coroutines as children, which are resolved concurrently:

import asyncio
from cmi_docx.declarative import Document, Section, Paragraph

async def fetch_paragraph() -> Paragraph:
    await asyncio.sleep(0.1)  # Simulate an API call
    return Paragraph(text="Content fetched asynchronously")

async def main() -> None:
    doc = Document(
        sections=[
            Section(
                children=[
                    Paragraph(text="Static content"),
                    fetch_paragraph(),
                ],
            ),
        ],
    )

    docx_doc = await doc.to_docx()
    docx_doc.save("output.docx")

asyncio.run(main())
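
One way to picture the resolution step is a helper that accepts a mix of plain values, callables, and coroutines and awaits them together with asyncio.gather. The names `resolve_child`, `resolve_children`, and `slow_text` are hypothetical and exist only in this sketch; cmi-docx's internal implementation is not shown here and may differ.

```python
import asyncio
import inspect
from typing import Any


async def resolve_child(child: Any) -> Any:
    """Resolve one child: call it if callable, then await it if awaitable."""
    if callable(child):
        child = child()  # lazy child: constructed only at render time
    if inspect.isawaitable(child):
        child = await child
    return child


async def resolve_children(children: list) -> list:
    """Resolve all children concurrently, preserving declaration order."""
    return list(await asyncio.gather(*(resolve_child(c) for c in children)))


async def slow_text(text: str, delay: float = 0.1) -> str:
    await asyncio.sleep(delay)  # stand-in for an API call
    return text


async def main() -> list:
    return await resolve_children(
        ["static", lambda: "lazy", slow_text("first"), slow_text("second")]
    )


resolved = asyncio.run(main())
print(resolved)  # ['static', 'lazy', 'first', 'second']
```

asyncio.gather returns results in the order the awaitables were passed, so the document order stays stable even when a later child finishes first.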

Conditional rendering

Attach a condition callable to skip components without building their subtree:

from cmi_docx.declarative import Document, Section, Paragraph

include_details = False

doc = Document(
    sections=[
        Section(
            children=[
                Paragraph(text="Always visible"),
                Paragraph(
                    text="Only shown when details are enabled",
                    condition=lambda: include_details,
                ),
            ],
        ),
    ],
)
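
The phrase "without building their subtree" can be illustrated with a small stand-alone model: if children are supplied lazily and the condition is checked first, a skipped component never constructs its children. `Node`, `render`, and `expensive_details` below are hypothetical names for this sketch and are not cmi-docx classes or functions.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable

built: list[str] = []  # records which subtrees were actually constructed


@dataclass
class Node:
    """Hypothetical stand-in for a component; not a cmi-docx class."""

    text: str
    children: Callable[[], list[Node]] = list  # lazy: called only when rendered
    condition: Callable[[], bool] = lambda: True


def render(node: Node) -> list[str]:
    """Flatten a tree to text lines, skipping nodes whose condition is false."""
    if not node.condition():
        return []  # children() is never called, so the subtree is never built
    lines = [node.text]
    for child in node.children():
        lines.extend(render(child))
    return lines


def expensive_details() -> list[Node]:
    built.append("details")  # side effect shows when construction happens
    return [Node("Detail line")]


include_details = False
tree = Node(
    "Report",
    children=lambda: [
        Node("Always visible"),
        Node("Details", children=expensive_details, condition=lambda: include_details),
    ],
)
print(render(tree))  # ['Report', 'Always visible']
print(built)  # []
```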

Template-based documents

Open an existing .docx as a template, replace placeholders, and insert new content:

import asyncio
from pathlib import Path
from cmi_docx.declarative import Document, DocumentTemplate, Section, Paragraph

async def main() -> None:
    doc = Document(
        sections=[
            Section(
                children=[Paragraph(text="Inserted content")],
            ),
        ],
    )

    template = DocumentTemplate(
        path=Path("template.docx"),
        replacements={"{{NAME}}": "Alice", "{{DATE}}": "2025-01-01"},
        paragraph_index=1,  # Insert after the first template paragraph
    )

    docx_doc = await doc.to_docx(template=template)
    docx_doc.save("output.docx")

asyncio.run(main())
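
As a rough mental model of the template flow, the sketch below treats a template as a list of paragraph strings, applies the placeholder replacements, and splices the new content in at `paragraph_index`. `fill_template` is a hypothetical helper, and the order of operations (replace first, then insert) is an assumption made for this sketch rather than documented cmi-docx behavior.

```python
from __future__ import annotations


def fill_template(
    paragraphs: list[str],
    replacements: dict[str, str],
    new_content: list[str],
    paragraph_index: int,
) -> list[str]:
    """Replace placeholders, then insert new_content before paragraph_index.

    Hypothetical helper for illustration; not part of cmi-docx.
    """
    filled = []
    for text in paragraphs:
        for placeholder, value in replacements.items():
            text = text.replace(placeholder, value)
        filled.append(text)
    # Slicing keeps the template paragraphs on both sides of the insert point.
    return filled[:paragraph_index] + new_content + filled[paragraph_index:]


template = ["Dear {{NAME}},", "Signed on {{DATE}}."]
result = fill_template(
    template,
    {"{{NAME}}": "Alice", "{{DATE}}": "2025-01-01"},
    ["Inserted content"],
    paragraph_index=1,  # lands after the first template paragraph
)
print(result)  # ['Dear Alice,', 'Inserted content', 'Signed on 2025-01-01.']
```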
