
Additional tooling for python-docx.


CMI-docx

License: LGPL-2.1

cmi-docx is a Python library by the Child Mind Institute that extends python-docx with higher-level tooling for .docx file manipulation. It provides two complementary APIs:

  • Imperative API -- wrapper classes around python-docx objects for find/replace, formatting, insertion, and comments on existing documents.
  • Declarative API -- an async-first, component-based system for constructing documents from scratch with conditional rendering, lazy evaluation, and template support.

Features

  • Find and replace across an entire document (body, headers, footers, and tables), even when the target text is split across multiple runs by Word's internal formatting.
  • Style-aware replacement -- apply bold, italic, underline, font size, color, and more to replacement text.
  • Paragraph insertion -- insert paragraphs by text, by object, or as images at any position in the document body.
  • Run-level formatting -- read and write formatting on individual runs (bold, italic, underline, superscript, subscript, font size, font color).
  • Paragraph formatting -- control alignment, line spacing, space before/after, and font properties for entire paragraphs.
  • Table and cell formatting -- toggle table sections, set cell background colors, and configure cell borders.
  • Word comments -- programmatically add comments to paragraphs, runs, or ranges of runs, with automatic comment preservation during text edits.
  • Declarative document construction -- build documents as a tree of Section, Paragraph, TextRun, Table, and ImageRun components, then render to a python-docx Document with await doc.to_docx().
  • Async and lazy evaluation -- declare children as coroutines or callables; they are resolved concurrently via asyncio.gather.
  • Conditional rendering -- attach a condition callable to any component to include or exclude it at render time without building its subtree.
  • Template support -- open an existing .docx as a template, apply placeholder replacements, and insert new content at a specific paragraph index.

Installation

Install from PyPI:

pip install cmi-docx

Quick start

Imperative API

The imperative API wraps python-docx objects with extension classes that add search, replace, formatting, and insertion capabilities.

Find and replace

import docx
from cmi_docx import ExtendDocument, RunStyle

doc = docx.Document()
paragraph = doc.add_paragraph("Hello {{")
paragraph.add_run("FULL_NAME}}")

extend_doc = ExtendDocument(doc)
extend_doc.replace("{{FULL_NAME}}", "Jane Doe", RunStyle(bold=True))

print(doc.paragraphs[0].text)  # "Hello Jane Doe" ("Jane Doe" is rendered in bold)
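
To illustrate why a placeholder split across runs needs special handling, here is a minimal standard-library sketch that models runs as plain strings. It is not cmi-docx's implementation: the function name replace_across_runs and the choice to place the replacement in the first affected run are assumptions made for illustration only.

```python
def replace_across_runs(runs: list[str], needle: str, replacement: str) -> list[str]:
    """Replace the first occurrence of needle, even if it spans several runs.

    Hypothetical helper: runs are modelled as plain strings, not docx runs.
    """
    full_text = "".join(runs)
    start = full_text.find(needle) if needle else -1
    if start == -1:
        return list(runs)
    end = start + len(needle)

    result: list[str] = []
    offset = 0  # position of the current run within full_text
    inserted = False
    for run in runs:
        run_start, run_end = offset, offset + len(run)
        offset = run_end
        if run_end <= start or run_start >= end:
            # The run lies entirely outside the match: keep it unchanged.
            result.append(run)
            continue
        # Keep only the parts of this run that fall outside the match.
        before = run[: max(start - run_start, 0)]
        after = run[end - run_start:]
        middle = "" if inserted else replacement
        inserted = True
        result.append(before + middle + after)
    return result


print(replace_across_runs(["Hello {{", "FULL_NAME}}"], "{{FULL_NAME}}", "Jane Doe"))
# ['Hello Jane Doe', '']
```

Searching the joined text and then mapping the match back onto run boundaries keeps the text outside the match in its original runs, so its formatting would be preserved.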

Paragraph formatting

import docx
from cmi_docx import ExtendParagraph, ParagraphStyle
from docx.enum.text import WD_ALIGN_PARAGRAPH

doc = docx.Document()
paragraph = doc.add_paragraph("Formatted paragraph.")

ExtendParagraph(paragraph).format(
    ParagraphStyle(
        bold=True,
        italic=True,
        font_size=14,
        alignment=WD_ALIGN_PARAGRAPH.CENTER,
    )
)

Insert a run with formatting

import docx
from cmi_docx import ExtendParagraph, RunStyle

doc = docx.Document()
paragraph = doc.add_paragraph("")
paragraph.add_run("Hello ")
paragraph.add_run("world!")

ExtendParagraph(paragraph).insert_run(1, "beautiful ", RunStyle(bold=True))

print(paragraph.text)  # "Hello beautiful world!"

Add a comment

import docx
from cmi_docx import add_comment

document = docx.Document()
paragraph = document.add_paragraph("This needs review.")

add_comment(document, paragraph, "Reviewer", "Please check this section.")

Declarative API

The declarative API lets you build documents as a tree of components. All children are resolved concurrently, and components can be conditionally included or lazily constructed.

Simple document

import asyncio
from cmi_docx.declarative import Document, Section, Paragraph, TextRun

async def main() -> None:
    doc = Document(
        sections=[
            Section(
                children=[
                    Paragraph(text="Main Heading", heading=1),
                    Paragraph(
                        children=[
                            TextRun(text="Bold text", bold=True),
                            TextRun(text=" and "),
                            TextRun(text="italic text", italic=True),
                        ],
                    ),
                ],
            ),
        ],
        title="My Document",
        creator="Author Name",
    )

    docx_doc = await doc.to_docx()
    docx_doc.save("output.docx")

asyncio.run(main())

Async children

Components accept coroutines as children, which are resolved concurrently:

import asyncio
from cmi_docx.declarative import Document, Section, Paragraph

async def fetch_paragraph() -> Paragraph:
    await asyncio.sleep(0.1)  # Simulate an API call
    return Paragraph(text="Content fetched asynchronously")

async def main() -> None:
    doc = Document(
        sections=[
            Section(
                children=[
                    Paragraph(text="Static content"),
                    fetch_paragraph(),
                ],
            ),
        ],
    )

    docx_doc = await doc.to_docx()
    docx_doc.save("output.docx")

asyncio.run(main())
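
The concurrent resolution described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the library's code: resolve_child, resolve_children, and slow_text are hypothetical names, and children are modelled as strings rather than components.

```python
import asyncio
import inspect
from typing import Any


async def resolve_child(child: Any) -> Any:
    """Resolve one child: await awaitables, call callables, pass values through."""
    if inspect.isawaitable(child):
        return await child
    if callable(child):
        return child()
    return child


async def resolve_children(children: list[Any]) -> list[Any]:
    """Resolve all children concurrently; gather preserves the input order."""
    return list(await asyncio.gather(*(resolve_child(c) for c in children)))


async def slow_text(text: str, delay: float) -> str:
    """Stand-in for an API call that eventually yields content."""
    await asyncio.sleep(delay)
    return text


async def main() -> None:
    children = ["static", slow_text("fetched", 0.05), lambda: "lazy"]
    print(await resolve_children(children))


asyncio.run(main())
# ['static', 'fetched', 'lazy']
```

Because asyncio.gather schedules every child at once, several slow children wait in parallel rather than one after another, and the results still come back in declaration order.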

Conditional rendering

Attach a condition callable to skip components without building their subtree:

from cmi_docx.declarative import Document, Section, Paragraph

include_details = False

doc = Document(
    sections=[
        Section(
            children=[
                Paragraph(text="Always visible"),
                Paragraph(
                    text="Only shown when details are enabled",
                    condition=lambda: include_details,
                ),
            ],
        ),
    ],
)
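
As a rough model of how a condition can prevent a subtree from being built, the sketch below uses a hypothetical Node class and render function that are not part of cmi-docx. It assumes lazy children are zero-argument callables that are only invoked once the parent's condition has passed.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable, Optional, Union


@dataclass
class Node:
    """Hypothetical component: text, optional children, optional condition."""

    text: str
    children: list[Union[Node, Callable[[], Node]]] = field(default_factory=list)
    condition: Optional[Callable[[], bool]] = None


def render(node: Node) -> list[str]:
    """Return the rendered lines, skipping nodes whose condition is false."""
    if node.condition is not None and not node.condition():
        return []  # Children are never touched, so lazy ones are never built.
    lines = [node.text]
    for child in node.children:
        if callable(child):
            child = child()  # Build the lazy child only when it is needed.
        lines.extend(render(child))
    return lines


built: list[str] = []


def expensive_child() -> Node:
    built.append("details")  # Record that construction actually happened.
    return Node("Detail line")


tree = Node(
    "Report",
    children=[
        Node("Always visible"),
        Node("Details", children=[expensive_child], condition=lambda: False),
    ],
)

print(render(tree))  # ['Report', 'Always visible']
print(built)  # []
```

The condition is checked before any child is visited, which is why expensive_child is never called when the condition returns False.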

Template-based documents

Open an existing .docx as a template, replace placeholders, and insert new content:

import asyncio
from pathlib import Path
from cmi_docx.declarative import Document, DocumentTemplate, Section, Paragraph

async def main() -> None:
    doc = Document(
        sections=[
            Section(
                children=[Paragraph(text="Inserted content")],
            ),
        ],
    )

    template = DocumentTemplate(
        path=Path("template.docx"),
        replacements={"{{NAME}}": "Alice", "{{DATE}}": "2025-01-01"},
        paragraph_index=1,  # Insert after the first template paragraph
    )

    docx_doc = await doc.to_docx(template=template)
    docx_doc.save("output.docx")

asyncio.run(main())
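
The template flow above can be pictured as two steps: replace placeholders in the template's paragraphs, then splice the new content in at paragraph_index. The sketch below models paragraphs as strings. apply_template is a hypothetical helper, and the index semantics (index 1 means "after the first template paragraph") are taken from the comment in the example above rather than from the library's source.

```python
def apply_template(
    paragraphs: list[str],
    replacements: dict[str, str],
    new_paragraphs: list[str],
    paragraph_index: int,
) -> list[str]:
    """Replace placeholders, then insert new paragraphs at paragraph_index."""
    replaced = []
    for text in paragraphs:
        for placeholder, value in replacements.items():
            text = text.replace(placeholder, value)
        replaced.append(text)
    # Splice the new content between the template paragraphs.
    return replaced[:paragraph_index] + list(new_paragraphs) + replaced[paragraph_index:]


template_paragraphs = ["Dear {{NAME}},", "Dated {{DATE}}"]
print(
    apply_template(
        template_paragraphs,
        {"{{NAME}}": "Alice", "{{DATE}}": "2025-01-01"},
        ["Inserted content"],
        paragraph_index=1,
    )
)
# ['Dear Alice,', 'Inserted content', 'Dated 2025-01-01']
```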
