Skip to main content

Additional tooling for Python-docx.

Project description

CMI-docx

Build codecov Ruff stability-stable LGPL--2.1 License pages

cmi-docx is a Python library by the Child Mind Institute that extends python-docx with higher-level tooling for .docx file manipulation. It provides two complementary APIs:

  • Imperative API -- wrapper classes around python-docx objects for find/replace, formatting, insertion, and comments on existing documents.
  • Declarative API -- an async-first, component-based system for constructing documents from scratch with conditional rendering, lazy evaluation, and template support.

Features

  • Find and replace across an entire document (body, headers, footers, and tables), even when the target text is split across multiple runs by Word's internal formatting.
  • Style-aware replacement -- apply bold, italic, underline, font size, color, and more to replacement text.
  • Paragraph insertion -- insert paragraphs by text, by object, or as images at any position in the document body.
  • Run-level formatting -- read and write formatting on individual runs (bold, italic, underline, superscript, subscript, font size, font color).
  • Paragraph formatting -- control alignment, line spacing, space before/after, and font properties for entire paragraphs.
  • Table and cell formatting -- toggle table sections, set cell background colors, and configure cell borders.
  • Word comments -- programmatically add comments to paragraphs, runs, or ranges of runs, with automatic comment preservation during text edits.
  • Declarative document construction -- build documents as a tree of Section, Paragraph, TextRun, Table, and ImageRun components, then render to a python-docx Document with await doc.to_docx().
  • Async and lazy evaluation -- declare children as coroutines or callables; they are resolved concurrently via asyncio.gather.
  • Conditional rendering -- attach a condition callable to any component to include or exclude it at render time without building its subtree.
  • Template support -- open an existing .docx as a template, apply placeholder replacements, and insert new content at a specific paragraph index.

Installation

Install from PyPI:

pip install cmi-docx

Quick start

Imperative API

The imperative API wraps python-docx objects with extension classes that add search, replace, formatting, and insertion capabilities.

Find and replace

import docx
from cmi_docx import ExtendDocument, RunStyle

doc = docx.Document()
paragraph = doc.add_paragraph("Hello {{")
paragraph.add_run("FULL_NAME}}")

extend_doc = ExtendDocument(doc)
extend_doc.replace("{{FULL_NAME}}", "Jane Doe", RunStyle(bold=True))

print(doc.paragraphs[0].text)  # "Hello *Jane Doe*"

Paragraph formatting

import docx
from cmi_docx import ExtendParagraph, ParagraphStyle
from docx.enum.text import WD_ALIGN_PARAGRAPH

doc = docx.Document()
paragraph = doc.add_paragraph("Formatted paragraph.")

ExtendParagraph(paragraph).format(
    ParagraphStyle(
        bold=True,
        italic=True,
        font_size=14,
        alignment=WD_ALIGN_PARAGRAPH.CENTER,
    )
)

Insert a run with formatting

import docx
from cmi_docx import ExtendParagraph, RunStyle

doc = docx.Document()
paragraph = doc.add_paragraph("")
paragraph.add_run("Hello ")
paragraph.add_run("world!")

ExtendParagraph(paragraph).insert_run(1, "beautiful ", RunStyle(bold=True))

print(paragraph.text)  # "Hello beautiful world!"

Add a comment

import docx
from cmi_docx import add_comment

document = docx.Document()
paragraph = document.add_paragraph("This needs review.")

add_comment(document, paragraph, "Reviewer", "Please check this section.")

Declarative API

The declarative API lets you build documents as a tree of components. All children are resolved concurrently, and components can be conditionally included or lazily constructed.

Simple document

import asyncio
from cmi_docx.declarative import Document, Section, Paragraph, TextRun

async def main() -> None:
    doc = Document(
        sections=[
            Section(
                children=[
                    Paragraph(text="Main Heading", heading=1),
                    Paragraph(
                        children=[
                            TextRun(text="Bold text", bold=True),
                            TextRun(text=" and "),
                            TextRun(text="italic text", italic=True),
                        ],
                    ),
                ],
            ),
        ],
        title="My Document",
        creator="Author Name",
    )

    docx_doc = await doc.to_docx()
    docx_doc.save("output.docx")

asyncio.run(main())

Async children

Components accept coroutines as children, which are resolved concurrently:

import asyncio
from cmi_docx.declarative import Document, Section, Paragraph

async def fetch_paragraph() -> Paragraph:
    await asyncio.sleep(0.1)  # Simulate an API call
    return Paragraph(text="Content fetched asynchronously")

async def main() -> None:
    doc = Document(
        sections=[
            Section(
                children=[
                    Paragraph(text="Static content"),
                    fetch_paragraph(),
                ],
            ),
        ],
    )

    docx_doc = await doc.to_docx()
    docx_doc.save("output.docx")

asyncio.run(main())

Conditional rendering

Attach a condition callable to skip components without building their subtree:

from cmi_docx.declarative import Document, Section, Paragraph

include_details = False

doc = Document(
    sections=[
        Section(
            children=[
                Paragraph(text="Always visible"),
                Paragraph(
                    text="Only shown when details are enabled",
                    condition=lambda: include_details,
                ),
            ],
        ),
    ],
)

Template-based documents

Open an existing .docx as a template, replace placeholders, and insert new content:

import asyncio
from pathlib import Path
from cmi_docx.declarative import Document, DocumentTemplate, Section, Paragraph

async def main() -> None:
    doc = Document(
        sections=[
            Section(
                children=[Paragraph(text="Inserted content")],
            ),
        ],
    )

    template = DocumentTemplate(
        path=Path("template.docx"),
        replacements={"{{NAME}}": "Alice", "{{DATE}}": "2025-01-01"},
        paragraph_index=1,  # Insert after the first template paragraph
    )

    docx_doc = await doc.to_docx(template=template)
    docx_doc.save("output.docx")

asyncio.run(main())

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

cmi_docx-0.6.7.tar.gz (55.2 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

cmi_docx-0.6.7-py3-none-any.whl (40.3 kB view details)

Uploaded Python 3

File details

Details for the file cmi_docx-0.6.7.tar.gz.

File metadata

  • Download URL: cmi_docx-0.6.7.tar.gz
  • Upload date:
  • Size: 55.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for cmi_docx-0.6.7.tar.gz
Algorithm Hash digest
SHA256 7f5f3c921b13e01171fca2d1da3a6c4d483e7c9ddc1b7ef5d25082a1e1858e48
MD5 fbf533078588b4a117a529e0317bdfe1
BLAKE2b-256 26ea410c056a97b911e5da77529925965b692e623ad40ca71301bfe7af813282

See more details on using hashes here.

Provenance

The following attestation bundles were made for cmi_docx-0.6.7.tar.gz:

Publisher: pypi.yaml on childmindresearch/cmi-docx

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file cmi_docx-0.6.7-py3-none-any.whl.

File metadata

  • Download URL: cmi_docx-0.6.7-py3-none-any.whl
  • Upload date:
  • Size: 40.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for cmi_docx-0.6.7-py3-none-any.whl
Algorithm Hash digest
SHA256 66cc8ab41a70a00257551a833c7614039490940a526d95f742662b7a71e02c87
MD5 f4a301381e8e6a4abbff029661f69311
BLAKE2b-256 f2c35db59eebbb3c6e141b4219a8599053da43a5e8bc7ff5440771b156f3ca56

See more details on using hashes here.

Provenance

The following attestation bundles were made for cmi_docx-0.6.7-py3-none-any.whl:

Publisher: pypi.yaml on childmindresearch/cmi-docx

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page