YAML-to-Word document generator for automated pipelines and LLM-driven content creation

docsmith

YAML-in, Word-out. Define document content in a simple YAML schema and docsmith forges it into a professionally formatted Word (.docx) file.

Why docsmith?

docsmith is built for automated pipelines and LLM-driven content generation. The YAML input format is deliberately simple -- flat, predictable, and easy for any language model or script to produce. No template engine, no programming required. Hand it a YAML file (or pipe it via stdin), get a Word document back.

Use cases:

  • LLM document generation -- have an AI produce structured YAML, then render to Word
  • CI/CD pipelines -- generate reports, proposals, or compliance documents as build artifacts
  • Batch processing -- convert a directory of YAML files to Word in one pass
  • Content-first authoring -- focus on content in YAML, let docsmith handle formatting
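The batch-processing case can be sketched from Python. This is a minimal sketch, not part of docsmith: `plan_batch` and `run_batch` are hypothetical helper names, and it assumes docsmith is on the PATH and exits with a non-zero status on failure. It calls the CLI once per file using only the `docsmith FILE -o DIR` form shown under Usage.

```python
import subprocess
from pathlib import Path


def plan_batch(input_dir, output_dir):
    """Return one docsmith command per .yaml file in input_dir, sorted by name."""
    sources = sorted(Path(input_dir).glob("*.yaml"))
    # `docsmith FILE -o DIR` is the invocation documented under Usage.
    return [["docsmith", str(src), "-o", str(output_dir)] for src in sources]


def run_batch(input_dir, output_dir):
    """Run the planned commands in order, stopping at the first failure."""
    # Creating the directory up front is a precaution; whether docsmith
    # creates it on its own is not stated in this README.
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    for cmd in plan_batch(input_dir, output_dir):
        # Assumption: docsmith returns a non-zero exit status on error.
        subprocess.run(cmd, check=True)
```

Calling `run_batch("reports/", "output/")` would render every `.yaml` file under `reports/`; keeping the planning step separate makes it easy to print the commands for a dry run.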

Installation

pipx install docsmith

Or with pip:

pip install docsmith

Usage

# Generate a Word document from a YAML file
docsmith input.yaml

# Specify output directory
docsmith input.yaml -o output/

# Pipe YAML from stdin
cat input.yaml | docsmith -

# Pipe directly from an LLM
llm "write a project status report in docsmith YAML format" | docsmith -

When reading from stdin (-), the output file is docsmith_output.docx in the current directory. Use -o to override the output directory.
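The same stdin workflow can be driven from a script. The sketch below is an illustration only: `build_command` and `render_from_string` are hypothetical helpers, and it assumes docsmith is installed on the PATH and reports failure through its exit status. It uses only the `-` and `-o` arguments described above.

```python
import subprocess


def build_command(source="-", output_dir=None):
    """Build the docsmith argument list; "-" means read YAML from stdin."""
    cmd = ["docsmith", source]
    if output_dir is not None:
        cmd += ["-o", output_dir]
    return cmd


def render_from_string(yaml_text, output_dir=None):
    """Pipe yaml_text to `docsmith -` and return the completed process."""
    return subprocess.run(
        build_command("-", output_dir),
        input=yaml_text,       # sent to docsmith on stdin
        text=True,             # treat stdin/stdout/stderr as str, not bytes
        capture_output=True,   # keep docsmith's messages for logging
        check=True,            # assumption: non-zero exit status means failure
    )
```

With no `output_dir`, the result should land in docsmith_output.docx in the current directory, as noted above.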

Also works as a Python module:

python -m docsmith input.yaml

YAML Document Format

title: "Document Title"
subtitle: "Subtitle text"
status: "Draft"

content:
  - heading: "Section Heading"
    level: 1

  - text: "Paragraph with **bold** and *italic* support."

  - bullets:
      - "First bullet point"
      - "Second bullet with **bold**"

  - numbered:
      - "Step one"
      - "Step two"

  - table:
      headers: ["Column A", "Column B"]
      rows:
        - ["Cell 1", "Cell 2"]
        - ["Cell 3", "Cell 4"]

  - image:
      path: "diagram.png"
      width: 4.0
      alignment: center
      caption: "Figure 1: System architecture"

  - decision: "A decision callout that needs stakeholder input"
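A script can assemble the same structure from plain dictionaries. The following is a sketch under stated assumptions: the helper names (`heading`, `paragraph`, `bullets`, `table`, `build_document`, `to_yaml_text`) are hypothetical, and serialization uses `json.dumps` on the assumption that docsmith's YAML loader accepts JSON-style flow syntax, as standard YAML parsers generally do. If it does not, swap in a YAML library's dump function.

```python
import json


def heading(text, level=1):
    # `level` sits next to `heading` in the same mapping, as in the example above.
    return {"heading": text, "level": level}


def paragraph(text):
    return {"text": text}


def bullets(*items):
    return {"bullets": list(items)}


def table(headers, rows):
    return {"table": {"headers": list(headers), "rows": [list(r) for r in rows]}}


def build_document(title, content, subtitle=None, status=None):
    """Assemble the top-level mapping; optional fields are omitted when unset."""
    doc = {"title": title}
    if subtitle is not None:
        doc["subtitle"] = subtitle
    if status is not None:
        doc["status"] = status
    doc["content"] = list(content)
    return doc


def to_yaml_text(doc):
    # Assumption: the loader accepts JSON-style flow syntax as YAML input.
    return json.dumps(doc, indent=2, ensure_ascii=False)
```

The resulting text can be written to a file or piped to `docsmith -`.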

Supported Block Types

Block      Purpose
heading    Section heading (level 1-4)
text       Paragraph with inline bold/italic
bullets    Unordered list
numbered   Ordered list
table      Table with headers and rows
image      Embedded PNG/JPEG image
decision   Red decision callout
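When YAML comes from an LLM, a quick pre-flight check against this list can catch malformed blocks before rendering. The sketch below is not docsmith's own validation, which this README does not describe: `check_blocks` is a hypothetical helper, and treating a missing `level` as 1 is an assumption.

```python
BLOCK_TYPES = {"heading", "text", "bullets", "numbered", "table", "image", "decision"}


def check_blocks(content):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for i, block in enumerate(content):
        if not isinstance(block, dict):
            problems.append(f"block {i}: expected a mapping")
            continue
        kinds = BLOCK_TYPES.intersection(block)
        if len(kinds) != 1:
            problems.append(
                f"block {i}: expected exactly one block type, found {sorted(kinds)}"
            )
            continue
        if "heading" in kinds:
            # Assumption: a heading without `level` defaults to level 1.
            level = block.get("level", 1)
            if level not in (1, 2, 3, 4):
                problems.append(f"block {i}: heading level must be 1-4, got {level!r}")
    return problems
```

Passing the parsed `content` list through `check_blocks` before invoking docsmith gives the generating model or script a short list of issues to correct.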

Image Block Options

Option     Required  Default  Description
path       Yes       --       File path relative to the YAML file, or absolute
width      No        5.0      Width in inches (aspect ratio preserved)
alignment  No        left     left, center, or right
caption    No        --       Caption text displayed below the image in italic
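The defaults and path rule in this table can be expressed as a small normalization step. This is an illustrative sketch rather than docsmith's implementation: `normalize_image` is a hypothetical function, and raising `ValueError` for bad input is an assumption about reasonable behavior.

```python
from pathlib import Path

ALIGNMENTS = ("left", "center", "right")


def normalize_image(options, yaml_path):
    """Apply the documented defaults and resolve `path` against the YAML file."""
    if "path" not in options:
        raise ValueError("image block requires 'path'")
    alignment = options.get("alignment", "left")
    if alignment not in ALIGNMENTS:
        raise ValueError(f"alignment must be one of {ALIGNMENTS}, got {alignment!r}")
    image_path = Path(options["path"])
    if not image_path.is_absolute():
        # Relative paths are resolved against the YAML file's directory.
        image_path = Path(yaml_path).parent / image_path
    return {
        "path": image_path,
        "width": float(options.get("width", 5.0)),  # inches
        "alignment": alignment,
        "caption": options.get("caption"),           # None when omitted
    }
```

For example, `normalize_image({"path": "diagram.png"}, "docs/report.yaml")` resolves the image to `docs/diagram.png` with a width of 5.0 inches and left alignment.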

Document Metadata

docsmith sets SharePoint/OneDrive-compatible document properties:

  • dc:creator and cp:lastModifiedBy set to "docsmith"
  • dcterms:created and dcterms:modified set to generation timestamp
  • dc:title and dc:subject populated from YAML metadata
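These properties can be checked after generation with the standard library alone, since a .docx file is a ZIP archive. The sketch below assumes the core properties part lives at its conventional location, `docProps/core.xml`; `read_core_properties` is a hypothetical helper, not part of docsmith.

```python
import zipfile
import xml.etree.ElementTree as ET

# Standard namespaces used by the OOXML core properties part.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

FIELDS = (
    "dc:creator",
    "cp:lastModifiedBy",
    "dcterms:created",
    "dcterms:modified",
    "dc:title",
    "dc:subject",
)


def read_core_properties(docx):
    """Return the listed core properties from a .docx path or file-like object."""
    with zipfile.ZipFile(docx) as archive:
        # Assumption: the part is stored at the conventional docProps/core.xml.
        root = ET.fromstring(archive.read("docProps/core.xml"))
    # Properties that are absent from the file come back as None.
    return {name: root.findtext(name, default=None, namespaces=NS) for name in FIELDS}
```

Running `read_core_properties("docsmith_output.docx")` in a pipeline test is one way to confirm the metadata before uploading to SharePoint or OneDrive.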

Development

git clone https://github.com/dawsonlp/docsmith.git
cd docsmith
uv venv .venv
source .venv/bin/activate
uv pip install -e ".[dev]"

# Install pre-commit hooks (ruff format + lint)
pre-commit install

CI runs ruff lint/format checks and pytest on every push and PR. Releases to PyPI are automated via GitHub Actions trusted publishing on tagged releases.

Future Output Formats

Word is the first supported output format. The YAML schema is designed so that future versions can render the same source to other formats (PDF, HTML, Markdown).

License

GPL-3.0-or-later. See LICENSE for details.
