Generate styled Word (.docx) documents from .spec.md files with optional Azure AI enrichment and image generation.

Document Generator

Reads a .spec.md file and produces a Word document (.docx) with styled headings, bullet lists, tables, images, AI-generated visuals, and AI-enriched content.

Adapted from microsoft/presentations — same spec format, Word output instead of PowerPoint.

Prerequisites

  • Python 3.10+ — Download
  • pip — included with Python; used to install dependencies
  • Azure Developer CLI (azd) — required for provisioning Azure infrastructure (Install)
  • Azure CLI (az) — run az login so DefaultAzureCredential can authenticate (Install)

Quick Start

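# Create and activate a virtual environment
python -m venv .venv
.venv\Scripts\Activate.ps1    # Windows PowerShell; on macOS/Linux: source .venv/bin/activate

# Install dependencies and build the example spec
pip install -r requirements.txt
python documents.py .speckit/specifications/example.spec.md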

Project Structure

documents.py              # thin wrapper – delegates to src/
src/
├── __init__.py           # package exports (main, render, parse_spec)
├── cli.py                # argparse CLI entry point
├── spec_parser.py        # .spec.md → metadata + section list
├── style.py              # Style class (font sizes from front-matter)
├── sections.py           # section builder functions (one per layout)
├── images.py             # image generation via Azure AI REST endpoint
├── enrichment.py         # ContentUrl fetching & note enrichment via Azure AI
├── renderer.py           # orchestrates parsing → enrichment → images → docx
└── spec_writer.py        # serialize enriched spec back to .spec.md
tests/
├── test_cli.py           # CLI argument parsing
├── test_renderer.py      # end-to-end render pipeline
├── test_sections.py      # section builder functions
├── test_spec_parser.py   # .spec.md parsing
├── test_spec_writer.py   # spec round-trip writing
└── test_style.py         # Style resolution from front-matter

Features

  • Section types: title, content, section-header, two-column, resource-box
  • Static images: reference local files with **Image**: path
  • AI-generated images: describe an image with **ImagePrompt** — generated via the Azure AI image endpoint and cached locally
  • ContentUrls & enrichment: add **ContentUrls** per section to fetch reference content and auto-enrich both bullets and notes via Azure AI Inference
  • Enrichment caching: enriched sections are written back to the spec file with **Enriched**: true so subsequent builds skip re-enrichment (override with --refetch)
  • Style from spec: font sizes and colors are configurable in the front-matter style: block
  • Versioned output: each build creates a new versioned .docx so previous runs are never overwritten
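
One way versioned output could work is sketched below. The `_vN` naming scheme and the `next_versioned_path` helper are assumptions made for illustration; the project's actual versioning scheme is not documented here and may differ.

```python
from pathlib import Path


def next_versioned_path(output_dir: Path, filename: str) -> Path:
    """Return the first '<stem>_vN<suffix>' path in output_dir that does not exist.

    Hypothetical helper: the '_vN' suffix is an assumed naming scheme.
    """
    name = Path(filename)
    version = 1
    while True:
        candidate = output_dir / f"{name.stem}_v{version}{name.suffix}"
        if not candidate.exists():
            return candidate
        version += 1
```

Because the helper only ever returns a path that is not yet on disk, a build that writes to the returned path cannot overwrite an earlier run.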

Spec File Format

Spec files use Markdown with YAML front matter:

---
title: My Document
subtitle: A subtitle
output: My_Document.docx
text_model: gpt-4o-mini
image_model: gpt-image-1.5
style:
  title_font_size: 28
  body_font_size: 11
  heading_font_size: 18
  heading_color: '#1F2937'
  accent_color: '#0078D4'
---

## [title] My Title

**Subtitle**: Author name here

---

## [content] Section Title

- Bullet one
- Bullet two

**Image**: images/diagram.png
**ImagePrompt**: A futuristic cityscape at sunset

**ContentUrls**:
- https://learn.microsoft.com/azure/ai-services/openai/overview

**Notes**: Additional context for this section.

Section Types

Type            Description
title           Cover page with large centred title + subtitle
content         Heading + bullet list
section-header  Page break with prominent section heading
two-column      Side-by-side content via table (**Left**: / **Right**:)
resource-box    Labelled resource tables with name/URL rows
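
With one builder per section type, a renderer can look up the builder by type name. The sketch below shows that dispatch pattern with a small registry. All names here (`BUILDERS`, `builder`, `build`) are hypothetical, and the builders return plain text lines as a stand-in for real Word output; the project's `sections.py` may be organized differently.

```python
from typing import Callable

Builder = Callable[[dict], list[str]]
BUILDERS: dict[str, Builder] = {}  # hypothetical registry: section type -> builder


def builder(section_type: str) -> Callable[[Builder], Builder]:
    """Decorator that registers a builder function under a section type."""
    def register(func: Builder) -> Builder:
        BUILDERS[section_type] = func
        return func
    return register


@builder("title")
def build_title(section: dict) -> list[str]:
    # Stand-in for a cover page: title in capitals, then the subtitle
    return [section["title"].upper(), section.get("subtitle", "")]


@builder("content")
def build_content(section: dict) -> list[str]:
    # Stand-in for a heading followed by a bullet list
    return [section["title"]] + [f"  * {b}" for b in section.get("bullets", [])]


def build(section: dict) -> list[str]:
    """Dispatch a parsed section to the builder registered for its type."""
    func = BUILDERS.get(section["type"])
    if func is None:
        raise ValueError(f"unknown section type: {section['type']!r}")
    return func(section)
```

In this arrangement, supporting a new layout means adding one decorated function, and an unrecognized type fails loudly instead of being skipped.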

CLI Reference

python documents.py <spec-file> [options]

positional arguments:
  spec                  Path to the .spec.md file

options:
  -o, --output-dir DIR  Output directory (default: output)
  --image-model MODEL   Image generation model name (overrides front-matter)
  --refetch             Re-fetch and regenerate all AI enrichments
  --sections SELECTION  Section numbers to generate (1-indexed).
                        Examples: '5', '3-7', '1,3,5-8'. Default: all.
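
The --sections syntax can be expanded into explicit section numbers along the following lines. This is an illustrative sketch, not the code in `cli.py`: the function name `parse_selection`, the `total` parameter, and the error handling are assumptions.

```python
def parse_selection(selection: str, total: int) -> list[int]:
    """Expand a selection such as '1,3,5-8' into sorted 1-indexed section numbers."""
    chosen: set[int] = set()
    for part in selection.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            # A range such as '3-7' includes both endpoints
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        if start < 1 or end > total or start > end:
            raise ValueError(f"invalid selection {part!r} for {total} sections")
        chosen.update(range(start, end + 1))
    return sorted(chosen)
```

For a ten-section spec, `parse_selection("1,3,5-8", 10)` returns `[1, 3, 5, 6, 7, 8]`; overlapping parts are merged because the numbers are collected in a set.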

Running Tests

pip install pytest
pytest tests/ -v

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit Contributor License Agreements.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.
