Markdown-driven MCP server to create, read and edit Word (.docx), Excel (.xlsx), PowerPoint (.pptx) and PDF documents — by the Touka project.

These details have not been verified by PyPI

Project description

mcp-docgen

A Markdown-driven Model Context Protocol (MCP) server to create, read and edit Word (.docx), Excel (.xlsx), PowerPoint (.pptx) and PDF documents.

Built entirely on mature, permissively licensed Python libraries (python-docx, python-pptx, openpyxl, XlsxWriter, reportlab, pypdf, markdown-it-py), with no proprietary dependencies. MIT licensed.

Part of the Touka project: giving AI agents the ability to produce, read and edit real Office documents using only open-source building blocks.

Why

LLMs are great at producing Markdown. mcp-docgen converts Markdown to polished Office documents — and reads them back to Markdown — so an MCP-capable assistant (Claude Desktop, Touka, …) can run a full read → edit → write loop on real .docx / .xlsx / .pptx / .pdf files.

Install & run

uvx mcp-docgen          # run the published package from PyPI
# or, from a local checkout:
uv sync && uv run mcp-docgen

The server speaks MCP over stdio.

MCP client configuration

{
  "mcpServers": {
    "docgen": {
      "command": "uvx",
      "args": ["mcp-docgen"],
      "env": { "MCP_DOCGEN_OUTPUT_DIR": "/absolute/path/to/workdir" }
    }
  }
}

From a local checkout, swap the command for:

{ "command": "uv", "args": ["run", "--directory", "/path/to/mcp-docgen", "mcp-docgen"] }
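
The two configuration variants above differ only in the launch command, so they can be generated from one helper. The following is a minimal sketch, not part of mcp-docgen: the function name `client_config` and its `checkout` parameter are hypothetical, and keeping the `env` block in the local-checkout variant is an assumption.

```python
import json


def client_config(output_dir, checkout=None):
    """Build an "mcpServers" entry for mcp-docgen.

    `checkout` (hypothetical parameter) switches from the published
    `uvx` command to the local-checkout `uv run` command.
    """
    if checkout is None:
        entry = {"command": "uvx", "args": ["mcp-docgen"]}
    else:
        entry = {
            "command": "uv",
            "args": ["run", "--directory", checkout, "mcp-docgen"],
        }
    # Assumption: the working directory is passed the same way in both variants.
    entry["env"] = {"MCP_DOCGEN_OUTPUT_DIR": output_dir}
    return {"mcpServers": {"docgen": entry}}


if __name__ == "__main__":
    print(json.dumps(client_config("/absolute/path/to/workdir"), indent=2))
```

The printed JSON can be pasted into the client's configuration file; where that file lives depends on the MCP client.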

Tools

Create (Markdown / structured data → file)

| Tool | Input → Output |
| --- | --- |
| `create_docx(markdown, output_path, title?)` | Markdown → Word |
| `create_pptx(markdown, output_path, title?)` | Markdown → PowerPoint |
| `create_pdf(markdown, output_path, title?)` | Markdown → PDF |
| `create_xlsx(sheets, output_path)` | structured rows → Excel |

Markdown features: headings, bold / italic / inline code, bullet & numbered lists (nested), tables, block quotes, fenced code blocks, horizontal rules.

PowerPoint slide convention: a `# Heading` line starts a new slide and becomes its title; the content below it becomes bullet points; a `---` line forces a slide break; the optional `title` argument adds a leading title slide.
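
To make the slide convention concrete, here is a minimal sketch of how Markdown might be split into slides. It is an illustration only, not mcp-docgen's implementation: `split_slides` is a hypothetical name, only level-1 headings and `- ` / `* ` bullets are handled, and every other line is kept as plain bullet text.

```python
def split_slides(markdown, title=None):
    """Split Markdown into slide dicts following the documented convention."""
    slides = []
    current = None  # slide currently receiving content
    if title:
        # The optional title yields a leading title slide.
        slides.append({"title": title, "bullets": []})
    for raw in markdown.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("# "):
            # A level-1 heading starts a new slide and becomes its title.
            current = {"title": line[2:].strip(), "bullets": []}
            slides.append(current)
        elif line == "---":
            # A rule forces a break: the next content opens a fresh slide.
            current = None
        else:
            if current is None:
                current = {"title": "", "bullets": []}
                slides.append(current)
            # Drop a leading bullet marker, keep everything else verbatim.
            text = line[2:] if line[:2] in ("- ", "* ") else line
            current["bullets"].append(text.strip())
    return slides


if __name__ == "__main__":
    deck = split_slides("# Intro\n- a\n- b\n---\nloose\n# Next\n- c", title="Deck")
    for slide in deck:
        print(slide)
```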

Excel `sheets` argument: [{ "name": str, "rows": [[cell, …], …], "header"?: bool }]. Cells may be strings, numbers, booleans or null. The first row of each sheet is rendered as a bold, frozen header unless "header": false is set.
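
A `sheets` payload in the shape described above can be assembled and checked on the client side before it is sent. This is a hedged sketch: `make_sheet` is a hypothetical helper, and the type check only mirrors the cell types listed above rather than whatever validation the server performs.

```python
import json


def make_sheet(name, rows, header=True):
    """Return one entry of the `sheets` list, checking cell types first."""
    for row in rows:
        for cell in row:
            # Allowed cells: strings, numbers, booleans, or null (None).
            if cell is not None and not isinstance(cell, (str, int, float, bool)):
                raise TypeError(f"unsupported cell value: {cell!r}")
    sheet = {"name": name, "rows": [list(row) for row in rows]}
    if not header:
        # Only emitted when the first row should NOT be treated as a header.
        sheet["header"] = False
    return sheet


if __name__ == "__main__":
    sheets = [
        make_sheet("Sales", [["Region", "Q1"], ["North", 1200], ["South", None]]),
        make_sheet("Notes", [["free text, no header row"]], header=False),
    ]
    print(json.dumps(sheets, indent=2))
```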

Read (file → Markdown / structured data)

| Tool | Returns |
| --- | --- |
| `read_docx(input_path)` | `{ "markdown": … }` |
| `read_pptx(input_path)` | `{ "markdown": … }` |
| `read_xlsx(input_path)` | `{ "sheets": [{ "name", "rows" }] }` (round-trips with `create_xlsx`) |
| `read_pdf(input_path)` | `{ "num_pages", "pages": […], "text" }` |

Because read_docx and read_pptx return Markdown, a document can be revised without a dedicated in-place tool: read the file, edit the Markdown, then regenerate it with the matching create_* tool.
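
The read, edit, regenerate loop can be sketched as follows. The tool names, argument names and result keys come from the tables above; everything else is an assumption. In particular, `call_tool` is a hypothetical stand-in for whatever function your MCP client exposes to invoke a tool, and `make_fake_client` is an in-memory substitute so the sketch runs without a server.

```python
def revise_docx(call_tool, input_path, output_path, transform):
    """Read a .docx as Markdown, transform it, and regenerate the file."""
    markdown = call_tool("read_docx", {"input_path": input_path})["markdown"]
    result = call_tool(
        "create_docx",
        {"markdown": transform(markdown), "output_path": output_path},
    )
    return result["path"]


def make_fake_client(store):
    """Hypothetical in-memory stand-in for a real MCP client call."""

    def call_tool(name, arguments):
        if name == "read_docx":
            return {"markdown": store[arguments["input_path"]]}
        if name == "create_docx":
            store[arguments["output_path"]] = arguments["markdown"]
            return {"path": "/workdir/" + arguments["output_path"]}
        raise ValueError(f"unknown tool: {name}")

    return call_tool


if __name__ == "__main__":
    files = {"report.docx": "# Report\n\nDraft text."}
    client = make_fake_client(files)
    path = revise_docx(
        client, "report.docx", "report-v2.docx",
        lambda md: md.replace("Draft", "Final"),
    )
    print(path, files["report-v2.docx"], sep="\n")
```

Regenerating from Markdown rebuilds the whole document, so formatting that Markdown cannot express may not survive the loop.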

Edit (in-place, preserving the rest)

| Tool | Effect |
| --- | --- |
| `edit_xlsx(input_path, output_path, edits)` | set cells / append rows / add sheets, keeping other sheets, formulas & formatting |
| `append_docx(input_path, output_path, markdown)` | append Markdown content to the end |
| `append_pptx(input_path, output_path, markdown)` | append Markdown-derived slides to the end |

edits = { "set_cells": [{"sheet","cell","value"}], "append_rows": [{"sheet","rows"}], "add_sheet": [{"name","rows"}] }.
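
To illustrate what an `edits` payload expresses, the sketch below applies one to plain row lists in memory. It is a model of the semantics, not the server's code, and it rests on assumptions the text above does not state: that `cell` uses A1-style references such as "B2", and that the three edit kinds are applied in the order set_cells, append_rows, add_sheet. The names `cell_to_index` and `apply_edits` are hypothetical.

```python
import re


def cell_to_index(ref):
    """Convert an A1-style reference (assumed format) to zero-based (row, col)."""
    match = re.fullmatch(r"([A-Za-z]+)([1-9][0-9]*)", ref)
    if not match:
        raise ValueError(f"bad cell reference: {ref!r}")
    col = 0
    for ch in match.group(1).upper():
        col = col * 26 + (ord(ch) - ord("A") + 1)
    return int(match.group(2)) - 1, col - 1


def apply_edits(sheets, edits):
    """Apply an edits payload to a list of {"name", "rows"} dicts (copying rows)."""
    book = {s["name"]: [list(r) for r in s["rows"]] for s in sheets}
    for edit in edits.get("set_cells", []):
        rows = book[edit["sheet"]]
        r, c = cell_to_index(edit["cell"])
        while len(rows) <= r:  # grow the sheet downward if needed
            rows.append([])
        rows[r].extend([None] * (c + 1 - len(rows[r])))  # pad the row
        rows[r][c] = edit["value"]
    for edit in edits.get("append_rows", []):
        book[edit["sheet"]].extend(list(r) for r in edit["rows"])
    for edit in edits.get("add_sheet", []):
        if edit["name"] in book:
            raise ValueError(f"sheet already exists: {edit['name']!r}")
        book[edit["name"]] = [list(r) for r in edit["rows"]]
    return [{"name": name, "rows": rows} for name, rows in book.items()]


if __name__ == "__main__":
    before = [{"name": "Sales", "rows": [["Region", "Q1"], ["North", 100]]}]
    edits = {
        "set_cells": [{"sheet": "Sales", "cell": "B2", "value": 150}],
        "append_rows": [{"sheet": "Sales", "rows": [["South", 90]]}],
        "add_sheet": [{"name": "Notes", "rows": [["ok"]]}],
    }
    print(apply_edits(before, edits))
```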

PDF page operations

| Tool | Effect |
| --- | --- |
| `pdf_merge(input_paths, output_path)` | concatenate PDFs in order |
| `pdf_split(input_path, output_dir?)` | one file per page |
| `pdf_extract(input_path, pages, output_path)` | extract a page subset (e.g. `"1-3,5"`) |

Note on PDF "editing": clean open-source PDF editing means page operations (merge / split / extract), not reflowing or replacing body text — PDFs are not designed for in-place text editing. To revise PDF content, regenerate with create_pdf.
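
A page subset such as "1-3,5" has to be expanded into individual page numbers before pages can be extracted. The sketch below shows one way to do that. It assumes page numbers are 1-based and ranges are inclusive, which the example string suggests but the text does not state; `parse_pages` is a hypothetical name, not a function exported by mcp-docgen.

```python
def parse_pages(spec, num_pages):
    """Expand a spec like "1-3,5" into a list of 1-based page numbers."""
    pages = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            raise ValueError(f"empty item in page spec {spec!r}")
        first, sep, last = part.partition("-")
        start = int(first)  # raises ValueError on non-numeric input
        end = int(last) if sep else start
        if not 1 <= start <= end <= num_pages:
            raise ValueError(f"page range {part!r} is outside 1..{num_pages}")
        pages.extend(range(start, end + 1))
    return pages


if __name__ == "__main__":
    print(parse_pages("1-3,5", num_pages=6))
```

Validating against the document's page count up front means a bad request fails before any output file is written.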

Create/edit tools return {"path": <absolute path>}; pdf_split returns {"paths": […]}.

Directories & safety

Output files are written inside MCP_DOCGEN_OUTPUT_DIR (default ./out).
Input files (read / edit) are read from MCP_DOCGEN_INPUT_DIR (default = the output dir), so a read → edit → write loop shares one working directory.
Every path is interpreted relative to its base directory; any path that escapes the base (via .. or an absolute path) is rejected. Missing input files and wrong file suffixes raise errors.
The server makes no network calls and spawns no subprocesses.
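
The directory rules above can be sketched with `pathlib`. This is an illustration of the technique under stated assumptions, not the server's actual code: `base_dirs` and `resolve_inside` are hypothetical names, the exception types are guesses, and `Path.is_relative_to` requires Python 3.9 or later.

```python
import os
from pathlib import Path


def base_dirs(env=None):
    """Return (input_dir, output_dir) using the documented defaults."""
    env = os.environ if env is None else env
    output_dir = Path(env.get("MCP_DOCGEN_OUTPUT_DIR", "out")).resolve()
    # The input directory falls back to the output directory.
    input_dir = Path(env.get("MCP_DOCGEN_INPUT_DIR", str(output_dir))).resolve()
    return input_dir, output_dir


def resolve_inside(base, user_path, suffix, must_exist=False):
    """Resolve `user_path` under `base`, rejecting escapes and wrong suffixes."""
    base = Path(base).resolve()
    candidate = Path(user_path)
    if candidate.is_absolute():
        raise ValueError(f"absolute paths are rejected: {user_path!r}")
    resolved = (base / candidate).resolve()  # collapses any ".." segments
    if not resolved.is_relative_to(base):
        raise ValueError(f"path escapes the base directory: {user_path!r}")
    if resolved.suffix.lower() != suffix:
        raise ValueError(f"expected a {suffix} file: {user_path!r}")
    if must_exist and not resolved.is_file():
        raise FileNotFoundError(str(resolved))
    return resolved


if __name__ == "__main__":
    input_dir, output_dir = base_dirs()
    print(resolve_inside(output_dir, "reports/q1.docx", ".docx"))
```

Resolving before comparing is the important step: a check on the raw string would miss paths such as `reports/../../secret.docx`.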

Examples

uv run python examples/generate_samples.py   # create report.docx / review.pptx / sales.xlsx
uv run python examples/roundtrip_demo.py      # create → read → edit → PDF round-trip

Development

uv sync
uv run pytest
uv run ruff check .

License

mcp-docgen is MIT licensed. It is powered by python-docx, python-pptx, openpyxl, XlsxWriter, reportlab and pypdf, with Markdown parsing by markdown-it-py; these dependencies are all MIT or BSD licensed.

Project details


Release history

0.2.0, Jun 10, 2026 (this version)

0.1.0, Jun 9, 2026

Download files

Source Distribution

mcp_docgen-0.2.0.tar.gz (16.6 kB), uploaded Jun 10, 2026, Source

Built Distribution

mcp_docgen-0.2.0-py3-none-any.whl (22.3 kB), uploaded Jun 10, 2026, Python 3

File details

Details for the file mcp_docgen-0.2.0.tar.gz.

File metadata

Download URL: mcp_docgen-0.2.0.tar.gz
Upload date: Jun 10, 2026
Size: 16.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.10.5 (uv publish)

File hashes

Hashes for mcp_docgen-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`fc4907d08766ca81a8aee4508388f5ff3e6cd4788b6e83cf189618907ac5a1e1`
MD5	`c86bc63bec4d5bde889d4f5d594942a9`
BLAKE2b-256	`79ae1b8ae9b08223ba1ee02fccb07a6697d77e0a12c9c100f9ef8872c5edfc92`


File details

Details for the file mcp_docgen-0.2.0-py3-none-any.whl.

File metadata

Download URL: mcp_docgen-0.2.0-py3-none-any.whl
Upload date: Jun 10, 2026
Size: 22.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.10.5 (uv publish)

File hashes

Hashes for mcp_docgen-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`cc56e5830f29d5fc735bb7ff418100f2d83157d1908efd445e2e48777ed1b74e`
MD5	`2b341c00df6fcb5e7eb41f0fcdc881c9`
BLAKE2b-256	`909d975543040e33ae04698d302d4d92a30520351e1d6bdffb4f6bed6ce7cdce`

