A Python library for working with Microsoft Word via COM automation
pymsword
Pymsword is a Python library for generating DOCX documents with simple templates.
Features
- Direct generation of DOCX documents with text and images — provides good performance and reliability.
- Jinja-like Placeholder Syntax — easy-to-use templating for text and images.
- COM Automation Support — integrate COM calls to extend document generation capabilities, at the cost of performance.
- MIT Licensed — free and open-source with a permissive license.
Template Syntax
Template syntax is inspired by Jinja, but only a small subset of features is implemented:
- Variables: {{ variable }} for inserting variables.
- Groups: {% group_name %} ... {% end group_name %} for grouping content.
Variables
Use {{variable}} to insert values.
In the data, variables are defined as keys in a dictionary.
Escaping Braces
To escape double braces, use: {{"{{"}}.
Groups
Use {% group_name %} ... {% end group_name %} to define a group of content that can be repeated.
In the data, groups are defined as lists of dictionaries.
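To make the mapping between template tags and data concrete, here is a toy renderer that works on plain strings. It is not pymsword's implementation and does not touch DOCX files; the `render` function and both regular expressions are illustrative assumptions. It also assumes that variables inside a group are resolved from that item's dictionary only, which this README does not state explicitly.

```python
import re

# Matches {% name %} ... {% end name %}; the body is captured non-greedily.
GROUP_RE = re.compile(r"\{%\s*(\w+)\s*%\}(.*?)\{%\s*end\s+\1\s*%\}", re.DOTALL)
# Matches {{ name }} with optional spaces inside the braces.
VAR_RE = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template, data):
    """Expand groups first, then substitute the remaining variables."""

    def expand_group(match):
        name, body = match.group(1), match.group(2)
        # A group is a list of dictionaries: render the body once per item.
        return "".join(render(body, item) for item in data.get(name, []))

    text = GROUP_RE.sub(expand_group, template)
    # A variable is a key in the dictionary; a missing key raises KeyError.
    return VAR_RE.sub(lambda m: str(data[m.group(1)]), text)


template = "Title: {{ title }}\n{% items %}- {{ name }}: {{ value }}\n{% end items %}"
data = {
    "title": "My Document",
    "items": [
        {"name": "Item 1", "value": 10},
        {"name": "Item 2", "value": 20},
    ],
}
result = render(template, data)
print(result)
```

The group body is emitted once for each dictionary in the `items` list, which is the repetition behavior described above.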
Extending group capture range
Due to the structure of DOCX documents, it is not possible to put text tags outside of a paragraph or table row. Consequently, simple groups cannot be used to generate table rows or lists.
To overcome this limitation, a group's range can be extended using additional markers:
- Use {% group_name p %} ... {% group_name %} for a group that captures the whole paragraph or list item.
- Use {% group_name row %} ... {% group_name %} for a group that captures the whole table row.
- Use {% group_name cell %} ... {% group_name %} for a group that captures the whole cell of a table.
Example:
from pymsword.docx_template import DocxTemplate

template = DocxTemplate("extend_groups.docx")
data = {
    "items": [
        {"text": "Item 1"},
        {"text": "Item 2"},
        {"text": "Item 3"},
    ]
}
template.generate(data, "extend_groups_result.docx")
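Group data often starts out as tabular records. The following sketch uses only the standard library to turn CSV text into the list-of-dictionaries shape that a group such as `items` expects. The helper name `rows_to_group` and the sample CSV are assumptions made for illustration and are not part of pymsword.

```python
import csv
import io


def rows_to_group(csv_text):
    """Return one dictionary per CSV row, keyed by the header names."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader]


# Hypothetical input: a single "text" column with three rows.
csv_text = "text\nItem 1\nItem 2\nItem 3\n"
data = {"items": rows_to_group(csv_text)}
print(data)
```

The resulting `data` dictionary has the same shape as the one in the example above, so it could be passed to `template.generate` in the same way.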
DOCX Document Generation
Pymsword supports two modes of document generation: direct DOCX generation and DOCX + COM automation. Direct generation is done in pure Python code. It provides the best performance and reliability of the two modes, but it can only fill templates with plain text and images.
The following example demonstrates how to use the library to generate a DOCX document from a template:
from pymsword.docx_template import DocxTemplate, DocxImageInserter
# Load the template
template = DocxTemplate("template.docx")
# Define the data for the template
data = {
    "title": "My Document",
    "content": "This is a sample document.",
    "items": [
        {"name": "Item 1", "value": 10},
        {"name": "Item 2", "value": 20},
    ],
    "image": DocxImageInserter("image.png"),
}
# Render the template with data
template.generate(data, "output.docx")
The template and the resulting document are shown below:
For a more complete example, refer to the examples directory.
DOCX + COM Automation
More advanced documents can be generated using COM automation, which gives access to the full range of Word features.
Its disadvantage is that it is significantly slower than direct DOCX generation and requires Microsoft Word to be installed on the system.
To use COM automation, install the pywin32 package and use the DocxComTemplate class.
To insert data using COM, put an inserter function in the data in place of the value. An inserter function takes a single argument, a Word.Range object, and inserts the desired data into it.
The pymsword.com_utilities module provides some useful inserters.
from pymsword.docxcom_template import DocxComTemplate
from pymsword.com_utilities import table_inserter
# Load the template
template = DocxComTemplate("template.docx")
# Define the data for the template
data = {
    "header": "My Document",
    "table": table_inserter([["Col1", "Col2"], ["Row1", "Row2"]]),
}
# Render the template with data
template.generate(data, "output.docx")
The template and the resulting document are shown below:
The complete list of inserter functions available in the pymsword.com_utilities module:
| Function | Description |
|---|---|
| `table_inserter(data:List[List[str]])` | Inserts a table with the given data. Each sublist represents a row. |
| `image_inserter(picture_path:str)` | Inserts an image from the specified path using COM. Supports more formats than the DOCX mode. |
| `document_inserter(document_path:str)` | Inserts content of another document, can be DOCX, RTF or anything supported by Word. |
| `anchor_inserter(text:str, anchor:str)` | Inserts an anchor with the given text and name. |
| `heading_inserter(text:str, level:int=1)` | Inserts a heading with the given text and level. Level 1 is the highest level. |
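Because an inserter is described as a function that receives a Word.Range, a custom one can be sketched as a small factory, in the same style as the built-in helpers. The name `bold_text_inserter` is hypothetical and not part of pymsword; `Text` and `Font.Bold` are properties of Word's Range object. A stand-in object replaces the real range here so the sketch can be tried without Word installed.

```python
from types import SimpleNamespace


def bold_text_inserter(text):
    """Return an inserter that writes `text` in bold into a Word Range."""

    def insert(word_range):
        # Replace the range's content, then make it bold.
        word_range.Text = text
        word_range.Font.Bold = True

    return insert


# Stand-in for a Word.Range; with COM the library would supply the real one.
fake_range = SimpleNamespace(Text="", Font=SimpleNamespace(Bold=False))
bold_text_inserter("Important")(fake_range)
print(fake_range.Text, fake_range.Font.Bold)
```

Assuming the behavior described above, such an inserter would be placed in the data like any other value, for example `{"warning": bold_text_inserter("Important")}`.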
COM Post-Processing
When generating documents in COM-assisted mode (DocxComTemplate), you can use post-processing to modify the document after it has been generated.
To do this, specify the postprocess argument when calling the DocxComTemplate.generate method.
The pymsword.com_utilities module provides an example post-processing function that updates the document creation date and the table of contents:
from pymsword.docxcom_template import DocxComTemplate
from pymsword.com_utilities import update_document_toc
template = DocxComTemplate("template.docx")
data = ...
template.generate(
    data,
    "output.docx",
    postprocess=update_document_toc
)
The post-processing function must take a single argument, which is the Word.Document object.
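A custom post-processing function might look like the sketch below, which refreshes the fields in the document's main text through Word's `Fields.Update` method. The function name `update_fields` is an assumption for illustration and is not part of pymsword. A stand-in document object is used so the sketch runs without Word.

```python
from types import SimpleNamespace


def update_fields(document):
    """Refresh the fields in the main text of a Word.Document."""
    document.Fields.Update()


# Stand-in for a Word.Document that records which methods were called.
calls = []
fake_document = SimpleNamespace(
    Fields=SimpleNamespace(Update=lambda: calls.append("Fields.Update"))
)
update_fields(fake_document)
print(calls)
```

Under the interface described above, it would be passed as `postprocess=update_fields` in the call to `generate`.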
Requirements
- Python 3.7 or higher
- pywin32 package for COM automation
- Microsoft Word installed (optional, for DOCX + COM mode)
- Pillow package for image handling
Installation
You can install Pymsword using pip:
pip install pymsword
License
This project is licensed under the MIT License - see the LICENSE.MIT file for details.
File details
Details for the file pymsword-0.1.0.tar.gz.
File metadata
- Download URL: pymsword-0.1.0.tar.gz
- Upload date:
- Size: 24.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 97d6e6c192489c066ea5c6cd7ebd16f0557e22faf941c49d2b6498be373b482f |
| MD5 | ed78dba3998ca869e32abe48fe6d19d0 |
| BLAKE2b-256 | 8d1d0f0ab33a131fe28f4a9a071a84e737280f785fde256ac620457085115857 |
File details
Details for the file pymsword-0.1.0-py3-none-any.whl.
File metadata
- Download URL: pymsword-0.1.0-py3-none-any.whl
- Upload date:
- Size: 22.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 080d8d2ad3ae06d24719fd92e46aad23e1823494e8479f3aed3843b62a2bf79d |
| MD5 | bc409520d3e7d75c68c4ff4f7465c73e |
| BLAKE2b-256 | a684388fec4fc4daf48808b16bd815ea6dfbbe068e522c8bcd5e9108118cbaaf |