Minimal DOCX rendering core for template, markdown, field refresh, and PDF conversion workflows
docxrender
docxrender is a small Python package for Word-first DOCX rendering.
Its core boundary is intentionally narrow:
file_template + context + markdown_body + DocxStyle -> DOCX -> PDF
The package owns technical rendering mechanics: DOCX template rendering, markdown body insertion, Word style application, DOCX field handling, and eventual LibreOffice-based PDF conversion. Product repositories own report content, workflow resource layout, section rendering, manifest schemas, figure selection, captions, and delivery directory semantics.
Capabilities
Current package surface:
- Public style/options/result dataclasses are available.
- write_docx(...) can create a minimal DOCX from a DOCX template, context, markdown body, image assets, and DocxStyle.
- Markdown support currently covers a CommonMark-ish shared subset: headings, paragraphs, hard line breaks, unordered lists, ordered lists, pipe tables, images, inline bold, inline code text cleanup, link-text reduction, page breaks, and spacers.
- Basic Word styling is applied from caller-provided DocxStyle.
- DOCX field update/freeze behavior is implemented through DOCX XML rewriting.
- write_docx(...) can optionally refresh TOC/page fields through LibreOffice UNO when DocxFieldRefreshOptions is provided.
- convert_docx_to_pdf(...) converts through LibreOffice UNO when the external LibreOffice/UNO runtime is available.
Install
    pdm install
Runtime dependencies are declared in pyproject.toml:
- docxtpl
- python-docx
PDF conversion and DOCX field refresh are optional runtime features, but both require an external LibreOffice/UNO runtime. To check that the runtime is available:
    libreoffice --headless --version
    python -c "import uno"
On Debian or Ubuntu, that runtime is typically installed outside Python:
    sudo apt install libreoffice python3-uno
Base DOCX writing with field_refresh=None does not import UNO and works
without LibreOffice.
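A caller may want to probe for the optional runtime before requesting field refresh or PDF conversion. The following is a minimal sketch using only the standard library; `find_libreoffice_runtime` is a hypothetical helper and is not part of docxrender, and the executable names it searches for are assumptions about a typical install.

```python
import importlib.util
import shutil


def find_libreoffice_runtime() -> dict[str, object]:
    """Report whether a LibreOffice executable and the `uno` module are visible."""
    # `libreoffice` and `soffice` are common executable names; either may be absent.
    exe = shutil.which("libreoffice") or shutil.which("soffice")
    # find_spec returns None when the module cannot be located; it does not import it.
    has_uno = importlib.util.find_spec("uno") is not None
    return {"exe_libreoffice": exe, "has_uno": has_uno}


status = find_libreoffice_runtime()
print(status)
```

If either value is missing, the caller can fall back to base DOCX writing with field_refresh=None.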
Public API
The stable public API is exported from the package root. Product repositories
should prefer DocxRenderer for normal use. The dataclasses and module-level
functions remain public for advanced callers that want explicit contracts,
configuration adapters, or focused tests. Implementation modules such as
docxrender.markdown and docxrender.docx are technical layers and are not
compatibility-stable public contracts.
DocxRenderer follows value semantics. Every with_* call returns a new
renderer object and leaves the original unchanged. Runtime docxtpl objects,
including inline images, are materialized only by terminal methods such as
write_docx_template(...), write_docx(...), and write_pdf().
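The value semantics described above can be pictured with a frozen dataclass. This is only an illustration of the pattern: `SketchRenderer` is a hypothetical stand-in with two made-up default values, and nothing here describes how DocxRenderer is implemented internally.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SketchRenderer:
    """Hypothetical stand-in used to illustrate value semantics."""

    font_name_latin: str = "Calibri"
    pt_body: float = 11.0

    def with_fonts(self, font_name_latin: str) -> "SketchRenderer":
        # replace() builds a new frozen instance; self is never mutated.
        return replace(self, font_name_latin=font_name_latin)

    def with_sizes(self, pt_body: float) -> "SketchRenderer":
        return replace(self, pt_body=pt_body)


base = SketchRenderer()
styled = base.with_fonts("Times New Roman").with_sizes(12.0)
print(base)
print(styled)
```

Because each call returns a new object, a shared base configuration can be reused safely across several reports.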
    from docxrender import (
        DocxRenderer,
        DocxBodyAnchorOptions,
        DocxBodyRenderPolicy,
        DocxFieldMarkerOptions,
        DocxFieldRefreshOptions,
        DocxFontStyle,
        DocxHeaderFooterImageOptions,
        DocxMarkdownOptions,
        DocxParagraphStyle,
        DocxSizeStyle,
        DocxStyle,
        DocxTableStyle,
        DocxTemplateContextPolicy,
        DocxTemplateImageSpec,
        DocxTemplateRenderOptions,
        DocxWriteOptions,
        write_docx_template,
        write_docx,
    )
DocxFieldMarkerOptions controls DOCX field update markers and field freezing
without LibreOffice or UNO:
    DocxRenderer(file_docx=Path("report.docx")).with_field_update_markers(
        should_update_fields=True,
        should_freeze_fields=False,
    ).write_docx()
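One well-known way to mark fields for update without LibreOffice is Word's `w:updateFields` element in `word/settings.xml`, which asks Word to refresh fields when the document is opened. The sketch below rewrites a settings XML string accordingly. It is an assumption about the general mechanism, not a description of docxrender's internals, and `set_update_fields` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W_NS)


def set_update_fields(settings_xml: str, should_update_fields: bool) -> str:
    """Return settings XML with a single <w:updateFields> marker set or removed."""
    root = ET.fromstring(settings_xml)
    tag = f"{{{W_NS}}}updateFields"
    # Drop any existing marker first so the result holds at most one.
    for node in root.findall(tag):
        root.remove(node)
    if should_update_fields:
        # Appended last for simplicity; a real settings.xml has an ordered
        # child sequence that this sketch does not try to respect.
        node = ET.SubElement(root, tag)
        node.set(f"{{{W_NS}}}val", "true")
    return ET.tostring(root, encoding="unicode")


minimal_settings = f'<w:settings xmlns:w="{W_NS}"/>'
marked = set_update_fields(minimal_settings, should_update_fields=True)
print(marked)
```

In a real DOCX the settings part lives inside the zip archive, so a complete implementation would also need to read and rewrite that archive member.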
DocxFieldRefreshOptions is optional. Use it only when the caller has provided
a LibreOffice/UNO runtime and wants a DOCX whose TOC, page fields, or other
Word fields have been refreshed by LibreOffice:
    DocxWriteOptions(
        ...,
        field_refresh=DocxFieldRefreshOptions(
            exe_libreoffice=Path("/usr/bin/libreoffice"),
            dir_user_profile=Path("tmp/lo-profile"),
            should_require_toc=True,
            should_freeze_fields=True,
        ),
    )
write_docx_template(...) is the generic docxtpl technical boundary. It
renders a DOCX template with caller context, optional inline images, and
optional default injections. It does not insert markdown bodies, apply DOCX
body styling, or run field/PDF post-processing:
    from pathlib import Path

    # DocxTemplateContextPolicy is used below, so it must be imported as well.
    from docxrender import (
        DocxTemplateContextPolicy,
        DocxTemplateRenderOptions,
        write_docx_template,
    )

    result = write_docx_template(
        DocxTemplateRenderOptions(
            file_template=Path("template.docx"),
            file_out_docx=Path("template-rendered.docx"),
            context={"report_title": "Example Report"},
            context_defaults={"body_anchor": "__REPORT_BODY_ANCHOR__"},
            context_policy=DocxTemplateContextPolicy(
                rule_conflict="caller_wins",
                required_keys=("report_title",),
            ),
        )
    )
    print(result.file_docx)
DocxTemplateContextPolicy controls:
- merge behavior between caller context and default injections
- conflict priority: caller_wins or defaults_win
- required keys that must exist after merge and before render
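The merge rules listed above can be sketched in plain Python. This is an illustration under stated assumptions: `merge_context` is a hypothetical function, and the choice of exception types is invented for the example; only the rule names caller_wins and defaults_win and the idea of required keys come from the text.

```python
from typing import Any, Literal


def merge_context(
    context: dict[str, Any],
    context_defaults: dict[str, Any],
    rule_conflict: Literal["caller_wins", "defaults_win"] = "caller_wins",
    required_keys: tuple[str, ...] = (),
) -> dict[str, Any]:
    """Merge caller context with defaults, then check required keys."""
    if rule_conflict == "caller_wins":
        # Later entries override earlier ones, so the caller context goes last.
        merged = {**context_defaults, **context}
    elif rule_conflict == "defaults_win":
        merged = {**context, **context_defaults}
    else:
        raise ValueError(f"unknown rule_conflict: {rule_conflict!r}")
    # Required keys are checked after the merge and before any rendering.
    missing = [key for key in required_keys if key not in merged]
    if missing:
        raise KeyError(f"missing required context keys: {missing}")
    return merged


merged = merge_context(
    context={"report_title": "Example Report", "body_anchor": "CALLER"},
    context_defaults={"body_anchor": "__REPORT_BODY_ANCHOR__"},
    rule_conflict="caller_wins",
    required_keys=("report_title",),
)
print(merged)
```

With defaults_win the same call would keep the injected body_anchor value instead of the caller's.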
Minimal DocxRenderer DOCX write example:
    from pathlib import Path

    from docxrender import DocxRenderer

    result = (
        DocxRenderer()
        .with_template(
            file_template=Path("template.docx"),
            context={"report_title": "Example Report"},
            rule_conflict="caller_wins",
            required_keys=("report_title",),
        )
        .with_fonts(
            font_name_latin="Times New Roman",
            font_name_body_east_asia="宋体",
            font_name_heading_east_asia="宋体",
        )
        .with_sizes(
            pt_title_page_title=36.0,
            pt_title_page_meta=18.0,
            pt_title_page_compiler=15.0,
            pt_body=12.0,
            pt_caption=10.5,
            pt_table=12.0,
            pt_heading_by_level={1: 16.0, 2: 14.0, 3: 12.0},
        )
        .with_table(
            border_color="000000",
            stripe_fill_color="D9D9D9",
            border_size_main="12",
            border_size_header="6",
            line_spacing=1.5,
        )
        .with_paragraph(
            line_spacing_body=1.5,
            line_spacing_note=1.2,
            first_line_indent_cm=0.74,
        )
        .with_header_footer_images(
            file_header_image=Path("header.png"),
            file_footer_image=Path("footer.png"),
            idx_section_start=1,
        )
        .with_markdown(
            should_parse_inline_bold=True,
            should_parse_inline_code=True,
            should_parse_links_as_text=True,
            should_parse_image_width_attr=True,
            default_image_width_pct=90.0,
        )
        .with_body_render_policy(
            should_number_headings=False,
            rule_ordered_list="word_style",
            rule_unordered_list="word_style",
            should_stripe_table_rows=False,
        )
        .with_body_anchor(rule_match="equals", rule_missing="raise")
        .write_docx(
            file_out_docx=Path("report.docx"),
            markdown_body="# Summary **Bold**\n\nBody text with [link](https://example.com).",
            dir_base=Path("."),
        )
    )
    print(result.file_docx)
markdown_body is the already-rendered Markdown body to insert into the DOCX
template. dir_base is the base directory used to resolve relative image paths
inside that Markdown body.
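As a rough picture of what resolving against dir_base means, the sketch below joins relative image references onto a base directory and leaves absolute ones alone. `resolve_image_path` is a hypothetical helper; how docxrender treats absolute paths or missing files is not stated here, so that part is an assumption.

```python
from pathlib import Path


def resolve_image_path(dir_base: Path, image_ref: str) -> Path:
    """Resolve a Markdown image reference against a base directory."""
    path = Path(image_ref)
    # Assumption: absolute references are used as given; relative ones are joined.
    return path if path.is_absolute() else dir_base / path


relative = resolve_image_path(Path("reports"), "figures/overview.png")
absolute = resolve_image_path(Path("reports"), str(Path.cwd() / "logo.png"))
print(relative)
print(absolute)
```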
DocxMarkdownOptions only covers the shared CommonMark-ish subset. Product
repositories should keep custom markdown dialects outside docxrender, either
through caller-side preprocessing or explicit higher-level options in their own
repo.
DocxBodyRenderPolicy controls structural rendering choices such as heading
numbering, Word-style versus plain-text lists, and striped table body rows. It
does not classify product-specific paragraphs.
DocxBodyAnchorOptions controls where the Markdown body is inserted. The search
is limited to top-level body paragraphs in the DOCX main document. equals
matches paragraph.text.strip() == anchor_token; contains matches templates
where the token is embedded in a larger paragraph. Missing anchors can either
append content or raise a template error.
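The two matching rules and the missing-anchor behavior can be sketched over a plain list of paragraph texts. The helper names are hypothetical, the `"append"` value for rule_missing is an assumed spelling (only `"raise"` appears in the examples here), and the exception type is chosen for illustration.

```python
from typing import Literal, Optional, Sequence


def find_anchor_index(
    paragraph_texts: Sequence[str],
    anchor_token: str,
    rule_match: Literal["equals", "contains"] = "equals",
) -> Optional[int]:
    """Return the index of the first paragraph matching the anchor, or None."""
    for idx, text in enumerate(paragraph_texts):
        if rule_match == "equals" and text.strip() == anchor_token:
            return idx
        if rule_match == "contains" and anchor_token in text:
            return idx
    return None


def resolve_insert_index(
    paragraph_texts: Sequence[str],
    anchor_token: str,
    rule_match: Literal["equals", "contains"] = "equals",
    rule_missing: Literal["raise", "append"] = "raise",
) -> int:
    """Return where the body would be inserted, honoring the missing-anchor rule."""
    idx = find_anchor_index(paragraph_texts, anchor_token, rule_match)
    if idx is not None:
        return idx
    if rule_missing == "append":
        # Fall back to the end of the body when no anchor paragraph exists.
        return len(paragraph_texts)
    raise LookupError(f"body anchor {anchor_token!r} not found")


texts = ["Title", "See __X__ here", "__X__"]
print(find_anchor_index(texts, "__X__", "equals"))
print(find_anchor_index(texts, "__X__", "contains"))
```

With equals, only the paragraph that consists of the token matches; with contains, the first paragraph embedding the token matches instead.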
DocxRenderer can also start from an existing DOCX and run only later
technical steps:
    from pathlib import Path

    from docxrender import DocxRenderer

    DocxRenderer(file_docx=Path("report.docx")).with_field_refresh(
        exe_libreoffice=Path("/usr/bin/libreoffice"),
        dir_user_profile=Path("tmp/lo-profile"),
        should_require_toc=True,
    ).write_docx()
Generic docxtpl inline-image binding is also supported through the same
template entrypoint:
    from pathlib import Path

    from docxrender import DocxRenderer, DocxTemplateImageSpec

    renderer = DocxRenderer().with_template(
        file_template=Path("template.docx"),
        context={"report_title": "Example Report"},
        inline_images={
            "cover_image": DocxTemplateImageSpec(
                file_image=Path("cover.png"),
                width_mm=120,
            ),
        },
    )
The same renderer can convert the current DOCX to PDF:
    from pathlib import Path

    from docxrender import DocxRenderer

    result = (
        DocxRenderer(file_docx=Path("report.docx"))
        .with_pdf_conversion(
            exe_libreoffice=Path("/usr/bin/libreoffice"),
            dir_user_profile=Path("tmp/lo-profile"),
            file_out_pdf=Path("report.pdf"),
        )
        .write_pdf()
    )
    print(result.file_pdf)
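For comparison, LibreOffice can also convert a DOCX from its command line, without UNO. The sketch below only builds such a command with an isolated user profile; it is a different technique from the UNO path that docxrender uses, and `build_convert_command` is a hypothetical helper.

```python
from pathlib import Path


def build_convert_command(
    exe_libreoffice: Path,
    dir_user_profile: Path,
    file_docx: Path,
    dir_out: Path,
) -> list[str]:
    """Build a headless LibreOffice command that converts a DOCX to PDF."""
    # as_uri() needs an absolute path, so the profile directory is resolved first.
    profile_uri = dir_user_profile.resolve().as_uri()
    return [
        str(exe_libreoffice),
        f"-env:UserInstallation={profile_uri}",
        "--headless",
        "--convert-to",
        "pdf",
        "--outdir",
        str(dir_out),
        str(file_docx),
    ]


command = build_convert_command(
    exe_libreoffice=Path("/usr/bin/libreoffice"),
    dir_user_profile=Path("tmp/lo-profile"),
    file_docx=Path("report.docx"),
    dir_out=Path("out"),
)
print(command)
```

The list can be passed to subprocess.run(command, check=True). A dedicated user profile keeps concurrent conversions from sharing LibreOffice state.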
Advanced explicit dataclass DOCX write example:
    from pathlib import Path

    from docxrender import (
        DocxFontStyle,
        DocxParagraphStyle,
        DocxSizeStyle,
        DocxStyle,
        DocxTableStyle,
        DocxWriteOptions,
        write_docx,
    )

    style = DocxStyle(
        fonts=DocxFontStyle(
            font_name_latin="Times New Roman",
            font_name_body_east_asia="宋体",
            font_name_heading_east_asia="宋体",
        ),
        sizes=DocxSizeStyle(
            pt_title_page_title=36.0,
            pt_title_page_meta=18.0,
            pt_title_page_compiler=15.0,
            pt_body=12.0,
            pt_caption=10.5,
            pt_table=12.0,
            pt_heading_by_level={1: 16.0, 2: 14.0, 3: 12.0},
        ),
        table=DocxTableStyle(
            border_color="000000",
            stripe_fill_color="D9D9D9",
            border_size_main="12",
            border_size_header="6",
            line_spacing=1.5,
        ),
        paragraph=DocxParagraphStyle(
            line_spacing_body=1.5,
            line_spacing_note=1.2,
            first_line_indent_cm=0.74,
        ),
    )

    result = write_docx(
        DocxWriteOptions(
            file_template=Path("template.docx"),
            file_out_docx=Path("report.docx"),
            context={"report_title": "Example Report"},
            markdown_body="# Summary\n\nBody text.",
            dir_base=Path("."),
            style=style,
        )
    )
    print(result.file_docx)
The template should contain a paragraph whose text is the body anchor token:
    {{ body_anchor }}
docxrender sets body_anchor in the template context when the caller does not
provide it.
Style Configuration
docxrender does not read TOML, JSON, YAML, or any other config file in its public
API. Callers convert their own configuration into DocxStyle.
The initial style model is based on:
/home/fqzhang/project/workflows/resources/common/report/style.toml
That file is a reference for fields and defaults, not a runtime dependency of the package.
Non-Goals
docxrender does not own:
- report manifest schemas
- workflow resource layout
- Jinja section discovery
- product-specific context builders
- figure registries or captions
- Result/... delivery path semantics
- 结果目录 (result directory) text generation
- style config file readers
Tests
Run the current test suite:
    pdm run python -m pytest -v
ty is available as an advisory type checker beside pyright:
    pdm run ty check .
Pyright remains the primary type gate.
The suite currently covers public API construction, minimal DOCX writing,
markdown body insertion, basic style application, and the boundary that
docxrender does not import product repositories.
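An import-boundary check of this kind can be written by scanning module source with the ast module. The sketch below is one possible approach, not the project's actual test; `find_imported_roots` and the sample source are hypothetical.

```python
import ast


def find_imported_roots(source: str) -> set[str]:
    """Collect the top-level package names imported by a module's source."""
    roots: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            roots.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 keeps absolute imports only; relative ones stay in-package.
            roots.add(node.module.split(".")[0])
    return roots


sample_source = "import docx.shared\nfrom pathlib import Path\nfrom . import markdown\n"
roots = find_imported_roots(sample_source)
print(sorted(roots))
```

A test could run this over every file in the package and assert that the result shares no names with a list of product repository packages.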