Round-trip safe HWPX reader/editor/writer for Python
jakal-hwpx
jakal-hwpx is a Python library for reading, editing, and writing both HWPX and HWP documents.
The package exposes three main layers:
- HancomDocument: a format-neutral document model for most authoring and conversion work
- HwpxDocument: direct HWPX package and XML editing
- HwpDocument: direct native HWP object and binary editing
Installation
Requires Python 3.11 or newer.
python -m pip install --upgrade pip
python -m pip install jakal-hwpx
For local development:
python -m pip install -e ".[dev]"
The package name on PyPI is jakal-hwpx; the import path is jakal_hwpx.
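Because the distribution name and the import name differ, a quick environment check can catch a mismatched install. The following is a minimal sketch using only the standard library; `check_environment` is a hypothetical helper name and is not part of jakal-hwpx.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 11)  # minimum version stated in this README


def check_environment(module_name="jakal_hwpx"):
    """Report whether the interpreter is new enough and the module can be found."""
    return {
        "python_ok": sys.version_info >= MIN_PYTHON,
        # find_spec returns None when a top-level module cannot be located
        "module_found": importlib.util.find_spec(module_name) is not None,
    }


if __name__ == "__main__":
    print(check_environment())
```

If `module_found` is false after installing, the package was probably installed into a different interpreter than the one running the check.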
Quick start
For most application code, start with HancomDocument.
from jakal_hwpx import HancomDocument
doc = HancomDocument.blank()
doc.metadata.title = "Quarterly report"
doc.append_paragraph("Sales summary")
doc.append_table(
rows=2,
cols=2,
cell_texts=[["Item", "Value"], ["Q1", "120"]],
)
doc.write_to_hwpx("build/report.hwpx")
doc.write_to_hwp("build/report.hwp")
Reading an existing document uses the same model:
from jakal_hwpx import HancomDocument
doc = HancomDocument.read_hwp("input.hwp")
doc.append_paragraph("Review complete")
doc.write_to_hwpx("build/output.hwpx")
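Both writers shown above take a target path, so application code can pick one from the file extension. The helper below is a sketch under assumptions: `save_as` is a hypothetical name, it relies only on the `write_to_hwpx` and `write_to_hwp` methods used in this README, and it creates the output directory first because it is not stated here whether the library does that itself.

```python
from pathlib import Path

# Writer method names as they appear in the examples above.
WRITERS = {".hwpx": "write_to_hwpx", ".hwp": "write_to_hwp"}


def save_as(doc, path):
    """Write `doc` with the writer that matches the extension of `path`."""
    target = Path(path)
    method_name = WRITERS.get(target.suffix.lower())
    if method_name is None:
        raise ValueError(f"unsupported extension: {target.suffix!r}")
    # Assumption: the output directory may not exist yet, so create it first.
    target.parent.mkdir(parents=True, exist_ok=True)
    getattr(doc, method_name)(str(target))
    return target
```

With a document in hand, `save_as(doc, "build/output.hwpx")` and `save_as(doc, "build/output.hwp")` would then cover both formats.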
Direct HWPX authoring
Use HwpxDocument when you want explicit control over paragraphs, runs, controls, package parts, or validation.
from jakal_hwpx import HwpxDocument
doc = HwpxDocument.blank()
doc.append_paragraph("Inline math:")
doc.append_inline("equ", "x+y", width=3200, height=1800)
doc.append_inline("text", " = z")
doc.append_block("equ", "a+b", width=2800, height=1700)
doc.strict_validate()
doc.save("build/direct.hwpx")
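HWPX files are ZIP-based packages of XML parts, so a saved file can also be inspected with the standard library alone. This is a minimal sketch, not a substitute for strict_validate(); `list_package_parts` is a hypothetical helper, and the part names a given file contains are not specified in this README.

```python
import zipfile


def list_package_parts(path):
    """Return the sorted member names of a ZIP-based package such as a .hwpx file."""
    if not zipfile.is_zipfile(path):
        raise ValueError(f"{path} is not a ZIP-based package")
    with zipfile.ZipFile(path) as package:
        return sorted(package.namelist())
```

Calling `list_package_parts("build/direct.hwpx")` after the save above gives a quick view of what was actually written.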
There are now two authoring styles for controls:
- append_block(type=..., content=..., **kwargs): insert a block-level control
- append_inline(type=..., content=..., **kwargs): reuse the target paragraph instead of creating a new one
Examples:
doc.append_block("equ", "x+y")
doc.append_inline("equ", "x+y")
doc.append_inline("text", " + z")
doc.append_block("table", [["A", "B"], ["1", "2"]])
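The table examples in this README pass cell text as a nested list, and a ragged list is easy to produce by accident. How the library treats ragged input is not stated here, so the sketch below checks the shape up front; `table_shape` is a hypothetical helper, not a jakal-hwpx function.

```python
def table_shape(cell_texts):
    """Return (rows, cols) for a rectangular list of rows; raise if ragged or empty."""
    if not cell_texts or not cell_texts[0]:
        raise ValueError("table needs at least one row and one column")
    cols = len(cell_texts[0])
    for index, row in enumerate(cell_texts):
        if len(row) != cols:
            raise ValueError(f"row {index} has {len(row)} cells, expected {cols}")
    return len(cell_texts), cols


cells = [["A", "B"], ["1", "2"]]
rows, cols = table_shape(cells)  # values that could be passed as rows= and cols=
```

The returned pair lines up with the rows and cols arguments used in the Quick start call to append_table.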
Supported type aliases include:
- text, paragraph
- eq, equ, equation
- pic, image, picture
- table, tbl
- bookmark, field, hyperlink
- note, footnote, endnote
- form, memo, chart, ole, shape
- autonum, newnum
- header, footer
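Since the type is passed as a plain string, a typo would only surface when the call is made. The sketch below checks a name against the aliases listed above before it reaches append_block or append_inline. It is an illustration only: `require_known_type` is a hypothetical helper, the set is copied from this README rather than read from the library, and the exact-match comparison is an assumption about how names are compared.

```python
import difflib

# Type names copied from the alias list in this README (not queried from the library).
KNOWN_TYPES = frozenset({
    "text", "paragraph", "eq", "equ", "equation", "pic", "image", "picture",
    "table", "tbl", "bookmark", "field", "hyperlink", "note", "footnote",
    "endnote", "form", "memo", "chart", "ole", "shape", "autonum", "newnum",
    "header", "footer",
})


def require_known_type(name):
    """Return `name` unchanged if it is a listed type, otherwise raise ValueError."""
    if name in KNOWN_TYPES:
        return name
    # Offer near matches to make typos easier to spot.
    hints = difflib.get_close_matches(name, sorted(KNOWN_TYPES), n=3)
    suffix = f" (did you mean {', '.join(hints)}?)" if hints else ""
    raise ValueError(f"unknown control type: {name!r}{suffix}")
```

A call such as `doc.append_block(require_known_type("equ"), "x+y")` would then fail early on a misspelled type.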
The older explicit APIs still exist and remain the lowest-level stable surface:
- append_equation()
- append_inline_equation()
- append_picture()
- append_shape()
- append_ole()
- append_table()
Choosing the right layer
| Layer | Use it for |
|---|---|
| HancomDocument | format-neutral authoring, conversion, block-level editing |
| HwpxDocument | direct HWPX package editing, XML placement, validation, control surgery |
| HwpDocument | direct native HWP object editing and low-level HWP inspection |
| HwpBinaryDocument | record tree, streams, DocInfo, and section model inspection |
Additional public API documentation lives in HWPX_MODULE.md.
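When the input format is not known in advance, the container type can be guessed from the first bytes of the file before choosing a layer. The sketch below relies on general format knowledge rather than on anything in jakal-hwpx: HWPX is ZIP-based and native HWP 5.x uses the OLE compound file container. `sniff_format` is a hypothetical helper, and because other ZIP and OLE files share these signatures, the result is only a container check.

```python
ZIP_MAGIC = b"PK\x03\x04"  # local file header of a ZIP archive
CFB_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")  # OLE compound file header


def sniff_format(path):
    """Guess 'hwpx', 'hwp', or 'unknown' from the first bytes of a file."""
    with open(path, "rb") as handle:
        head = handle.read(8)
    if head.startswith(ZIP_MAGIC):
        return "hwpx"
    if head == CFB_MAGIC:
        return "hwp"
    return "unknown"
```

A result of "hwp" points at HancomDocument.read_hwp or the HwpDocument layer, while "hwpx" points at the HWPX-side APIs.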
Validation and release checks
Basic test run:
python -m pytest tests/test_document_model.py tests/test_hancom_document.py -q
Packaging and release validation:
python -m build
python -m twine check dist/*
python scripts/check_release.py --profile ci
On the Windows release machine, run the full Hancom gate as well:
python scripts/check_release.py --profile release
Documentation
- HWPX_MODULE.md: public module and API index
- docs/README.md: docs directory index
- docs/hancom-document.md: HancomDocument
- docs/hwpx-document.md: HwpxDocument
- docs/hwp-document.md: HwpDocument
- docs/bridge-and-binary.md: bridge and binary internals
- RELEASING.md: release checklist
License
Released under the MIT License.