Round-trip safe HWPX reader/editor/writer for Python

jakal-hwpx

jakal-hwpx is a Python library for reading, editing, and writing both HWPX and HWP documents.

The package exposes three main layers:

  • HancomDocument: a format-neutral document model for most authoring and conversion work
  • HwpxDocument: direct HWPX package and XML editing
  • HwpDocument: direct native HWP object and binary editing

Installation

Requires Python 3.11 or newer.

python -m pip install --upgrade pip
python -m pip install jakal-hwpx

For local development:

python -m pip install -e ".[dev]"

Package name on PyPI is jakal-hwpx. Import path is jakal_hwpx.
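
Because the distribution name (jakal-hwpx, hyphen) differs from the import name (jakal_hwpx, underscore), a quick environment check can catch a mixed-up name or an old interpreter early. The following is a minimal sketch using only the standard library; the helper name check_environment is hypothetical and is not part of jakal-hwpx.

```python
import sys
from importlib import metadata, util
from typing import Optional


def check_environment(dist_name: str = "jakal-hwpx",
                      import_name: str = "jakal_hwpx") -> dict:
    """Report interpreter and install status without importing the package.

    Hypothetical helper for illustration; not provided by jakal-hwpx.
    """
    try:
        # Look up the installed distribution by its PyPI name (hyphenated).
        version: Optional[str] = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None
    return {
        # The README states that Python 3.11 or newer is required.
        "python_ok": sys.version_info >= (3, 11),
        "installed_version": version,
        # find_spec only locates a top-level module; it does not import it.
        "importable": util.find_spec(import_name) is not None,
    }
```

Calling check_environment() with no arguments checks the names given above; a result with installed_version set but importable False would suggest a broken or shadowed install.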

Quick start

For most application code, start with HancomDocument.

from jakal_hwpx import HancomDocument

doc = HancomDocument.blank()
doc.metadata.title = "Quarterly report"
doc.append_paragraph("Sales summary")
doc.append_table(
    rows=2,
    cols=2,
    cell_texts=[["Item", "Value"], ["Q1", "120"]],
)

doc.write_to_hwpx("build/report.hwpx")
doc.write_to_hwp("build/report.hwp")
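
The example above writes into build/, which may not exist yet. Whether the library creates missing parent directories is not stated here, so the sketch below creates them explicitly before calling the two writers shown above. The helper name write_both_formats is hypothetical; it assumes only that the document object has write_to_hwpx and write_to_hwp methods that accept a path.

```python
from pathlib import Path
from typing import Tuple, Union


def write_both_formats(doc, out_dir: Union[str, Path], stem: str) -> Tuple[Path, Path]:
    """Write doc as <stem>.hwpx and <stem>.hwp under out_dir.

    Hypothetical convenience wrapper; it relies only on the
    write_to_hwpx / write_to_hwp methods used in the quick start.
    """
    out = Path(out_dir)
    # Create the output directory (and any parents) if it is missing.
    out.mkdir(parents=True, exist_ok=True)

    hwpx_path = out / f"{stem}.hwpx"
    hwp_path = out / f"{stem}.hwp"

    # Paths are passed as strings, matching the quick-start calls.
    doc.write_to_hwpx(str(hwpx_path))
    doc.write_to_hwp(str(hwp_path))
    return hwpx_path, hwp_path
```

With the document from the quick start, write_both_formats(doc, "build", "report") would produce the same two files.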

Reading an existing document uses the same model:

from jakal_hwpx import HancomDocument

doc = HancomDocument.read_hwp("input.hwp")
doc.append_paragraph("Review complete")
doc.write_to_hwpx("build/output.hwpx")

Direct HWPX authoring

Use HwpxDocument when you want explicit control over paragraphs, runs, controls, package parts, or validation.

from jakal_hwpx import HwpxDocument

doc = HwpxDocument.blank()
doc.append_paragraph("Inline math:")
doc.append_inline("equ", "x+y", width=3200, height=1800)
doc.append_inline("text", " = z")
doc.append_block("equ", "a+b", width=2800, height=1700)

doc.strict_validate()
doc.save("build/direct.hwpx")

Controls can be added in two styles:

  • append_block(type=..., content=..., **kwargs): insert a block-level control
  • append_inline(type=..., content=..., **kwargs): reuse the target paragraph instead of creating a new one

Examples:

doc.append_block("equ", "x+y")
doc.append_inline("equ", "x+y")
doc.append_inline("text", " + z")
doc.append_block("table", [["A", "B"], ["1", "2"]])
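
When content comes from data rather than hand-written calls, the same two methods can be driven from a list of steps. This is a minimal sketch: apply_controls and the (placement, type, content) tuple shape are hypothetical, and the only library calls assumed are append_block and append_inline with the positional (type, content) form shown in the examples above.

```python
from typing import Any, Iterable, Tuple

# (placement, type alias, content); a hypothetical shape for this sketch.
Step = Tuple[str, str, Any]


def apply_controls(doc, steps: Iterable[Step]) -> int:
    """Replay steps against doc and return how many were applied."""
    applied = 0
    for placement, type_name, content in steps:
        if placement == "block":
            # Block-level control.
            doc.append_block(type_name, content)
        elif placement == "inline":
            # Inline control placed in the target paragraph.
            doc.append_inline(type_name, content)
        else:
            raise ValueError(f"unknown placement: {placement!r}")
        applied += 1
    return applied
```

For instance, the four example calls above correspond to the steps ("block", "equ", "x+y"), ("inline", "equ", "x+y"), ("inline", "text", " + z"), and ("block", "table", [["A", "B"], ["1", "2"]]).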

Supported type aliases include:

  • text, paragraph
  • eq, equ, equation
  • pic, image, picture
  • table, tbl
  • bookmark, field, hyperlink
  • note, footnote, endnote
  • form, memo, chart, ole, shape
  • autonum, newnum
  • header, footer
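
Application code that accepts type names from users or config files may want to reject unknown names before calling the library. The sketch below is a caller-side lookup table built from the list above. The grouping of the first four lines into synonyms, and the canonical names chosen for them, are this sketch's own reading and may not match how the library resolves aliases internally; canonical_type is a hypothetical helper.

```python
# Groups read as synonyms from the list above (an assumption of this sketch).
ALIAS_GROUPS = {
    "text": ("text", "paragraph"),
    "equation": ("eq", "equ", "equation"),
    "picture": ("pic", "image", "picture"),
    "table": ("table", "tbl"),
}

# Remaining listed names, treated here as distinct types mapping to themselves.
OTHER_TYPES = (
    "bookmark", "field", "hyperlink",
    "note", "footnote", "endnote",
    "form", "memo", "chart", "ole", "shape",
    "autonum", "newnum",
    "header", "footer",
)

_LOOKUP = {alias: canon for canon, aliases in ALIAS_GROUPS.items() for alias in aliases}
_LOOKUP.update({name: name for name in OTHER_TYPES})


def canonical_type(alias: str) -> str:
    """Map a listed alias to one name per group; raise ValueError if unlisted."""
    try:
        return _LOOKUP[alias]
    except KeyError:
        raise ValueError(f"unsupported control type: {alias!r}") from None
```

The lookup is exact and case-sensitive; whether the library itself accepts other spellings or casing is not covered here.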

The older explicit APIs still exist and remain the lowest-level stable surface:

  • append_equation()
  • append_inline_equation()
  • append_picture()
  • append_shape()
  • append_ole()
  • append_table()

Choosing the right layer

Layer               Use it for
HancomDocument      format-neutral authoring, conversion, block-level editing
HwpxDocument        direct HWPX package editing, XML placement, validation, control surgery
HwpDocument         direct native HWP object editing and low-level HWP inspection
HwpBinaryDocument   record tree, streams, DocInfo, and section model inspection

Additional public API documentation lives in HWPX_MODULE.md.

Validation and release checks

Basic test run:

python -m pytest tests/test_document_model.py tests/test_hancom_document.py -q

Packaging and release validation:

python -m build
python -m twine check dist/*
python scripts/check_release.py --profile ci

On the Windows release machine, run the full Hancom gate as well:

python scripts/check_release.py --profile release
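
The packaging commands above can be chained so that a failure stops the sequence. This is a minimal sketch, assuming it runs from the repository root with build and twine installed. The function name run_release_checks and its dist_dir and run parameters are hypothetical; the commands themselves are the ones listed above, with dist/* expanded in Python because no shell is involved.

```python
import glob
import os
import subprocess
import sys


def run_release_checks(profile: str = "ci", dist_dir: str = "dist",
                       run=subprocess.run) -> list:
    """Run build, twine check, and check_release.py in order.

    With check=True, subprocess.run raises CalledProcessError on the
    first failing step, so later steps are skipped.
    """
    py = sys.executable
    executed = []

    def step(cmd):
        run(cmd, check=True)
        executed.append(cmd)

    step([py, "-m", "build"])

    # Expand dist/* here; this must happen after the build step.
    artifacts = sorted(glob.glob(os.path.join(dist_dir, "*")))
    if not artifacts:
        raise FileNotFoundError(f"no build artifacts found in {dist_dir!r}")
    step([py, "-m", "twine", "check", *artifacts])

    step([py, "scripts/check_release.py", "--profile", profile])
    return executed
```

Calling run_release_checks() mirrors the ci sequence; run_release_checks("release") would pass the release profile to the last step, which the text above reserves for the Windows release machine.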

License

Released under the MIT License.
