Convert HWP (Hangul Word Processor) files to HWPX format
hwp2hwpx
Convert HWP files to HWPX format. The only pip-installable HWP→HWPX converter.
HWP is the legacy binary format used by Hangul (한글), the dominant word processor in South Korea. HWPX is the modern XML-based format (an OWPML ZIP archive, similar in spirit to ODF). This package converts HWP to HWPX programmatically, with no Hangul installation or GUI required.
Why?
| Tool | What it does | Limitation |
|---|---|---|
| Hangul GUI | Open HWP → Save As HWPX | Manual, not scriptable |
| HwpxConverter.exe | Bundled with Hangul, GUI only | No CLI, Windows only |
| kordoc | Parses HWP → Markdown/JSON | Extracts content, doesn't convert format |
| hwp2hwpx ← this | Converts HWP → HWPX (valid ZIP/XML) | Needs Java runtime |
If you need to read HWP content → use kordoc. If you need a real HWPX file you can open/edit in Hangul → use this.
Install
pip install hwp2hwpx
Requires Java Runtime (JRE) 8+:
# Windows
winget install EclipseAdoptium.Temurin.21.JDK
# macOS
brew install temurin
# Linux (Debian/Ubuntu)
apt install default-jre
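Because the converter depends on a Java runtime, it can help to confirm that `java` is reachable before running a batch. The following is a minimal sketch using only the Python standard library; the helper names `find_java` and `java_version_banner` are illustrative and not part of hwp2hwpx.

```python
import shutil
import subprocess


def find_java():
    """Return the path of the `java` executable on PATH, or None if absent."""
    return shutil.which("java")


def java_version_banner(java_path):
    """Return the first line printed by `java -version`, or None on failure."""
    try:
        proc = subprocess.run(
            [java_path, "-version"],
            capture_output=True,
            text=True,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    # `java -version` traditionally writes its banner to stderr, not stdout.
    output = (proc.stderr or proc.stdout).strip()
    return output.splitlines()[0] if output else None


if __name__ == "__main__":
    java = find_java()
    if java is None:
        print("No java executable found on PATH; install a JRE first.")
    else:
        print(f"Found {java}: {java_version_banner(java)}")
```

This only checks that a `java` binary exists and responds; it does not verify that the version is 8 or later.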
Usage
CLI
# Single file
hwp2hwpx document.hwp
# Multiple files
hwp2hwpx *.hwp
# Output directory
hwp2hwpx document.hwp -o output/
# Recursive folder conversion
hwp2hwpx ./documents/ -r
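If the CLI needs to be driven from another Python program, one option is to build the argument list explicitly and hand it to `subprocess`. The sketch below uses only the flags shown above (`-o` and `-r`); the helper names `build_command` and `run_cli` are hypothetical, and the meaning of the exit code is not documented here, so it is returned as-is.

```python
import subprocess


def build_command(inputs, output_dir=None, recursive=False):
    """Build an argv list for the hwp2hwpx CLI using only the flags shown above."""
    cmd = ["hwp2hwpx", *map(str, inputs)]
    if output_dir is not None:
        cmd += ["-o", str(output_dir)]
    if recursive:
        cmd.append("-r")
    return cmd


def run_cli(inputs, output_dir=None, recursive=False):
    """Run the CLI and return its exit code. Assumes hwp2hwpx is on PATH."""
    return subprocess.run(build_command(inputs, output_dir, recursive)).returncode
```

Note that an argv list bypasses the shell, so a pattern such as `*.hwp` is not expanded for you; expand it first, for example with `pathlib.Path.glob`.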
Python API
from hwp2hwpx import convert, convert_batch
# Single file
output_path = convert("document.hwp")
output_path = convert("document.hwp", "output.hwpx")
# Batch
results = convert_batch(["a.hwp", "b.hwp"], output_dir="output/")
for input_path, output_path, error in results:
    if error:
        print(f"FAIL: {input_path}: {error}")
    else:
        print(f"OK: {output_path}")
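For a whole folder, the batch call can be combined with a recursive file search and a small summary step. This is a sketch under the assumption that `convert_batch` returns `(input_path, output_path, error)` tuples as in the loop above; `find_hwp_files` and `summarize` are illustrative helpers, not part of the package.

```python
from pathlib import Path


def find_hwp_files(root):
    """Return all .hwp files under root (recursively), sorted by path."""
    return sorted(str(p) for p in Path(root).rglob("*.hwp") if p.is_file())


def summarize(results):
    """Split (input_path, output_path, error) tuples into successes and failures."""
    succeeded = [(src, dst) for src, dst, err in results if not err]
    failed = [(src, err) for src, dst, err in results if err]
    return succeeded, failed


if __name__ == "__main__":
    # Imported here so the helpers above can be used without the package.
    from hwp2hwpx import convert_batch

    files = find_hwp_files("documents")
    ok, bad = summarize(convert_batch(files, output_dir="output/"))
    print(f"{len(ok)} converted, {len(bad)} failed")
    for src, err in bad:
        print(f"FAIL: {src}: {err}")
```

On case-sensitive file systems `rglob("*.hwp")` will not match `.HWP`; add a second pattern if your archive mixes cases.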
How it works
Bundles the neolord0/hwp2hwpx Java library as a fat JAR, which builds on:
- hwplib — reads HWP binary (OLE2/CFB compound document)
- hwpxlib — writes HWPX XML (ZIP archive with OWPML structure)
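The two container types named above can be told apart from Python without any Java involved, which is handy for skipping files that are already converted. The sketch below checks for the standard 8-byte OLE2/CFB signature and falls back to `zipfile.is_zipfile`; `detect_container` is a hypothetical helper. It identifies only the container, so other CFB files (such as legacy Office documents) and other ZIP files will match too.

```python
import zipfile

# Signature that starts every OLE2/CFB compound file.
CFB_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def detect_container(path):
    """Return "cfb", "zip", or "unknown" for the file at path."""
    with open(path, "rb") as f:
        head = f.read(len(CFB_MAGIC))
    if head == CFB_MAGIC:
        return "cfb"
    if zipfile.is_zipfile(path):
        return "zip"
    return "unknown"
```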
Pure file-format conversion. No Hangul installation, no COM API, no DRM issues.
Korean file paths on Windows are handled automatically through a temp-file workaround that bypasses a JVM path-encoding issue.
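To illustrate the general idea of a temp-file workaround, here is a sketch of how such a step might look; it is not the package's actual implementation. The input is copied to an ASCII-named temporary file, the converter runs on ASCII-only names, and the result is moved to the requested destination. `convert_via_ascii_temp` and the `run_converter` callable are hypothetical names introduced for this example.

```python
import shutil
import tempfile
from pathlib import Path


def convert_via_ascii_temp(src, dst, run_converter):
    """Run run_converter(tmp_in, tmp_out) on ASCII-named temp files.

    run_converter is a hypothetical callable standing in for whatever
    actually launches the JVM; it is not part of the hwp2hwpx API.
    """
    src, dst = Path(src), Path(dst)
    with tempfile.TemporaryDirectory(prefix="hwp2hwpx_") as tmp:
        tmp_in = Path(tmp) / "input.hwp"
        tmp_out = Path(tmp) / "output.hwpx"
        shutil.copyfile(src, tmp_in)         # Korean name -> ASCII name
        run_converter(tmp_in, tmp_out)       # converter sees ASCII file names
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(tmp_out), str(dst))  # restore the requested name
    return dst
```

If the system temp directory itself contains non-ASCII characters (for example under a Korean Windows user name), this sketch would not help on its own.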
Output format
The output HWPX is a standard ZIP archive containing:
META-INF/container.xml
Contents/header.xml
Contents/section0.xml
Contents/section1.xml
...
Fully compatible with Hangul 2020+ and any OWPML-aware tool.
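Since the output is an ordinary ZIP archive, the layout above can be checked with Python's `zipfile` module. The following sketch lists section files in numeric order and reports missing entries; `list_sections` and `missing_entries` are illustrative helpers, and the set of required entries is taken from the listing above rather than from the OWPML specification.

```python
import re
import zipfile

SECTION_RE = re.compile(r"^Contents/section(\d+)\.xml$")


def list_sections(hwpx_path):
    """Return section entry names in numeric order (section2 before section10)."""
    with zipfile.ZipFile(hwpx_path) as zf:
        matches = [(SECTION_RE.match(n), n) for n in zf.namelist()]
    numbered = [(int(m.group(1)), n) for m, n in matches if m]
    return [n for _, n in sorted(numbered)]


def missing_entries(hwpx_path, required=("META-INF/container.xml", "Contents/header.xml")):
    """Return the required entry names that are absent from the archive."""
    with zipfile.ZipFile(hwpx_path) as zf:
        names = set(zf.namelist())
    return [n for n in required if n not in names]
```

A quick sanity check after conversion could be `assert not missing_entries("output.hwpx")` followed by a look at `list_sections("output.hwpx")`.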
License
Apache License 2.0
Based on Java libraries by neolord0: hwp2hwpx, hwplib, and hwpxlib.
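Korean (한국어)
A Python package that converts HWP (Hangul Word Processor) files to HWPX (OWPML) format.
Install with a single pip install hwp2hwpx and use it right away. No Hangul installation required.
Installation
pip install hwp2hwpx
Java is required: winget install EclipseAdoptium.Temurin.21.JDK
Usage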
hwp2hwpx 문서.hwp
hwp2hwpx *.hwp -o 출력폴더/
from hwp2hwpx import convert
convert("문서.hwp")
Difference from kordoc
- kordoc: reads HWP and extracts it to Markdown/JSON (text parsing)
- hwp2hwpx: converts HWP to an HWPX file (a complete document that can be opened in Hangul)
File details
Details for the file hwp2hwpx-1.0.1.tar.gz.
File metadata
- Download URL: hwp2hwpx-1.0.1.tar.gz
- Upload date:
- Size: 2.2 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e419a1d97b9547bf3c0d365adbc41696990b7fa96b5d5d8d08580875cb719a23 |
| MD5 | db0feedf13f761e28290e96588036338 |
| BLAKE2b-256 | 9980fa9aba1c796bfb01db64cf887b1fa03cc180466b813afd4ed4d18cdfc0be |
File details
Details for the file hwp2hwpx-1.0.1-py3-none-any.whl.
File metadata
- Download URL: hwp2hwpx-1.0.1-py3-none-any.whl
- Upload date:
- Size: 2.2 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e44606b65d840a33ceae42272194f1ae63665676fa01ea88bd763fede72798ea |
| MD5 | 668e7ad0a00b9c324d81294a430a2e1e |
| BLAKE2b-256 | 43b50e6ff236cc92db5a15afd9b38b34c7bf219ee7648365daab80f31e080dbd |