Document processing tool - converts HWP (and more) to Markdown

docpler

A Python library for converting HWP documents to Markdown.

HWP is the document format used by Hancom Office, the most widely used word processor in South Korea. It is common in government, legal, and academic documents.

docpler uses a high-performance Rust core to parse HWP 5.0 files and produce clean Markdown output, including tables, equations, and text boxes.

Supported Formats

Format     Read   Output
HWP 5.0    Yes    Markdown

Installation

pip install docpler

Usage

from docpler.hwp import convert

markdown = convert("document.hwp")
print(markdown)
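
The single-file call above extends naturally to a folder of documents. The following is a minimal sketch, assuming only what the usage example shows: that `convert` takes a file path and returns the Markdown as a string. The helper `convert_folder`, the `main` function, and the directory names `hwp_docs` and `markdown_out` are hypothetical and not part of docpler.

```python
from pathlib import Path
from typing import Callable, List


def convert_folder(src: Path, dst: Path, convert: Callable[[str], str]) -> List[Path]:
    """Convert every .hwp file directly inside src and write .md files into dst.

    `convert` is passed in as a parameter; with docpler installed it would be
    `docpler.hwp.convert`, assumed here to map a path string to Markdown text.
    """
    dst.mkdir(parents=True, exist_ok=True)
    written: List[Path] = []
    for hwp_path in sorted(src.glob("*.hwp")):
        markdown = convert(str(hwp_path))  # same call shape as in Usage
        out_path = dst / (hwp_path.stem + ".md")
        out_path.write_text(markdown, encoding="utf-8")
        written.append(out_path)
    return written


def main() -> None:
    # Imported here so the helper above can be exercised without docpler installed.
    from docpler.hwp import convert

    # Hypothetical folder names, chosen for illustration only.
    for path in convert_folder(Path("hwp_docs"), Path("markdown_out"), convert):
        print(path)
```

Passing the converter in as an argument keeps the file handling separate from the parsing, so the loop can be tried with a stand-in function first. How `convert` reports a malformed file is not documented here, so the sketch leaves any exception to propagate.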

MarkItDown Plugin

pip install markitdown-hwp

from markitdown import MarkItDown

md = MarkItDown(enable_plugins=True)
result = md.convert("document.hwp")
print(result.text_content)

Korean

A Python package that converts HWP (Hangul Word Processor) documents to Markdown. Its Rust core provides fast, accurate parsing.

License

Business Source License 1.1 (BSL 1.1)

  • Free to use for any purpose, including production use
  • Cannot be provided to others as a managed service
  • Converts to Apache License 2.0 on 2031-04-05
  • Rust core engine: distributed as a compiled binary; its source code is private

HWP Format Notice

This product was developed with reference to the HWP document file (.hwp) specification published by Hancom.

Download files

Source Distributions

No source distribution files are available for this release.

Built Distributions

docpler-1.0.2-cp312-cp312-win_amd64.whl (165.3 kB)

Uploaded: CPython 3.12, Windows x86-64

docpler-1.0.2-cp312-cp312-macosx_11_0_arm64.whl (226.6 kB)

Uploaded: CPython 3.12, macOS 11.0+ ARM64

docpler-1.0.2-cp312-cp312-macosx_10_12_x86_64.whl (236.9 kB)

Uploaded: CPython 3.12, macOS 10.12+ x86-64

docpler-1.0.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (244.2 kB)

Uploaded: CPython 3.8, manylinux (glibc 2.17+) x86-64
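
Because this release ships no source distribution, installation depends on one of the four wheels above matching the local interpreter. The sketch below is a rough pre-check, not a reimplementation of pip's tag matching. It assumes the `cp312-cp312` and `cp38-cp38` tags mean the wheels are specific to CPython 3.12 and CPython 3.8 respectively, and it ignores the glibc and macOS minimum versions. The names `LISTED_WHEELS`, `listed_wheel_for`, and `current_wheel` are hypothetical.

```python
import platform
import sys
from typing import Optional, Tuple

# (implementation, (major, minor), sys.platform, machine) -> wheel listed for 1.0.2
LISTED_WHEELS = {
    ("cpython", (3, 12), "win32", "amd64"):
        "docpler-1.0.2-cp312-cp312-win_amd64.whl",
    ("cpython", (3, 12), "darwin", "arm64"):
        "docpler-1.0.2-cp312-cp312-macosx_11_0_arm64.whl",
    ("cpython", (3, 12), "darwin", "x86_64"):
        "docpler-1.0.2-cp312-cp312-macosx_10_12_x86_64.whl",
    ("cpython", (3, 8), "linux", "x86_64"):
        "docpler-1.0.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
}


def listed_wheel_for(impl: str, version: Tuple[int, int],
                     system: str, machine: str) -> Optional[str]:
    """Return the listed wheel for this combination, or None if there is none."""
    return LISTED_WHEELS.get((impl.lower(), version, system, machine.lower()))


def current_wheel() -> Optional[str]:
    """Look up the running interpreter; OS and glibc versions are not checked."""
    return listed_wheel_for(
        sys.implementation.name,
        (sys.version_info[0], sys.version_info[1]),
        sys.platform,
        platform.machine(),
    )
```

A result of `None` from `current_wheel()` suggests `pip install docpler` would find no compatible file for this release on the current interpreter. pip's own resolution remains the authority.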

File details

Details for the file docpler-1.0.2-cp312-cp312-win_amd64.whl.

File metadata

  • Download URL: docpler-1.0.2-cp312-cp312-win_amd64.whl
  • Size: 165.3 kB
  • Tags: CPython 3.12, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for docpler-1.0.2-cp312-cp312-win_amd64.whl

Algorithm    Hash digest
SHA256       448923f44f84f1e82ae351a8425c3974c88e581897e36ddab856c0b1416c4d24
MD5          f55ad6b0826d120d5546de75a91f9937
BLAKE2b-256  cbcbf47d65481b5a75566a4421c7029c1b91e32abb95bff0cb490a13794e5af6

File details

Details for the file docpler-1.0.2-cp312-cp312-macosx_11_0_arm64.whl.

File hashes

Hashes for docpler-1.0.2-cp312-cp312-macosx_11_0_arm64.whl

Algorithm    Hash digest
SHA256       7f74653578df850f646dfb05bb6d17acd6cf65b472f6e664886633e67a6dc729
MD5          6fca6a9156512de689fefab98c7b8532
BLAKE2b-256  2a688f027f12717f2f9cbc6c906f3b53c91d9a1e12c57a8de953cbc81a3a5ff3

File details

Details for the file docpler-1.0.2-cp312-cp312-macosx_10_12_x86_64.whl.

File hashes

Hashes for docpler-1.0.2-cp312-cp312-macosx_10_12_x86_64.whl

Algorithm    Hash digest
SHA256       f8ce9cc067868ddd5ceca6e4389b7d5d3f1863ddcb5aeb16c61119d5abf83fae
MD5          fe2429d107032e4b86d93939582a785e
BLAKE2b-256  dc1b37f6934af794573912f5a4ec4265e87617ec5b2d4ea81116e1a24f8c4860

File details

Details for the file docpler-1.0.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for docpler-1.0.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl

Algorithm    Hash digest
SHA256       84f839bc55ea73c336590ebd752213fcaa9a45cfbd2645b1fa2a47c0fcb6b502
MD5          ecb40473c329b0d7106b95082103bd12
BLAKE2b-256  194b71da282753b3bf1b875265e4e2bc439bc40814c5605c6dc4aa824661ccfd
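
The SHA256 digests listed above can be used to check a manually downloaded wheel before installing it. A minimal sketch using only the standard library follows. The names `PUBLISHED_SHA256`, `sha256_of`, and `matches_published` are hypothetical, and the digests are copied from the listings above.

```python
import hashlib
from pathlib import Path

# SHA256 digests as listed for the docpler 1.0.2 wheels.
PUBLISHED_SHA256 = {
    "docpler-1.0.2-cp312-cp312-win_amd64.whl":
        "448923f44f84f1e82ae351a8425c3974c88e581897e36ddab856c0b1416c4d24",
    "docpler-1.0.2-cp312-cp312-macosx_11_0_arm64.whl":
        "7f74653578df850f646dfb05bb6d17acd6cf65b472f6e664886633e67a6dc729",
    "docpler-1.0.2-cp312-cp312-macosx_10_12_x86_64.whl":
        "f8ce9cc067868ddd5ceca6e4389b7d5d3f1863ddcb5aeb16c61119d5abf83fae",
    "docpler-1.0.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl":
        "84f839bc55ea73c336590ebd752213fcaa9a45cfbd2645b1fa2a47c0fcb6b502",
}


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: Path) -> bool:
    """True only if the file name is listed and its SHA256 digest matches."""
    expected = PUBLISHED_SHA256.get(path.name)
    return expected is not None and sha256_of(path) == expected
```

For example, `matches_published(Path("docpler-1.0.2-cp312-cp312-win_amd64.whl"))` returns `False` for a truncated or altered download. A file whose name is not in the table is also reported as not matching.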
