Document processing tool - converts HWP (and more) to Markdown

docpler

A Python library for converting HWP documents to Markdown.

HWP is a document format used by Hancom Office, the most widely used word processor in South Korea. It is commonly found in government, legal, and academic documents.

docpler uses a high-performance Rust core to parse HWP 5.0 files and produce clean Markdown output, including tables, equations, and text boxes.

Supported Formats

Format     Read   Output
HWP 5.0    Yes    Markdown

Installation

pip install docpler

Usage

from docpler.hwp import convert

markdown = convert("document.hwp")
print(markdown)
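
For converting a whole folder rather than a single file, a small wrapper around `convert` may be handy. The sketch below is illustrative only: `convert_tree` and its parameters are hypothetical names, not part of docpler. It assumes only what the usage above shows, namely that `convert` takes a file path and returns a Markdown string. The exception types docpler raises are not documented here, so the sketch catches `Exception` broadly and records the message.

```python
from pathlib import Path
from typing import Callable, Dict, Optional


def convert_tree(
    src_dir: str,
    out_dir: str,
    converter: Optional[Callable[[str], str]] = None,
) -> Dict[str, str]:
    """Convert every .hwp file under src_dir into a .md file under out_dir.

    Returns a mapping of source path -> error message for files that failed.
    This helper is hypothetical; it is not provided by docpler.
    """
    if converter is None:
        # Imported lazily so the helper can be exercised without docpler installed.
        from docpler.hwp import convert as converter

    src, out = Path(src_dir), Path(out_dir)
    failures: Dict[str, str] = {}

    # rglob matches the suffix case-sensitively on most Linux file systems.
    for hwp_path in sorted(src.rglob("*.hwp")):
        # Mirror the source layout, swapping the extension for .md.
        target = out / hwp_path.relative_to(src).with_suffix(".md")
        try:
            markdown = converter(str(hwp_path))
        except Exception as exc:  # assumption: error types are not documented
            failures[str(hwp_path)] = str(exc)
            continue
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(markdown, encoding="utf-8")

    return failures
```

Calling `convert_tree("hwp_in", "md_out")` uses docpler's `convert` by default; passing a different callable as `converter` makes the helper easy to test without real HWP files.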

MarkItDown Plugin

pip install markitdown-hwp

from markitdown import MarkItDown

md = MarkItDown(enable_plugins=True)
result = md.convert("document.hwp")
print(result.text_content)

Korean

A Python package that converts HWP (Hangul Word Processor) documents to Markdown. Its Rust core provides fast, accurate parsing. Installation, usage, and the markitdown plugin are the same as described above.

License

Business Source License 1.1 (BSL 1.1)

  • Free to use for any purpose, including production use
  • Cannot be provided to others as a managed service
  • Converts to Apache License 2.0 on 2031-04-05
  • Rust core engine: distributed as compiled binary, source code is private

HWP Format Notice

This product was developed with reference to the HWP document file (.hwp) specification published by Hancom.

Download files

Source Distributions

No source distribution files available for this release.

Built Distributions

docpler-1.0.0-cp312-cp312-win_amd64.whl (165.2 kB)

Tags: CPython 3.12, Windows x86-64

docpler-1.0.0-cp312-cp312-macosx_11_0_arm64.whl (226.6 kB)

Tags: CPython 3.12, macOS 11.0+ ARM64

docpler-1.0.0-cp312-cp312-macosx_10_12_x86_64.whl (236.9 kB)

Tags: CPython 3.12, macOS 10.12+ x86-64

docpler-1.0.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (244.2 kB)

Tags: CPython 3.8, manylinux: glibc 2.17+ x86-64

File details

Details for the file docpler-1.0.0-cp312-cp312-win_amd64.whl.

File metadata

  • Download URL: docpler-1.0.0-cp312-cp312-win_amd64.whl
  • Size: 165.2 kB
  • Tags: CPython 3.12, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for docpler-1.0.0-cp312-cp312-win_amd64.whl
Algorithm Hash digest
SHA256 059da51d06a7692d9149544fd8eabb6165b37a7849807902b39bcaec1034e39f
MD5 4f2b775d088b5a393c99eea1d3a190bf
BLAKE2b-256 47db32c499413ebbbfa341945e012747234aacab80a89244a3347b920d57d37f

File details

Details for the file docpler-1.0.0-cp312-cp312-macosx_11_0_arm64.whl.

File hashes

Hashes for docpler-1.0.0-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 0f03a9d23a9795d9b9db9eded8370ad6265f5a70259a3ba2cf13cb4e96ea6c07
MD5 40625f43e9db28f3b77f2d111c24d3e4
BLAKE2b-256 ae239257f18fa26a4a2b98a94c711ce4bc8537bab04b55cb0b94f130ab3ee87c

File details

Details for the file docpler-1.0.0-cp312-cp312-macosx_10_12_x86_64.whl.

File hashes

Hashes for docpler-1.0.0-cp312-cp312-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 ac2be78cbf6cb7967ca5eec9610ab0c2ff30c2b71f88be1087f41114b666e73f
MD5 030e19b2dd6eab34467558dcd17cda68
BLAKE2b-256 bb125b249906762d7e0eedf4cbfc661ec89315a4a3a028126536f5ccf9770965

File details

Details for the file docpler-1.0.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for docpler-1.0.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 8022d83f881cb38a8b1a602f90a958a8d93b84840875113fec8ce2d4b289e804
MD5 a257d3aa84dd8945802c05de3f90d218
BLAKE2b-256 cd502a9bdfcd3f54db36e9cea7afce97ca87257c00c4b40bc71fabfd981d6b89
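
Because the Rust core ships only as a compiled binary, checking a manually downloaded wheel against the published digest is a reasonable precaution. The sketch below uses only the Python standard library. The SHA256 values are copied from the listings above; the helper names `sha256_of` and `verify_wheel` are hypothetical and not part of docpler or pip.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the file hash listings above.
EXPECTED_SHA256 = {
    "docpler-1.0.0-cp312-cp312-win_amd64.whl":
        "059da51d06a7692d9149544fd8eabb6165b37a7849807902b39bcaec1034e39f",
    "docpler-1.0.0-cp312-cp312-macosx_11_0_arm64.whl":
        "0f03a9d23a9795d9b9db9eded8370ad6265f5a70259a3ba2cf13cb4e96ea6c07",
    "docpler-1.0.0-cp312-cp312-macosx_10_12_x86_64.whl":
        "ac2be78cbf6cb7967ca5eec9610ab0c2ff30c2b71f88be1087f41114b666e73f",
    "docpler-1.0.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl":
        "8022d83f881cb38a8b1a602f90a958a8d93b84840875113fec8ce2d4b289e804",
}


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_wheel(path: str) -> bool:
    """Compare a downloaded wheel against the digest listed for its file name."""
    name = Path(path).name
    if name not in EXPECTED_SHA256:
        raise KeyError(f"no published digest recorded for {name}")
    return sha256_of(path) == EXPECTED_SHA256[name]
```

`verify_wheel("docpler-1.0.0-cp312-cp312-win_amd64.whl")` returns `True` only when the local file matches the listed digest. SHA256 is used rather than MD5 because MD5 is not collision-resistant.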
