Document processing tool - converts HWP (and more) to Markdown

docpler

A Python library for converting HWP documents to Markdown.

HWP is the native document format of Hangul, the word processor in Hancom Office and the most widely used one in South Korea. It is common in government, legal, and academic documents.

docpler uses a high-performance Rust core to parse HWP 5.0 files and produce clean Markdown output, including tables, equations, and text boxes.

Supported Formats

Format     Read   Output
HWP 5.0    Yes    Markdown
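
Since only HWP 5.0 is listed, it can help to screen files before handing them to the converter. HWP 5.0 files are stored in a Compound File Binary (OLE2) container, which always begins with a fixed 8-byte signature. The sketch below checks for that signature. The helper name `looks_like_hwp5` is hypothetical and not part of docpler, and the check is necessary rather than sufficient, because legacy .doc and .xls files use the same container.

```python
# Signature that opens every Compound File Binary (OLE2) container.
CFB_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")


def looks_like_hwp5(path):
    """Return True if the file starts with the CFB signature.

    Hypothetical pre-check, not a docpler API. A True result only means
    the file is a CFB container; it does not prove the content is HWP.
    """
    with open(path, "rb") as fh:
        return fh.read(len(CFB_MAGIC)) == CFB_MAGIC
```

Files that fail this check (for example ZIP-based archives or plain text renamed to .hwp) can be skipped or reported before conversion is attempted.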

Installation

pip install docpler

Usage

from docpler.hwp import convert

markdown = convert("document.hwp")
print(markdown)
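
The same call extends naturally to a folder of documents. The sketch below walks a directory, converts each .hwp file, and writes a .md file at the mirrored location. It assumes, based only on the usage shown above, that `convert` accepts a path string and returns the Markdown as a `str`. The helper `convert_tree` is hypothetical, and the converter is passed in as a parameter so the function does not depend on docpler being importable.

```python
from pathlib import Path
from typing import Callable, List


def convert_tree(src_dir, out_dir, convert: Callable[[str], str]) -> List[Path]:
    """Convert every .hwp file under src_dir, mirroring the layout in out_dir.

    `convert` is any callable that takes a file path and returns Markdown
    text (assumed to match docpler.hwp.convert).
    """
    src_root = Path(src_dir)
    out_root = Path(out_dir)
    written = []
    for hwp_path in sorted(src_root.rglob("*.hwp")):
        # Keep the relative folder structure and swap the extension.
        target = (out_root / hwp_path.relative_to(src_root)).with_suffix(".md")
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(convert(str(hwp_path)), encoding="utf-8")
        written.append(target)
    return written
```

With docpler installed, this would be called as `convert_tree("docs", "out", convert)` after `from docpler.hwp import convert`. Passing the converter explicitly also makes it easy to wrap it with logging or error handling.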

MarkItDown Plugin

pip install markitdown-hwp

from markitdown import MarkItDown

md = MarkItDown(enable_plugins=True)
result = md.convert("document.hwp")
print(result.text_content)

Korean (한국어)

docpler is a Python package that converts HWP (Hangul Word Processor) documents to Markdown. Its Rust core provides fast, accurate parsing.

Installation, usage, and the MarkItDown plugin work exactly as described in the sections above.

License

Business Source License 1.1 (BSL 1.1)

  • Free to use for any purpose, including production use
  • Cannot be provided to others as a managed service
  • Converts to Apache License 2.0 on 2031-04-05
  • Rust core engine: distributed as compiled binary, source code is private

HWP Format Notice

This product was developed with reference to the HWP document file (.hwp) specification published by Hancom.

Source Distributions

No source distribution files are available for this release.

Built Distributions

docpler-1.0.3-cp312-cp312-win_amd64.whl (165.3 kB): CPython 3.12, Windows x86-64
docpler-1.0.3-cp312-cp312-macosx_11_0_arm64.whl (226.6 kB): CPython 3.12, macOS 11.0+ ARM64
docpler-1.0.3-cp312-cp312-macosx_10_12_x86_64.whl (236.9 kB): CPython 3.12, macOS 10.12+ x86-64
docpler-1.0.3-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (244.2 kB): CPython 3.8, manylinux (glibc 2.17+) x86-64
