Convert HWP and HWPX files to Markdown

pyhwp2md

Convert HWP (Hangul Word Processor) and HWPX files to Markdown format.

Features

  • 🔄 Convert both HWP (binary) and HWPX (XML) files
  • 📝 Extract text, paragraphs, and tables
  • 📊 Convert tables to Markdown pipe format
  • 🎯 Simple CLI
  • 🐍 Python 3.10+ support
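
To make "Markdown pipe format" concrete, here is a minimal sketch of how rows of cell text can be rendered as a pipe table. The function name `rows_to_pipe_table` is hypothetical and written only for illustration; it is not taken from pyhwp2md, whose actual output may differ in spacing and escaping.

```python
def rows_to_pipe_table(rows: list[list[str]]) -> str:
    """Render rows as a Markdown pipe table; the first row is the header."""
    if not rows:
        return ""
    width = max(len(row) for row in rows)

    def render(row: list[str]) -> str:
        # Escape literal pipes and flatten newlines so each row stays on one line.
        cells = [cell.replace("|", "\\|").replace("\n", " ") for cell in row]
        # Pad short rows so every row has the same number of columns.
        cells += [""] * (width - len(cells))
        return "| " + " | ".join(cells) + " |"

    header, *body = rows
    separator = "| " + " | ".join(["---"] * width) + " |"
    return "\n".join([render(header), separator, *(render(r) for r in body)])


print(rows_to_pipe_table([["Format", "Extension"], ["HWP", ".hwp"]]))
```

Markdown tables have no notion of merged cells, so a converter has to flatten any merged cells in the source table in some way; this sketch simply pads short rows with empty cells.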

Quick Start

Run without installation (uvx)

# Convert directly without installing
uvx pyhwp2md document.hwp

# Save to file
uvx pyhwp2md document.hwp -s

# Specify output path
uvx pyhwp2md document.hwpx -o output.md

Installation

Using pip

pip install pyhwp2md

Using uv

uv pip install pyhwp2md

From source

git clone https://github.com/pitzcarraldo/pyhwp2md.git
cd pyhwp2md
pip install -e .

Usage

Command Line

# Output to stdout (default)
pyhwp2md document.hwp

# Save to .md file in same directory
pyhwp2md document.hwp -s
pyhwp2md document.hwpx --save

# Specify output path
pyhwp2md document.hwp -o output.md

Python API

from pyhwp2md import convert

# Convert and get markdown string
markdown = convert("document.hwp")
print(markdown)

# Convert and save to file
markdown = convert("document.hwpx", output_path="output.md")

Supported Formats

| Format | Extension | Description              |
|--------|-----------|--------------------------|
| HWP    | .hwp      | Binary format (HWP 5.0+) |
| HWPX   | .hwpx     | XML-based format         |

Supported Elements

  • ✅ Paragraphs
  • ✅ Headings (H1-H6)
  • ✅ Tables
  • ✅ Lists (bulleted/numbered)
  • ✅ Line breaks
  • ⚠️ Images (coming soon)
  • ⚠️ Links (partial support)

Development

Setup

# Clone repository
git clone https://github.com/pitzcarraldo/pyhwp2md.git
cd pyhwp2md

# Install with dev dependencies
# (quotes keep shells such as zsh from expanding the brackets)
pip install -e ".[dev]"

Running Tests

# Run tests
pytest

# Run tests with coverage
pytest --cov=pyhwp2md

# Run linter
ruff check src/ tests/

# Run type checker
mypy src/

Dependencies

License

MIT License - see LICENSE file for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Acknowledgments

Links
