Convert HWP and HWPX files to Markdown

pyhwp2md

Convert HWP (Hangul Word Processor) and HWPX files to Markdown format.

Features

  • 🔄 Converts both HWP (binary) and HWPX (XML) files
  • 📝 Extracts text, paragraphs, and tables
  • 📊 Converts tables to Markdown pipe format
  • 🎯 Simple command-line interface
  • 🐍 Python 3.10+ support
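
For readers unfamiliar with the pipe format mentioned above, the following is a minimal sketch of how a grid of cell strings can be rendered as a Markdown pipe table. It is an illustration only, not pyhwp2md's actual implementation; the function name to_pipe_table and the choice to escape literal pipes and flatten in-cell line breaks are assumptions made for this example.

```python
def to_pipe_table(rows: list[list[str]]) -> str:
    """Render rows (first row is the header) as a Markdown pipe table.

    Hypothetical helper for illustration; not part of pyhwp2md.
    """
    if not rows:
        return ""
    width = max(len(row) for row in rows)

    def render(cells: list[str]) -> str:
        # Pad short rows, escape literal pipes, and flatten in-cell line breaks.
        padded = list(cells) + [""] * (width - len(cells))
        cleaned = [c.replace("|", "\\|").replace("\n", " ").strip() for c in padded]
        return "| " + " | ".join(cleaned) + " |"

    header, *body = rows
    separator = "| " + " | ".join(["---"] * width) + " |"
    return "\n".join([render(header), separator, *(render(r) for r in body)])


print(to_pipe_table([["Format", "Extension"], ["HWP", ".hwp"], ["HWPX", ".hwpx"]]))
```

The separator row after the header is what makes Markdown renderers treat the block as a table rather than as plain text.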

Quick Start

Run without installation (uvx)

# Convert directly without installing
uvx pyhwp2md document.hwp

# Save to file
uvx pyhwp2md document.hwp -s

# Specify output path
uvx pyhwp2md document.hwpx -o output.md

Installation

Using pip

pip install pyhwp2md

Using uv

uv pip install pyhwp2md

From source

git clone https://github.com/pitzcarraldo/pyhwp2md.git
cd pyhwp2md
pip install -e .

Usage

Command Line

# Output to stdout (default)
pyhwp2md document.hwp

# Save to a .md file in the same directory
pyhwp2md document.hwp -s
pyhwp2md document.hwpx --save

# Specify output path
pyhwp2md document.hwp -o output.md
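
The same commands can be driven from a Python script when the CLI is the only entry point available. The sketch below uses only the flags shown above (--save and -o); the helper names build_command and run_cli are hypothetical, and it assumes the pyhwp2md executable is on PATH.

```python
import subprocess


def build_command(source, save=False, output=None):
    """Build the argument list for one pyhwp2md invocation (hypothetical helper)."""
    command = ["pyhwp2md", str(source)]
    if save:
        command.append("--save")            # write a .md file next to the input
    if output is not None:
        command += ["-o", str(output)]      # write to an explicit path
    return command


def run_cli(source, **options):
    """Run the CLI and return whatever it printed to stdout.

    Raises subprocess.CalledProcessError if the command exits non-zero.
    """
    result = subprocess.run(
        build_command(source, **options),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

For example, run_cli("document.hwp") would return the Markdown that the first command above prints to stdout.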

Python API

from pyhwp2md import convert

# Convert and get markdown string
markdown = convert("document.hwp")
print(markdown)

# Convert and save to file
markdown = convert("document.hwpx", output_path="output.md")
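
Building on the calls above, here is a minimal sketch of batch conversion over a directory tree. It relies only on the documented behaviour that convert(path) returns a Markdown string; the helper name convert_tree is hypothetical, and the converter is passed in as an argument so the helper can be tried out without any HWP files at hand.

```python
from __future__ import annotations

from pathlib import Path
from typing import Callable


def convert_tree(root: str | Path, convert: Callable[[str], str]) -> list[Path]:
    """Convert every .hwp/.hwpx file under root, writing a .md file beside each.

    Hypothetical helper: `convert` is any callable that maps a file path to a
    Markdown string, such as pyhwp2md.convert.
    """
    written = []
    # sorted() materializes the listing first, so files written during the
    # loop are never picked up by the walk itself.
    for source in sorted(Path(root).rglob("*")):
        if not source.is_file() or source.suffix.lower() not in {".hwp", ".hwpx"}:
            continue
        target = source.with_suffix(".md")
        target.write_text(convert(str(source)), encoding="utf-8")
        written.append(target)
    return written


# Usage, assuming pyhwp2md is installed:
#   from pyhwp2md import convert
#   convert_tree("documents", convert)
```

Note that a.hwp and a.hwpx in the same directory would both map to a.md, so the later file overwrites the earlier one in this sketch.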

Supported Formats

Format  Extension  Description
HWP     .hwp       Binary format (HWP 5.0+)
HWPX    .hwpx      XML-based format
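
File extensions are not always trustworthy, so it can help to check the container type before converting. The sketch below relies on two facts about the formats: HWP 5.0 documents are stored in an OLE compound file, and HWPX documents are ZIP archives. The helper sniff_format is hypothetical and is not part of pyhwp2md.

```python
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # compound file header (HWP 5.0)
ZIP_MAGIC = b"PK\x03\x04"                      # ZIP local file header (HWPX)


def sniff_format(path):
    """Return "hwp", "hwpx", or None based on the file's leading bytes."""
    with open(path, "rb") as handle:
        head = handle.read(8)
    if head.startswith(OLE_MAGIC):
        return "hwp"
    if head.startswith(ZIP_MAGIC):
        return "hwpx"
    return None
```

This is a necessary check rather than a sufficient one: other formats share the same containers (legacy .doc files are also compound files, and .docx files are also ZIP archives), so a match only says the file could be HWP or HWPX.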

Supported Elements

  • ✅ Paragraphs
  • ✅ Headings (H1-H6)
  • ✅ Tables
  • ✅ Lists (bulleted/numbered)
  • ✅ Line breaks
  • ⚠️ Images (coming soon)
  • ⚠️ Links (partial support)

Development

Setup

# Clone repository
git clone https://github.com/pitzcarraldo/pyhwp2md.git
cd pyhwp2md

# Install with dev dependencies
pip install -e ".[dev]"

Running Tests and Checks

# Run tests
pytest

# Run tests with coverage
pytest --cov=pyhwp2md

# Run linter
ruff check src/ tests/

# Run type checker
mypy src/

License

MIT License - see LICENSE file for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request
