Convert HWP and HWPX files to Markdown

pyhwp2md

Convert HWP (Hangul Word Processor) and HWPX files to Markdown format.

Features

  • 🔄 Converts both HWP (binary) and HWPX (XML) files
  • 📝 Extracts text, paragraphs, and tables
  • 📊 Converts tables to Markdown pipe format
  • 🎯 Simple command-line interface
  • 🐍 Supports Python 3.10+
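
To make "Markdown pipe format" concrete, here is a minimal sketch of how a grid of cell strings can be rendered as a pipe table. It is an illustration only, not pyhwp2md's own implementation; the function name `to_pipe_table` and the choice to treat the first row as the header are assumptions made for this example.

```python
def to_pipe_table(rows: list[list[str]]) -> str:
    """Render rows as a Markdown pipe table (hypothetical helper).

    Assumption: the first row is used as the header row.
    """
    if not rows:
        return ""
    width = max(len(row) for row in rows)

    def fmt(cells: list[str]) -> str:
        # Pad short rows so every line has the same number of columns.
        padded = list(cells) + [""] * (width - len(cells))
        # Escape literal pipes and flatten newlines so a cell stays on one line.
        cleaned = [c.replace("|", "\\|").replace("\n", " ").strip() for c in padded]
        return "| " + " | ".join(cleaned) + " |"

    divider = "| " + " | ".join(["---"] * width) + " |"
    return "\n".join([fmt(rows[0]), divider, *(fmt(r) for r in rows[1:])])


print(to_pipe_table([["Format", "Extension"], ["HWP", ".hwp"], ["HWPX", ".hwpx"]]))
```

Escaping `|` inside cells matters because an unescaped pipe would otherwise be read as a column boundary.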

Quick Start

Run without installation (uvx)

# Convert directly without installing
uvx pyhwp2md document.hwp

# Save to a .md file in the same directory
uvx pyhwp2md document.hwp -s

# Specify output path
uvx pyhwp2md document.hwpx -o output.md

Installation

Using pip

pip install pyhwp2md

Using uv

uv pip install pyhwp2md

From source

git clone https://github.com/pitzcarraldo/pyhwp2md.git
cd pyhwp2md
pip install -e .

Usage

Command Line

# Output to stdout (default)
pyhwp2md document.hwp

# Save to .md file in same directory
pyhwp2md document.hwp -s
pyhwp2md document.hwpx --save

# Specify output path
pyhwp2md document.hwp -o output.md
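
If you would rather drive the CLI from a script, the commands above can be wrapped with `subprocess`. This is a sketch under stated assumptions: the helper names `build_command` and `convert_via_cli` are hypothetical, the `pyhwp2md` executable is assumed to be on `PATH`, and the output is assumed to be UTF-8. Only the flags shown above (`-s`/`--save` and `-o`) are used.

```python
import subprocess


def build_command(path, save=False, output=None):
    """Build the argument list for the pyhwp2md CLI (hypothetical helper)."""
    if save and output is not None:
        # The README does not say how --save and -o combine, so refuse both.
        raise ValueError("use either save=True or output=..., not both")
    cmd = ["pyhwp2md", str(path)]
    if save:
        cmd.append("--save")  # write a .md file next to the input
    if output is not None:
        cmd += ["-o", str(output)]  # write to an explicit path
    return cmd


def convert_via_cli(path):
    """Run the CLI in its default mode and return the Markdown from stdout."""
    result = subprocess.run(
        build_command(path),
        capture_output=True,
        text=True,
        encoding="utf-8",  # assumption: the CLI emits UTF-8
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.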

Python API

from pyhwp2md import convert

# Convert and get markdown string
markdown = convert("document.hwp")
print(markdown)

# Convert and save to file
markdown = convert("document.hwpx", output_path="output.md")
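
The string-returning form of `convert` lends itself to batch jobs. Below is a minimal sketch that converts every `.hwp` and `.hwpx` file in a directory. The helper `batch_convert` and its parameters are hypothetical; the converter is passed in as an argument so the function relies only on the "path in, Markdown string out" behavior shown above.

```python
from pathlib import Path
from typing import Callable

SUPPORTED_SUFFIXES = {".hwp", ".hwpx"}


def batch_convert(src_dir, out_dir, convert: Callable[[str], str]) -> list[Path]:
    """Convert every supported file in src_dir and write Markdown to out_dir.

    Hypothetical helper: `convert` is any callable that takes a file path
    and returns a Markdown string.
    """
    src, out = Path(src_dir), Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for path in sorted(src.iterdir()):
        if not path.is_file() or path.suffix.lower() not in SUPPORTED_SUFFIXES:
            continue
        # Keep the original extension in the name so report.hwp and
        # report.hwpx do not overwrite each other.
        target = out / (path.name + ".md")
        target.write_text(convert(str(path)), encoding="utf-8")
        written.append(target)
    return written
```

To use it with the library, import `convert` from `pyhwp2md` and call `batch_convert("docs", "markdown", convert)`; the directory names here are placeholders.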

Supported Formats

Format  Extension  Description
HWP     .hwp       Binary format (HWP 5.0+)
HWPX    .hwpx      XML-based format
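
File extensions are not always trustworthy, so it can help to check the container type before converting. HWP 5.0 files are stored as OLE compound files and HWPX files are ZIP packages, and each container has a well-known signature. The sketch below checks only that signature; `sniff_container` is a hypothetical helper, not part of pyhwp2md, and a matching signature does not prove the file is a valid HWP or HWPX document (other formats share these containers).

```python
from typing import Optional

OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE compound file header
ZIP_MAGIC = b"PK\x03\x04"  # ZIP local file header


def sniff_container(path) -> Optional[str]:
    """Return "ole", "zip", or None based on the first bytes of the file."""
    with open(path, "rb") as f:
        head = f.read(8)
    if head.startswith(OLE_MAGIC):
        return "ole"  # the container used by binary .hwp (HWP 5.0+)
    if head.startswith(ZIP_MAGIC):
        return "zip"  # the container used by .hwpx
    return None
```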

Supported Elements

  • ✅ Paragraphs
  • ✅ Headings (H1-H6)
  • ✅ Tables
  • ✅ Lists (bulleted/numbered)
  • ✅ Line breaks
  • ⚠️ Images (coming soon)
  • ⚠️ Links (partial support)

Development

Setup

# Clone repository
git clone https://github.com/pitzcarraldo/pyhwp2md.git
cd pyhwp2md

# Install with dev dependencies
# (quoted so that shells such as zsh do not expand the brackets)
pip install -e ".[dev]"

Running Tests and Checks

# Run tests
pytest

# Run tests with coverage
pytest --cov=pyhwp2md

# Run linter
ruff check src/ tests/

# Run type checker
mypy src/

License

MIT License - see LICENSE file for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Download files

Source Distribution

pyhwp2md-0.1.2.tar.gz (10.4 kB view details)

Uploaded Source

Built Distribution

pyhwp2md-0.1.2-py3-none-any.whl (12.9 kB view details)

Uploaded Python 3

File details

Details for the file pyhwp2md-0.1.2.tar.gz.

File metadata

  • Download URL: pyhwp2md-0.1.2.tar.gz
  • Upload date:
  • Size: 10.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for pyhwp2md-0.1.2.tar.gz
Algorithm    Hash digest
SHA256       5c38e13596ab4e9e9da7226b6e846b50bb522594c2010d96bdb6f1bf246360f3
MD5          ca507b45404c3cbecb9ed010994c9721
BLAKE2b-256  8d3072216d7a4d6c1e8dfdb4921adb2e1d656b3b5b69758f8857b8a6aba69428
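
A downloaded file can be checked against the SHA256 digest listed above with Python's standard `hashlib` module. This is a minimal sketch; the helper name `sha256_of` and the chunk size are choices made for this example.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read until f.read() returns b"" so large files never sit fully in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Compare the returned string with the published SHA256 value; any mismatch means the file differs from the one that was uploaded.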

File details

Details for the file pyhwp2md-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: pyhwp2md-0.1.2-py3-none-any.whl
  • Upload date:
  • Size: 12.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for pyhwp2md-0.1.2-py3-none-any.whl
Algorithm    Hash digest
SHA256       b9921be507bacecfad52ddd2e1c2071eedcf4e1bb5cb0d98ef38165540dbed38
MD5          d821f192d1889cab58001836be769b8f
BLAKE2b-256  3577942543eda20c62ecdf465f4ecebc0cbf7c1eada383a8ef230c0e99217dff
