hwp file format parser

pyhwp

HWP Document Format v5 parser & processor.

Features

  • Analyze and extract the internal streams of an HWP Document Format v5 file

  • (Experimental) Conversion to OpenDocument format (.odt) or plain text (.txt)
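HWP v5 documents are stored as OLE2 compound files, which is why the changelog mentions olefile and an olestorage module. As a minimal sketch (standard library only, with hypothetical helper names that are not part of pyhwp's API), a file can be checked for the OLE2 container signature before it is handed to a parser:

```python
# Hypothetical helpers for illustration; these are NOT pyhwp functions.

# The 8-byte signature at the start of every OLE2 compound file.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def looks_like_ole2(data: bytes) -> bool:
    """Return True if the given bytes start with the OLE2 signature."""
    return data[:8] == OLE2_MAGIC


def file_looks_like_ole2(path) -> bool:
    """Read only the first 8 bytes of a file and check the signature."""
    with open(path, "rb") as f:
        return looks_like_ole2(f.read(8))
```

A matching signature only shows that the file is an OLE2 container. It does not prove that the container holds a valid HWP v5 document.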

Installation

From PyPI:

virtualenv pyhwp
pyhwp/bin/pip install --pre pyhwp  # Install pyhwp into a virtualenv directory

Or:

pip install --user --pre pyhwp  # Install pyhwp into user's home directory
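After either command, the installation can be confirmed from Python. This is a small sketch using the standard library's importlib.metadata (Python 3.8+). The function name is hypothetical, and the default distribution name "pyhwp" is simply taken from the pip commands above:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name="pyhwp"):
    """Return the installed version string, or None if it is not installed.

    dist_name is the distribution name passed to pip (an assumption here).
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    print(installed_version() or "pyhwp is not installed in this environment")
```

When using the virtualenv variant, run the script with the interpreter in pyhwp/bin so that it inspects the right environment.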

Requirements

Documentation & Development

Contributors

Maintainer: mete0r

License

Copyright (C) 2010-2023 mete0r <https://github.com/mete0r>

GNU Affero General Public License v3.0 (text version)

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.

You should have received a copy of the GNU Affero General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.

Disclosure

This program has been developed in accordance with a public document named “HWP Binary Specification 1.1” published by Hancom Inc.

CHANGES

0.1b16 (unreleased)

  • [CVE-2023-0286] Depends on cryptography >= 40.0.1

  • [CVE-2022-2309] Depends on lxml >= 4.9.2

0.1b15 (2020-05-30)

  • Add the undocumented Numbering.Kind value 6, which the official specification does not describe. See #177.

0.1b14 (2020-05-17)

  • Fix xmldump_flat for Python 3.8

0.1b13 (2020-05-17)

  • Replace docopt with argparse.

  • Workaround for BinData decompression (#175, #176)

0.1b12 (2019-04-08)

  • Add Python 3.x support.

  • Add an optional dependency on colorlog for colorful logging

  • Remove dependency on hypua2jamo, resulting in no automatic conversion of Hanyang PUA to Hangul Jamo

0.1b11 (2019-03-21)

  • Remove dependency on PyCrypto. - [CVE-2013-7458], [CVE-2018-6594]

  • Add dependency on cryptography.

0.1b10 (2019-03-21)

  • Drop support for Python 2.5, 2.6.

  • Prefer ‘olefile’ to ‘OleFileIO_PL’.

  • Fix ‘Dutmal’ control attribute names.

  • hwp5html: represent path names in bytes

  • Declare some dependencies with environment markers: olefile, lxml, pycrypto

  • Update dependency on hypua2jamo >= 0.4.4

0.1b9 (2016-02-26)

  • hwp5html: several improvements
      - lang-* classes of span elements and associated CSS font-family
      - horizontal page layouts
      - single page layout
      - enhanced horizontal positioning of TableControl, GShapeObject

  • distdoc: fix sha1offset (by Hodong Kim)

0.1b8 (2014-11-03)

  • hwp5view: experimental viewer with webkitgtk+

  • hwp5proc: xml --formats (“flat”, “nested”)

  • hwp5proc: models --events (experimental)

  • hwp5proc: models --seqno --format (incompatible changes)

  • hwp5proc: find --from-stdin

  • hwp5proc: find --format

  • binmodels: GShapeObjectCaption

  • olestorage: Gsf implementation through python-gi

  • olestorage: use new olefile instead of OleFileIO_PL

0.1b7 (2014-01-31)

0.1b6 (2014-01-20)

  • binmodel: change type of TableCell dimensions to signed integer

  • hwp5odt: fix NCName for style:name (close #140)

  • hwp5proc: fix with-statement in ‘xml’ command for Python 2.5

  • hwp5proc: mark ‘xml’ command experimental

0.1b5 (2013-10-29)

  • close #134

  • hwp5html generates .xhtml instead of .html

  • hwp5proc: new ‘--no-xml-decl’ option

  • hwp5odt: fix to not use ‘/’ in resulting style names

  • hwp5proc: IdMappings.memoshape only if version > 5.0.1.6

0.1b4 (2013-07-03)

  • hwp5proc records: new option ‘--raw-header’

  • hwp5odt: new ‘--document’ option produces single ODT XML files (*.fodt)

  • hwp5odt: new ‘--styles’, ‘--content’ options produce styles/content XML files

  • ODT XSL files restructured

0.1b3 (2013-06-18)

  • Fix IdMappings (#125)

  • hwp5proc records: new option ‘--raw-payload’

  • hwp5proc xml: FlagsType as xsd:hexBinary

  • Various binary/xml models changes

0.1b2 (2013-06-08)

  • Add PyPy support

Download files

Source Distribution

pyhwp2-1.0.0.tar.gz (197.3 kB view details)

Uploaded Source

Built Distribution

pyhwp2-1.0.0-py3-none-any.whl (286.6 kB view details)

Uploaded Python 3

File details

Details for the file pyhwp2-1.0.0.tar.gz.

File metadata

  • Download URL: pyhwp2-1.0.0.tar.gz
  • Upload date:
  • Size: 197.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.13

File hashes

Hashes for pyhwp2-1.0.0.tar.gz
Algorithm Hash digest
SHA256 598757a161c40886a7cffbe99b150d40f723862f7b381dfc15ac207f55074f2e
MD5 65de972d2f5ba02b08b5502f2211d865
BLAKE2b-256 1f1a286544460d3cca1f31e5a6676d3e6976b969682676d04e785497303697c6
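A downloaded file can be checked against the published digest. The following is a minimal sketch using the standard library's hashlib. The function names and the local file path are illustrative assumptions, and the expected value would be copied from the SHA256 row above:

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's digest with a published hex digest, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_published_hash("pyhwp2-1.0.0.tar.gz", "598757a161c40886a7cffbe99b150d40f723862f7b381dfc15ac207f55074f2e")` should return True for an intact copy of the source distribution, assuming the file sits in the current directory.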

File details

Details for the file pyhwp2-1.0.0-py3-none-any.whl.

File metadata

  • Download URL: pyhwp2-1.0.0-py3-none-any.whl
  • Upload date:
  • Size: 286.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.13

File hashes

Hashes for pyhwp2-1.0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 40671d42d507566d763712945c9c72b14704af4da4e7f95730205270dc4ebe72
MD5 279e0dc286a43201e17ab99a11df1aaa
BLAKE2b-256 3bc22698629096b2b2337b4bf9c31753d0b2350aeaffe3f907d7c37d907ad380
