Htmlbook to Docx

A very opinionated htmlbook-to-docx converter whose output is formatted much like a "manuscript" submission docx.

This should work for most "run of the mill" prose documents.

Why Htmlbook?

For one thing, there are asciidoctor templates for it. Beyond that, its HTML markup is relatively clean compared to many "book-like" HTML outputs, which makes processing into docx both simpler and more reliable.

Installation and Usage

This tool can be installed via pip:

pip install htmlbook-docx

Then, to convert a given HTML file, simply run:

htmlbook2docx FILE
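For a book split across several HTML files, the command above can be run once per file. The following is a minimal sketch rather than part of the tool: it assumes htmlbook2docx is on your PATH and takes a single file argument as shown, and the helper names (build_commands, convert_all) are made up for illustration.

```python
import subprocess
from pathlib import Path


def build_commands(folder, pattern="*.html"):
    """Return one htmlbook2docx command (as an argument list) per matching file."""
    return [
        ["htmlbook2docx", str(path)]
        for path in sorted(Path(folder).glob(pattern))
    ]


def convert_all(folder):
    """Run the converter on each file in turn, stopping at the first failure."""
    for command in build_commands(folder):
        # check=True raises CalledProcessError if the converter exits non-zero.
        subprocess.run(command, check=True)
```

Calling convert_all("chapters") would convert every .html file in a directory named chapters (a placeholder name). Where the resulting docx files are written depends on the tool itself.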

Contributing

I used this as a pilot project for trying uv over poetry, so:

  1. Install uv (see the uv docs)
  2. Clone the repo
  3. Run uv sync, which will create the virtual environment and install the dependencies.
  4. Hack away.

Quirks

Some quirks to keep in mind:

  • <code> tags are rendered in small caps; this is a side effect of the project I originally wrote this tool for.
  • There are a bunch of idiosyncratic styles included in the defaults; feel free to use them, but mostly they are for the aforementioned project and can be safely ignored.
  • If you have styles (classes) on your paragraphs that aren't present in apply_manuscript_defaults, the build will fail. Someday, maybe, I'll make that pluggable.
  • Definition lists are handled, but ordered and unordered lists currently are not. If these are present, the script should log an error to the terminal rather than failing silently. The same should hold for any other "missed content."
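Because unknown paragraph classes fail the build and lists are not converted, it can help to scan a file before converting it. The sketch below is not part of the tool and uses only the standard library. KNOWN_CLASSES is a hypothetical placeholder; the real set is whatever apply_manuscript_defaults defines, so adjust it to match.

```python
from html.parser import HTMLParser

# Hypothetical class names; replace with the ones in apply_manuscript_defaults.
KNOWN_CLASSES = {"epigraph", "signature"}


class PreflightParser(HTMLParser):
    """Collect paragraph classes and list tags the converter may not handle."""

    def __init__(self, known_classes):
        super().__init__()
        self.known_classes = set(known_classes)
        self.unknown_classes = set()
        self.list_tags = []  # (tag name, line number) pairs

    def handle_starttag(self, tag, attrs):
        if tag in ("ol", "ul"):
            self.list_tags.append((tag, self.getpos()[0]))
        elif tag == "p":
            # A bare "class" attribute has the value None, hence the fallback.
            class_value = dict(attrs).get("class") or ""
            self.unknown_classes.update(
                set(class_value.split()) - self.known_classes
            )


def preflight(html, known_classes=KNOWN_CLASSES):
    """Return (sorted unknown paragraph classes, list tags found)."""
    parser = PreflightParser(known_classes)
    parser.feed(html)
    parser.close()
    return sorted(parser.unknown_classes), parser.list_tags
```

Running preflight(Path("chapter.html").read_text()) on a file (the file name is a placeholder) returns the paragraph classes that would need a style and the list tags that would be skipped, so both can be dealt with before the conversion runs.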

To Do

Things yet to do:

  • Handle more or all expected htmlbook tags
  • Better style handling for unexpected classes
  • User-provided styling for various classes (or better: create an "empty" style on-the-fly that the user can then modify inside Word or LibreOffice)
  • Actually write tests. Whoops.

