Htmlbook to Docx
A very opinionated htmlbook to docx converter, providing formatting that is similar to what a "manuscript" submission docx should look like.
This should work for most "run of the mill" prose documents.
Why Htmlbook?
For one thing, there are asciidoctor templates for it, but beyond that, compared to many "book-like" HTML outputs, it's relatively clean in terms of its HTML markup. This makes processing into docx both simpler and more reliable.
Installation and Usage
This tool can be installed via pip:
pip install htmlbook-docx
Then, to convert a given HTML file, simply run:
htmlbook2docx FILE
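For converting several files at once, the command above can be wrapped in a few lines of Python. This is only a sketch: it assumes `htmlbook2docx` is on your PATH and takes a single file argument, as shown above. The helper names `build_commands` and `convert_all` are made up for this illustration and are not part of the package.

```python
import subprocess
from pathlib import Path


def build_commands(directory: str) -> list[list[str]]:
    """Return one `htmlbook2docx FILE` command per .html file, sorted by name."""
    html_files = sorted(Path(directory).glob("*.html"))
    return [["htmlbook2docx", str(path)] for path in html_files]


def convert_all(directory: str) -> None:
    """Run each command in turn, stopping at the first failure."""
    for command in build_commands(directory):
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(command, check=True)
```

Calling `convert_all("build")` would then run the converter once per HTML file found in a (hypothetical) `build` directory.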
Contributing
I used this as a pilot project for trying uv over poetry, so:
- Install uv (links to docs)
- Clone the repo
- Run uv sync, which will create the virtual environment and get the packages going.
- Hack away.
Quirks
Some quirks to keep in mind:
- <code> tags are rendered in small caps; this is an accidental feature of the project I wrote this tool for.
- There are a bunch of idiosyncratic styles included in the defaults; feel free to use them, but mostly they are for the aforementioned project and can be safely ignored.
- If you have styles (classes) on your paragraphs that aren't present in apply_manuscript_defaults, the build will fail. Someday, maybe, I'll make that pluggable.
- While there is handling for definition lists, there is not currently handling for ordered or unordered lists. If these are present, instead of silently failing, the script should log the error to the terminal. This should be true for any "missed content."
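Since unknown paragraph classes fail the build and ordered or unordered lists are not handled, it can help to scan a file before converting it. The sketch below uses only the standard library and does not import this package. The `known_classes` set is something you would have to fill in yourself from the styles in apply_manuscript_defaults, and the class names in the usage example are placeholders, not the tool's real style names.

```python
from html.parser import HTMLParser


class PreflightParser(HTMLParser):
    """Collect paragraph classes and list tags from an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.paragraph_classes: set[str] = set()
        self.list_tags: set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            # A class attribute may be missing or empty; treat both as no classes.
            class_value = dict(attrs).get("class") or ""
            self.paragraph_classes.update(class_value.split())
        elif tag in ("ol", "ul"):
            self.list_tags.add(tag)


def preflight(html: str, known_classes: set[str]) -> dict[str, list[str]]:
    """Report paragraph classes outside known_classes and any ol/ul tags."""
    parser = PreflightParser()
    parser.feed(html)
    parser.close()
    return {
        "unknown_classes": sorted(parser.paragraph_classes - known_classes),
        "unsupported_lists": sorted(parser.list_tags),
    }
```

For example, `preflight('<p class="epigraph">Hi</p><p class="mystery">?</p><ul><li>x</li></ul>', {"epigraph"})` reports `mystery` as an unknown class and `ul` as an unsupported list.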
To Do
Things yet to do:
- Handle more or all expected htmlbook tags
- Better style handling for unexpected classes
- User-provided styling for various classes (or better: create an "empty" style on-the-fly that the user can then modify inside Word or LibreOffice)
- Actually write tests. Whoops.