A fork of https://github.com/pqzx/html2docx that converts HTML (including MathML) to docx
Project description
mathml2docx
The mathml2docx fork converts HTML (including MathML) to docx.
Dependencies: python-docx & bs4
To install
pip install mathml2docx
Improvements
- Add: Convert MathML to docx formulas
- Add: MathmlToDocx, a wrapper class for IntelliSense and encapsulation
- Add: Etc...
- Fix: Known errors in the original repository
Usage
Add strings of html to an existing docx.Document object
from docx import Document
from mathml2docx import MathmlToDocx
document = Document()
new_parser = MathmlToDocx()
# do stuff to document
html = '<h1>Hello world</h1>'
new_parser.add_html_to_document(html, document)
# do more stuff to document
document.save('your_file_name')
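Since the main addition of this package is MathML conversion, the HTML string passed to add_html_to_document can presumably contain a <math> element. The sketch below builds such a string and checks that it is well-formed using only the standard library. FORMULA_HTML, count_math_elements and add_formula are illustrative names, not part of mathml2docx, and which MathML elements the converter supports is not stated here.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Illustrative input: a paragraph containing the formula x = a^2 + b.
FORMULA_HTML = (
    "<p>Result: "
    f'<math xmlns="{MATHML_NS}">'
    "<mi>x</mi><mo>=</mo>"
    "<msup><mi>a</mi><mn>2</mn></msup>"
    "<mo>+</mo><mi>b</mi>"
    "</math></p>"
)


def count_math_elements(html: str) -> int:
    """Parse the fragment as XML and count its <math> elements.

    Raises xml.etree.ElementTree.ParseError if the markup is not well-formed.
    """
    root = ET.fromstring(html)
    return len(root.findall(f".//{{{MATHML_NS}}}math"))


def add_formula(parser, document) -> None:
    """Append FORMULA_HTML to `document`.

    `parser` is expected to be a MathmlToDocx instance and `document` a
    docx.Document, as in the usage example above.
    """
    parser.add_html_to_document(FORMULA_HTML, document)
```

Checking the markup first can make it easier to tell a malformed input apart from an element the converter does not handle.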
Convert files directly
from mathml2docx import MathmlToDocx
new_parser = MathmlToDocx()
new_parser.parse_html_file(input_html_file_path, output_docx_file_path)
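To convert several files, the call above can be wrapped in a loop. This is a minimal sketch: convert_directory and its suffix parameter are hypothetical helpers, not part of mathml2docx, and the converter is passed in as a callable so the loop does not depend on how the library handles output paths.

```python
from pathlib import Path
from typing import Callable, List


def convert_directory(
    src_dir: str,
    dst_dir: str,
    convert: Callable[[str, str], None],
    suffix: str = ".docx",
) -> List[Path]:
    """Call `convert(input_path, output_path)` for every .html file in src_dir.

    Returns the output paths that were passed to `convert`, in sorted order.
    """
    out_root = Path(dst_dir)
    out_root.mkdir(parents=True, exist_ok=True)
    outputs = []
    for html_path in sorted(Path(src_dir).glob("*.html")):
        # Assumption: the converter writes to exactly the path it is given.
        out_path = out_root / (html_path.stem + suffix)
        convert(str(html_path), str(out_path))
        outputs.append(out_path)
    return outputs
```

With the library installed, one possible call is convert_directory('pages', 'out', lambda src, dst: MathmlToDocx().parse_html_file(src, dst)), which creates a fresh parser per file. If parse_html_file turns out to append the .docx extension itself, pass suffix=''.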
Convert files from a string
from mathml2docx import MathmlToDocx
new_parser = MathmlToDocx()
docx = new_parser.parse_html_string(input_html_file_string)
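The snippet above does not show what to do with the returned value. Assuming parse_html_string returns a python-docx Document (not confirmed here), it can be written to disk with save. The helper name html_string_to_file is hypothetical.

```python
def html_string_to_file(parser, html: str, out_path: str) -> str:
    """Convert `html` with `parser.parse_html_string` and save the result.

    Assumes the returned object has a python-docx style `save(path)` method;
    raises TypeError if it does not.
    """
    result = parser.parse_html_string(html)
    if not hasattr(result, "save"):
        raise TypeError("parse_html_string did not return a saveable document")
    result.save(out_path)
    return out_path
```

Here parser would be a MathmlToDocx instance, for example html_string_to_file(MathmlToDocx(), '<h1>Hello world</h1>', 'hello.docx').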
Change table styles
Tables are not styled by default. Use the table_style attribute on the parser to set a table style. The style is used for all tables.
from mathml2docx import MathmlToDocx
new_parser = MathmlToDocx()
new_parser.table_style = 'Light Shading Accent 4'
To add borders to tables, use the TableGrid style:
new_parser.table_style = 'TableGrid'
Default table styles can be found here: https://python-docx.readthedocs.io/en/latest/user/styles-understanding.html#table-styles-in-default-template
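To try a table style, some HTML containing a table is needed. The sketch below builds one from Python data using only the standard library. rows_to_html_table is a hypothetical helper, and it assumes the parser accepts plain <table>, <tr>, <th> and <td> markup.

```python
from html import escape
from typing import Iterable, Sequence


def rows_to_html_table(header: Sequence[str], rows: Iterable[Sequence[object]]) -> str:
    """Render a header and data rows as a minimal HTML table string.

    Cell values are converted with str() and HTML-escaped.
    """
    head = "".join(f"<th>{escape(str(cell))}</th>" for cell in header)
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(cell))}</td>" for cell in row) + "</tr>"
        for row in rows
    )
    return f"<table><tr>{head}</tr>{body}</table>"
```

The resulting string can be passed to add_html_to_document after setting table_style on the parser, so the chosen style applies to the generated table.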
Change default paragraph style
No style is applied to paragraphs by default. Use the paragraph_style attribute on the parser to set a default paragraph style. The style is used for all paragraphs. If additional styling (color, background color, alignment...) is defined in the HTML, it is applied after the paragraph style.
from mathml2docx import MathmlToDocx
new_parser = MathmlToDocx()
new_parser.paragraph_style = 'Quote'
Default paragraph styles can be found here: https://python-docx.readthedocs.io/en/latest/user/styles-understanding.html#paragraph-styles-in-default-template
Hashes for mathml2docx-0.1.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 7614ed9958369f79d1137512def287b6b206956416c93c793852c6925e9318b1
MD5 | 73dc0acd57f2afced5acb993c8ed8933
BLAKE2b-256 | 1a68c94da091a35ade9bc0c1164ae9ce35156d0278427fc0486e066a8bdc174e