Parse a Microsoft Word document into a hierarchical tree structure.
Project description
mswordtree
Parse an entire Word document into a hierarchical tree structure. The document content is listed as headings, with each heading's children being the subheadings, paragraphs, tables, and other objects beneath it.
Install the library using the following command:
pip install mswordtree
Use the following code to parse your Word document into a tree structure:
from mswordtree import GetWordDocTree
root = GetWordDocTree('test.docx')
You can now iterate over the objects of the document with the following code:
for item in root.Items:
    print('Type: {} -> Content {}\n'.format(item.Type, item.Content))
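If the objects in root.Items carry their own Items list in the same way root does (an assumption; the source only shows root.Items), a recursive walk could print the tree with indentation that reflects nesting. The Node class below is a hypothetical stand-in so the sketch runs without a .docx file; it is not part of mswordtree, and only the attribute names Type, Content and Items are taken from the examples above.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Node:
    """Hypothetical stand-in for a tree object; not part of mswordtree."""
    Type: str
    Content: str
    Items: List["Node"] = field(default_factory=list)


def walk(node, depth=0) -> Iterator[Tuple[int, object]]:
    """Yield (depth, item) for every descendant of node, depth-first."""
    # getattr guards against objects that have no Items attribute (or None).
    for item in getattr(node, "Items", None) or []:
        yield depth, item
        yield from walk(item, depth + 1)


# A small hand-built tree standing in for GetWordDocTree('test.docx').
root = Node("Root", "", [
    Node("Heading", "Introduction", [
        Node("Paragraph", "First paragraph."),
        Node("Heading", "Background", [Node("Table", "2 x 2 table")]),
    ]),
    Node("Heading", "Conclusion"),
])

for depth, item in walk(root):
    print("{}Type: {} -> Content: {}".format("  " * depth, item.Type, item.Content))
```

Because walk only relies on attribute access, the same function should work on the object returned by GetWordDocTree, provided the nested objects really expose Items.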
To convert the tree to JSON, use the following code:
from mswordtree import ToString
ToString([root])
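The source does not show what ToString returns. Assuming it returns a JSON string, the result could be saved to a file and read back with the standard json module. In the sketch below, json_text is a hand-written stand-in for that return value, and its shape (Type, Content, Items keys) is an assumption, not the library's documented output.

```python
import json
import os
import tempfile

# Stand-in for the value of ToString([root]); the real output format
# is not shown in the source, so this shape is an assumption.
json_text = json.dumps([
    {
        "Type": "Heading",
        "Content": "Introduction",
        "Items": [
            {"Type": "Paragraph", "Content": "First paragraph.", "Items": []},
        ],
    }
])


def save_and_reload(text, path):
    """Write a JSON string to path, then parse it back into Python objects."""
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(text)
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


# Use a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    data = save_and_reload(json_text, os.path.join(tmp, "tree.json"))

print(data[0]["Content"])
```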
Download files
Source Distribution
mswordtree-0.1.1.3.tar.gz (8.5 kB)
Built Distribution
Hashes for mswordtree-0.1.1.3-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 3c294cfc5c41d116237f74c2e93452677dd15d5ded2e9a501b4c9a71964fcbc7
MD5 | 795471cc86ff74ee2fefd1ee83f14346
BLAKE2b-256 | c19fa93283bea06a9c10226071f8b1b80a94b6428bddc9e9cf56388a70f0b800