Parse all contents of a docx file with python-docx
Project description
Installation
python3 -m pip install docx-parser
Features:
- paragraph: text paragraph, with style_id
- multipart: paragraph with image or hyperlink
- table: table data with merged_cells
Examples
- CMD
docx_parser --help
# parse image as file
docx_parser tests/demo.docx -D tests/media -o tests/out.file.jl
# parse image as base64 string
docx_parser tests/demo.docx -A base64 -o tests/out.base64.jl
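The `.jl` extension on the output files suggests JSON Lines, that is, one JSON value per line. Assuming that is the case, the output can be read back with the standard library alone. The sketch below is not part of docx_parser: `read_json_lines` and `decode_base64_image` are hypothetical helper names, and the layout of each record (including which field carries a base64 image string) is not documented here, so the helpers make no assumption about it.

```python
import base64
import json
from pathlib import Path


def read_json_lines(path):
    """Yield one parsed JSON value per non-empty line of the file.

    Assumes the file is JSON Lines, as the .jl extension suggests.
    """
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)


def decode_base64_image(encoded):
    """Turn a base64 string (as produced with -A base64) back into raw bytes."""
    return base64.b64decode(encoded)
```

For example, `for record in read_json_lines('tests/out.base64.jl'): print(record)` would print each parsed record, and `decode_base64_image` could then be applied to whichever field holds the image string.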
- Python
from docx_parser import DocumentParser
infile = 'tests/demo.docx'
doc = DocumentParser(infile)
for _type, item in doc.parse():
    print(_type, item)
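Since `doc.parse()` yields `(type, item)` pairs, one simple way to work with the result is to group the items by type. The following is a minimal sketch under assumptions: `group_by_type` is a hypothetical helper, and the dictionaries in `sample` are stand-in data whose shape is invented for illustration, not the real item structure returned by docx_parser.

```python
from collections import defaultdict


def group_by_type(pairs):
    """Group (type, item) pairs into a dict mapping type to a list of items."""
    grouped = defaultdict(list)
    for item_type, item in pairs:
        grouped[item_type].append(item)
    return dict(grouped)


# Stand-in data with an invented item shape; with the real package one would
# pass DocumentParser(infile).parse() instead.
sample = [
    ("paragraph", {"text": "Hello"}),
    ("table", {"data": [["a", "b"]]}),
    ("paragraph", {"text": "World"}),
]

grouped = group_by_type(sample)
counts = {item_type: len(items) for item_type, items in grouped.items()}
print(counts)
```

Grouping keeps the original order within each type, which is usually what is wanted when the paragraphs are later joined back into text.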
ToDo
- parse text style: color, bgcolor, font, bold, italic ...
- parse paragraph format
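Until text style parsing is available, run-level formatting can be read from python-docx directly. The sketch below is not part of docx_parser and does not reflect how the package will implement this: `run_style` and `iter_run_styles` are hypothetical names, and only the python-docx attributes `run.text`, `run.bold`, `run.italic`, `run.font.name` and `run.font.color.rgb` are used.

```python
from types import SimpleNamespace


def run_style(run):
    """Collect basic character formatting from a python-docx Run-like object.

    bold and italic are tri-state in python-docx: True, False, or None
    (None means the value is inherited from the style).
    """
    font = run.font
    color = font.color.rgb if font.color is not None else None
    return {
        "text": run.text,
        "bold": run.bold,
        "italic": run.italic,
        "font": font.name,
        "color": str(color) if color is not None else None,
    }


def iter_run_styles(path):
    """Yield a style dict for every run in the document body paragraphs."""
    from docx import Document  # python-docx, imported lazily

    for paragraph in Document(path).paragraphs:
        for run in paragraph.runs:
            yield run_style(run)


# Stand-in object mimicking the attributes of a python-docx run.
fake_run = SimpleNamespace(
    text="Hi",
    bold=True,
    italic=None,
    font=SimpleNamespace(name="Arial", color=SimpleNamespace(rgb="FF0000")),
)
style = run_style(fake_run)
```

With python-docx installed, `list(iter_run_styles('tests/demo.docx'))` would return one dictionary per run. Note that this only walks body paragraphs, so text inside tables is not covered.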
Project details
Download files
Source Distribution
docx_parser-1.0.3.tar.gz (5.2 kB)
Built Distribution
docx_parser-1.0.3-py3-none-any.whl (5.9 kB)
File details
Details for the file docx_parser-1.0.3.tar.gz.
File metadata
- Download URL: docx_parser-1.0.3.tar.gz
- Upload date:
- Size: 5.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.8
File hashes
Algorithm | Hash digest
---|---
SHA256 | e56648fa8fff5be35be20b7752bcf783d9f7698c187172f0a3dd1fa880d8ad8a
MD5 | dd2c2e5af94c4ed2ca40581284633261
BLAKE2b-256 | 7773e541c94b31321d103d231068fb550930e70a07e500449233462c403373ac
File details
Details for the file docx_parser-1.0.3-py3-none-any.whl.
File metadata
- Download URL: docx_parser-1.0.3-py3-none-any.whl
- Upload date:
- Size: 5.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.8
File hashes
Algorithm | Hash digest
---|---
SHA256 | b64d4aa7b3d649036118f6843fbc4f946a687cd6df083d0f7348c5448e81a956
MD5 | 3073920bbfb38f57d4db24c97697585b
BLAKE2b-256 | 326bb342d3ae152b9ddc7d52e314dca4534656293f1bc8d56758b51c10d3a0f0