Parse all contents of a docx file with python-docx

Installation

python3 -m pip install docx-parser

Features

  • paragraph: text paragraph, with style_id
  • multipart: paragraph with image or hyperlink
  • table: table data with merged_cells
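
As a rough sketch of how these three item kinds might be told apart in client code, the following groups a stream of (type, item) pairs by type. The type names mirror the feature list above, but the exact strings the parser emits and the shape of each item are assumptions here; `group_by_type` is a hypothetical helper and the sample data is made up for illustration.

```python
from collections import defaultdict


def group_by_type(pairs):
    """Group (type, item) pairs into a dict mapping type -> list of items."""
    groups = defaultdict(list)
    for item_type, item in pairs:
        groups[item_type].append(item)
    return dict(groups)


# Hypothetical sample data; the real item layout may differ.
sample = [
    ("paragraph", {"text": "Hello", "style_id": "Normal"}),
    ("table", {"data": [["a", "b"]], "merged_cells": {}}),
    ("paragraph", {"text": "World", "style_id": "Heading1"}),
]

groups = group_by_type(sample)
for item_type, items in groups.items():
    print(item_type, len(items))
```

Because the helper accepts any iterable of pairs, it does not depend on how the items themselves are structured.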

Examples

  • CMD
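# show command-line usage
docx_parser --help

# parse the document, saving images as files (here under tests/media)
docx_parser tests/demo.docx -D tests/media -o tests/out.file.jl

# parse the document, embedding images as base64 strings
docx_parser tests/demo.docx -A base64 -o tests/out.base64.jl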
  • Python
from docx_parser import DocumentParser

infile = 'tests/demo.docx'
doc = DocumentParser(infile)
# parse() yields (type, item) pairs, one per parsed element
for _type, item in doc.parse():
    print(_type, item)
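
The command-line examples above write their output to `.jl` files. The extension suggests JSON Lines (one JSON value per line), though the description does not confirm this, so the reader below is only a sketch under that assumption. `read_json_lines` is a hypothetical helper, not part of docx-parser.

```python
import json


def read_json_lines(path):
    """Yield one decoded JSON value per non-empty line of a file."""
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)


# Example usage, assuming the CLI run above produced this file:
# for record in read_json_lines("tests/out.file.jl"):
#     print(record)
```

Reading lazily with a generator keeps memory use flat even when the output embeds large base64 image strings.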

ToDo

  • parse text style: color, bgcolor, font, bold, italic ...
  • parse paragraph format

Download files

Source Distribution

docx_parser-1.0.3.tar.gz (5.2 kB)

Uploaded Source

Built Distribution

docx_parser-1.0.3-py3-none-any.whl (5.9 kB)

Uploaded Python 3

File details

Details for the file docx_parser-1.0.3.tar.gz.

File metadata

  • Download URL: docx_parser-1.0.3.tar.gz
  • Size: 5.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.8

File hashes

Hashes for docx_parser-1.0.3.tar.gz
Algorithm     Hash digest
SHA256        e56648fa8fff5be35be20b7752bcf783d9f7698c187172f0a3dd1fa880d8ad8a
MD5           dd2c2e5af94c4ed2ca40581284633261
BLAKE2b-256   7773e541c94b31321d103d231068fb550930e70a07e500449233462c403373ac

File details

Details for the file docx_parser-1.0.3-py3-none-any.whl.

File metadata

  • Download URL: docx_parser-1.0.3-py3-none-any.whl
  • Size: 5.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.8

File hashes

Hashes for docx_parser-1.0.3-py3-none-any.whl
Algorithm     Hash digest
SHA256        b64d4aa7b3d649036118f6843fbc4f946a687cd6df083d0f7348c5448e81a956
MD5           3073920bbfb38f57d4db24c97697585b
BLAKE2b-256   326bb342d3ae152b9ddc7d52e314dca4534656293f1bc8d56758b51c10d3a0f0
