Parse all contents of a docx file with python-docx

Installation

python3 -m pip install docx-parser

Features

  • paragraph: a text paragraph, with its style_id
  • multipart: a paragraph that contains an image or hyperlink
  • table: table data, including merged_cells
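
The three item types listed above can be tallied with a small helper. This is a minimal sketch, not part of docx-parser: it assumes the type labels are the strings "paragraph", "multipart" and "table", and it treats each item as opaque. The function name count_item_types and the sample data are hypothetical.

```python
from collections import Counter
from typing import Any, Iterable, Tuple


def count_item_types(pairs: Iterable[Tuple[str, Any]]) -> Counter:
    """Count how many items of each type appear in (type, item) pairs."""
    return Counter(_type for _type, _item in pairs)


# Hypothetical stand-in for parser output; real items may be shaped differently.
sample = [
    ("paragraph", {"text": "Hello"}),
    ("table", {"data": [["a", "b"]]}),
    ("paragraph", {"text": "World"}),
]

counts = count_item_types(sample)
print(counts["paragraph"], counts["table"], counts["multipart"])
```

Any iterable of (type, item) pairs can be passed in place of sample, since the helper never inspects the items themselves.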

Examples

  • CMD
docx_parser --help

# save images as files under tests/media
docx_parser tests/demo.docx -D tests/media -o tests/out.file.jl

# output images as base64 strings
docx_parser tests/demo.docx -A base64 -o tests/out.base64.jl
  • Python
from docx_parser import DocumentParser

infile = 'tests/demo.docx'
doc = DocumentParser(infile)
# parse() yields (type, item) pairs; print each one
for _type, item in doc.parse():
    print(_type, item)
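
The CMD examples write their output to .jl files. Assuming that extension means JSON Lines (one JSON value per line), which the project description does not confirm, the output could be read back with the standard library alone. The helper name read_json_lines is hypothetical, and no assumption is made about the shape of each record.

```python
import json
from pathlib import Path
from typing import Any, Iterator


def read_json_lines(path: str) -> Iterator[Any]:
    """Yield one decoded JSON value per non-empty line of the file."""
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)


# Example call, assuming the CMD example above has already been run:
# for record in read_json_lines("tests/out.file.jl"):
#     print(record)
```

Reading lazily with a generator keeps memory use flat even when the output carries large base64 image strings.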

ToDo

  • parse text styles: color, background color, font, bold, italic, ...
  • parse paragraph formatting
