Native Python bindings for OfficeMD document extraction

officemd

Fast Office document extraction for LLMs and agents. Converts DOCX, XLSX, CSV, PPTX, and PDF into clean Markdown and structured JSON IR; DOCX, XLSX, and PPTX can also be exported as Docling JSON.

Install

uv add officemd
# or
pip install officemd

To run the CLI without adding officemd to a project:

uvx officemd markdown report.docx

CLI

officemd markdown report.docx
officemd markdown budget.xlsx --sheets "Summary,Q1"
officemd render report.docx
officemd diff old.docx new.docx

SDK

from pathlib import Path
from officemd import extract_ir_json, markdown_from_bytes, docling_from_bytes

content = Path("report.docx").read_bytes()

# Markdown
print(markdown_from_bytes(content, format="docx"))

# Structured JSON IR
print(extract_ir_json(content, format="docx"))

# Docling JSON
print(docling_from_bytes(content, format="docx"))
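The same call can be applied to a whole directory. The following is a minimal sketch, not part of officemd: `convert_tree`, `convert_with_officemd`, and `FORMAT_BY_SUFFIX` are hypothetical names, and it assumes that `markdown_from_bytes` returns a string and that each supported extension maps to a `format` value of the same name (only `format="docx"` is shown above).

```python
from pathlib import Path
from typing import Callable

# Assumption: the `format` string matches the extension without the dot.
# Only format="docx" is confirmed by the SDK example; the rest are guesses.
FORMAT_BY_SUFFIX = {
    ".docx": "docx",
    ".xlsx": "xlsx",
    ".csv": "csv",
    ".pptx": "pptx",
    ".pdf": "pdf",
}


def convert_tree(
    root: Path, out_dir: Path, converter: Callable[..., str]
) -> list[Path]:
    """Convert every supported file under root and return the paths written."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(root.rglob("*")):
        fmt = FORMAT_BY_SUFFIX.get(src.suffix.lower())
        if fmt is None or not src.is_file():
            continue  # skip directories and unsupported extensions
        text = converter(src.read_bytes(), format=fmt)
        # Keep the original extension in the name so report.docx and
        # report.pdf do not overwrite each other.
        dest = out_dir / (src.name + ".md")
        dest.write_text(text, encoding="utf-8")
        written.append(dest)
    return written


def convert_with_officemd(root: Path, out_dir: Path) -> list[Path]:
    """Run convert_tree with officemd's markdown_from_bytes as the converter."""
    # Imported lazily so convert_tree can be exercised without officemd installed.
    from officemd import markdown_from_bytes

    return convert_tree(root, out_dir, markdown_from_bytes)
```

Passing the converter in as an argument keeps the directory walk independent of officemd, so it can be tested with a stub and reused with `docling_from_bytes` for the formats that support it.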

Supported Formats

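| Format     | Extension | Markdown | JSON IR | Docling |
|------------|-----------|----------|---------|---------|
| Word       | .docx     | yes      | yes     | yes     |
| Excel      | .xlsx     | yes      | yes     | yes     |
| CSV        | .csv      | yes      | yes     | -       |
| PowerPoint | .pptx     | yes      | yes     | yes     |
| PDF        | .pdf      | yes      | yes     | -       |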
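When routing files programmatically, it can help to check the table before calling an exporter. The sketch below transcribes the table into a lookup; `OUTPUTS_BY_SUFFIX`, `supported_outputs`, and `can_export` are hypothetical helpers for illustration and are not provided by officemd.

```python
from pathlib import Path

# Transcribed from the Supported Formats table; the output names are labels
# chosen for this sketch, not officemd identifiers.
OUTPUTS_BY_SUFFIX = {
    ".docx": frozenset({"markdown", "json_ir", "docling"}),
    ".xlsx": frozenset({"markdown", "json_ir", "docling"}),
    ".csv": frozenset({"markdown", "json_ir"}),
    ".pptx": frozenset({"markdown", "json_ir", "docling"}),
    ".pdf": frozenset({"markdown", "json_ir"}),
}


def supported_outputs(path: str) -> frozenset:
    """Return the output kinds listed for this file's extension (empty if none)."""
    return OUTPUTS_BY_SUFFIX.get(Path(path).suffix.lower(), frozenset())


def can_export(path: str, output: str) -> bool:
    """True if the table lists `output` for this file's extension."""
    return output in supported_outputs(path)
```

For example, `can_export("data.csv", "docling")` is false, so a caller could fall back to Markdown or JSON IR for CSV and PDF inputs.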

License

MIT

Download files

Download the file for your platform.

Source Distribution

  • officemd-0.1.2.tar.gz (1.4 MB): Source

Built Distributions

  • officemd-0.1.2-cp312-abi3-win_amd64.whl (2.2 MB): CPython 3.12+, Windows x86-64
  • officemd-0.1.2-cp312-abi3-manylinux_2_34_x86_64.whl (2.4 MB): CPython 3.12+, manylinux (glibc 2.34+), x86-64
  • officemd-0.1.2-cp312-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (2.2 MB): CPython 3.12+, manylinux (glibc 2.17+), ARM64
  • officemd-0.1.2-cp312-abi3-macosx_11_0_arm64.whl (2.1 MB): CPython 3.12+, macOS 11.0+, ARM64
  • officemd-0.1.2-cp312-abi3-macosx_10_12_x86_64.whl (2.3 MB): CPython 3.12+, macOS 10.12+, x86-64
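The wheel file names above follow the standard `name-version-python-abi-platform.whl` layout, where `cp312-abi3` indicates a stable-ABI build for CPython 3.12 and later. As a rough illustration, the sketch below splits such a name into its parts. `split_wheel_filename` and `WheelTags` are hypothetical names, and the code ignores optional build tags and name normalisation; for real use, `packaging.utils.parse_wheel_filename` from the `packaging` library is the more robust choice.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    name: str
    version: str
    python: str
    abi: str
    platforms: tuple  # a compressed platform tag may list several, dot-separated


def split_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel file name into distribution name, version, and tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    stem = filename[: -len(".whl")]
    # The last three hyphen-separated fields are always the python, abi,
    # and platform tags; whatever precedes them is name-version[-build].
    head, python, abi, platform = stem.rsplit("-", 3)
    name, version = head.split("-")[:2]
    return WheelTags(name, version, python, abi, tuple(platform.split(".")))
```

Applied to the ARM64 Linux wheel listed above, this yields the python tag `cp312`, the abi tag `abi3`, and two platform tags, `manylinux_2_17_aarch64` and `manylinux2014_aarch64`.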

File details

Details for the file officemd-0.1.2.tar.gz.

File metadata

  • Download URL: officemd-0.1.2.tar.gz
  • Size: 1.4 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for officemd-0.1.2.tar.gz
Algorithm Hash digest
SHA256 6b2504cb0bde54f8186f7b7bec2d8d3e904de558614cc62873b73c7d38d761ac
MD5 bac74834538f1726f62a907a0892ce43
BLAKE2b-256 a1b113eeb2b39dd68f808e810d1b3485f0eea045df457d398fdc220052847585

See more details on using hashes here.
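One way to use the digests listed on this page is to recompute the SHA256 of a downloaded file and compare it with the published value. The sketch below uses only the standard library; `sha256_of` and `verify_sha256` are hypothetical helper names, not part of officemd or pip.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sha256(path: Path, expected: str) -> bool:
    """Compare a file's SHA256 with an expected hex digest (case-insensitive)."""
    return sha256_of(path) == expected.strip().lower()
```

For the source distribution, the call would be `verify_sha256(Path("officemd-0.1.2.tar.gz"), "6b2504cb0bde54f8186f7b7bec2d8d3e904de558614cc62873b73c7d38d761ac")`, using the SHA256 value listed above. For installs driven by a requirements file, pip's hash-checking mode (`--require-hashes`) performs the same kind of check automatically.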

Provenance

The following attestation bundles were made for officemd-0.1.2.tar.gz:

Publisher: release.yml on ThomAub/officemd

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file officemd-0.1.2-cp312-abi3-win_amd64.whl.

File metadata

  • Download URL: officemd-0.1.2-cp312-abi3-win_amd64.whl
  • Size: 2.2 MB
  • Tags: CPython 3.12+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for officemd-0.1.2-cp312-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 44e46107ec85ec53fc43b3d0ee14aa4fa62f266d70cbcb3f89cd3b20e459342e
MD5 30b2a0f0388447daaf126ff00aa87049
BLAKE2b-256 344324545a865ac60622ce80cdd0d9939bfcd708a9acbb0b61ede9a2c3d6b54d

Provenance

The following attestation bundles were made for officemd-0.1.2-cp312-abi3-win_amd64.whl:

Publisher: release.yml on ThomAub/officemd

File details

Details for the file officemd-0.1.2-cp312-abi3-manylinux_2_34_x86_64.whl.

File hashes

Hashes for officemd-0.1.2-cp312-abi3-manylinux_2_34_x86_64.whl
Algorithm Hash digest
SHA256 24b2866adc99a457f8c7e7c0561cbb60798efa7693b4f7a4597d95edf8097ea8
MD5 0e0935ec250b77957b606f5ce95866a4
BLAKE2b-256 d53f263a183f45172022e9a0c8a57c26caf27b60929c83810cb823606764ffe3

Provenance

The following attestation bundles were made for officemd-0.1.2-cp312-abi3-manylinux_2_34_x86_64.whl:

Publisher: release.yml on ThomAub/officemd

File details

Details for the file officemd-0.1.2-cp312-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for officemd-0.1.2-cp312-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 3a5a4c58192f5ab3160c7423f75ab29a76d65c0710d1b48ff76513abcd39c4f3
MD5 a1a87a98ffac88ccf44571037b0bb0f2
BLAKE2b-256 370843dbaa3096b9da0002bb435e3b1ff94cf721404080e29558a4014f8418f2

Provenance

The following attestation bundles were made for officemd-0.1.2-cp312-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl:

Publisher: release.yml on ThomAub/officemd

File details

Details for the file officemd-0.1.2-cp312-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for officemd-0.1.2-cp312-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 3440426390161b1c6b89f1665f488d9ec8a992b5176d71c02af855d35bc609cb
MD5 34e083a6a0987f1b7649b41d1302283a
BLAKE2b-256 09babc18eb8a9eecc41046c4750f2cf4cd5b4d1f74c049396b5f174908d82d0c

Provenance

The following attestation bundles were made for officemd-0.1.2-cp312-abi3-macosx_11_0_arm64.whl:

Publisher: release.yml on ThomAub/officemd

File details

Details for the file officemd-0.1.2-cp312-abi3-macosx_10_12_x86_64.whl.

File hashes

Hashes for officemd-0.1.2-cp312-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 828e5948aa5e6d3ab3bda71c961d60186f64e1345c9d881e4f93eec245847e05
MD5 7d0fcb8260097dcc57d350be3591b3a6
BLAKE2b-256 26763d967a298609116e0f2d6893cd9489c7acb143c5cbf7ed7ffa7251c84d9b

Provenance

The following attestation bundles were made for officemd-0.1.2-cp312-abi3-macosx_10_12_x86_64.whl:

Publisher: release.yml on ThomAub/officemd
