Native Python bindings for OfficeMD document extraction

officemd

Fast Office document extraction for LLMs and agents. Converts DOCX, XLSX, CSV, PPTX, and PDF into clean markdown and structured JSON IR, and converts DOCX, XLSX, and PPTX into Docling output (see Supported Formats).

Install

uv add officemd
# or
pip install officemd

To run the CLI without adding officemd to a project:

uvx officemd markdown report.docx

CLI

officemd markdown report.docx
officemd markdown budget.xlsx --sheets "Summary,Q1"
officemd render report.docx
officemd diff old.docx new.docx
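
When the CLI is driven from a script, the commands above can be assembled as argument lists. The following is a minimal sketch, assuming `officemd` is on the PATH and that the `markdown` subcommand writes its result to standard output (an assumption, not confirmed by this page). `build_markdown_command` and `run_markdown` are hypothetical helper names; only the `markdown` subcommand and the `--sheets` flag shown above are used.

```python
import subprocess
from typing import Optional, Sequence


def build_markdown_command(path: str, sheets: Optional[Sequence[str]] = None) -> list[str]:
    """Build the argument list for `officemd markdown` (hypothetical helper)."""
    cmd = ["officemd", "markdown", path]
    if sheets:
        # The CLI example above passes sheet names as one comma-separated value.
        cmd += ["--sheets", ",".join(sheets)]
    return cmd


def run_markdown(path: str, sheets: Optional[Sequence[str]] = None) -> str:
    """Run the CLI and return its standard output.

    Assumption: the markdown is printed to stdout. check=True raises
    CalledProcessError if the command exits with a non-zero status.
    """
    result = subprocess.run(
        build_markdown_command(path, sheets),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Passing the arguments as a list rather than a shell string avoids quoting problems with file or sheet names that contain spaces.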

SDK

from pathlib import Path
from officemd import extract_ir_json, markdown_from_bytes, docling_from_bytes

content = Path("report.docx").read_bytes()

# Markdown
print(markdown_from_bytes(content, format="docx"))

# Structured JSON IR
print(extract_ir_json(content, format="docx"))

# Docling JSON
print(docling_from_bytes(content, format="docx"))

Supported Formats

Format      Extension  Markdown  JSON IR  Docling
Word        .docx      yes       yes      yes
Excel       .xlsx      yes       yes      yes
CSV         .csv       yes       yes      -
PowerPoint  .pptx      yes       yes      yes
PDF         .pdf       yes       yes      -
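
The table above can be encoded as a small lookup so a script can decide which outputs to request for a given file. This is only a sketch: `SUPPORTED`, `format_for`, and `supports` are hypothetical names, and it assumes the SDK's `format` argument accepts the extension without the dot for every type (the SDK example only shows `format="docx"`).

```python
from pathlib import Path

# Extension -> available outputs, transcribed from the Supported Formats table.
SUPPORTED = {
    ".docx": {"markdown", "json_ir", "docling"},
    ".xlsx": {"markdown", "json_ir", "docling"},
    ".csv": {"markdown", "json_ir"},
    ".pptx": {"markdown", "json_ir", "docling"},
    ".pdf": {"markdown", "json_ir"},
}


def format_for(path: str) -> str:
    """Return the extension without the dot, e.g. 'docx'.

    Assumption: this string is what the SDK expects as `format`;
    only "docx" is confirmed by the SDK example.
    """
    ext = Path(path).suffix.lower()
    if ext not in SUPPORTED:
        raise ValueError(f"unsupported extension: {ext!r}")
    return ext[1:]


def supports(path: str, output: str) -> bool:
    """Return True if the table lists `output` for this file's extension."""
    return output in SUPPORTED.get(Path(path).suffix.lower(), set())
```

For example, a caller could check `supports("deck.pptx", "docling")` before calling `docling_from_bytes`, and skip the Docling step for CSV and PDF inputs.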

License

MIT

Download files

Source Distribution

officemd-0.1.1.tar.gz (1.4 MB)
Tags: Source

Built Distributions

officemd-0.1.1-cp312-abi3-win_amd64.whl (2.2 MB)
Tags: CPython 3.12+, Windows x86-64

officemd-0.1.1-cp312-abi3-manylinux_2_34_x86_64.whl (2.3 MB)
Tags: CPython 3.12+, manylinux: glibc 2.34+ x86-64

officemd-0.1.1-cp312-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (2.1 MB)
Tags: CPython 3.12+, manylinux: glibc 2.17+ ARM64

officemd-0.1.1-cp312-abi3-macosx_11_0_arm64.whl (2.0 MB)
Tags: CPython 3.12+, macOS 11.0+ ARM64

officemd-0.1.1-cp312-abi3-macosx_10_12_x86_64.whl (2.2 MB)
Tags: CPython 3.12+, macOS 10.12+ x86-64
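
The wheel file names above follow the standard layout `name-version[-build]-python-abi-platform.whl`; here `cp312-abi3` marks a CPython stable-ABI build usable on 3.12 and later. As a rough illustration, the tags can be split out with the standard library alone. `WheelTags` and `parse_wheel_filename` are hypothetical names for this sketch; in real tooling the third-party `packaging` library is the usual choice.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    name: str
    version: str
    python: str
    abi: str
    platforms: tuple[str, ...]


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel file name into its tag components (simplified sketch)."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    # Five parts without a build tag, six with one.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel file name layout: {filename!r}")
    name, version = parts[0], parts[1]
    python, abi, platform = parts[-3:]
    # A compressed platform tag lists several platforms separated by dots.
    return WheelTags(name, version, python, abi, tuple(platform.split(".")))
```

Applied to the ARM64 Linux wheel, this yields two platform tags, `manylinux_2_17_aarch64` and its legacy alias `manylinux2014_aarch64`.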

File details

Details for the file officemd-0.1.1.tar.gz.

File metadata

  • Download URL: officemd-0.1.1.tar.gz
  • Size: 1.4 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for officemd-0.1.1.tar.gz
Algorithm Hash digest
SHA256 c8547ca8a23b697df7f2c0f2d25f0be771e35905c9cb6c44fe558ecc93090781
MD5 f124365ec0f2e0661f291745a32a1b7b
BLAKE2b-256 3618f775ea2e836239cbc1f5066cf46b68117aa08a7a59aef816cb53a9c7b5e6
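
A downloaded file can be checked against the SHA256 digest listed above before it is installed. The sketch below uses only `hashlib` and `hmac` from the standard library; `sha256_of` and `matches_sha256` are hypothetical helper names, and the file path is whatever location the archive was saved to.

```python
import hashlib
import hmac


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read until an empty bytes object signals end of file.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path: str, expected_hex: str) -> bool:
    """Compare the file's digest with the published one, ignoring case and whitespace."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

For example, `matches_sha256("officemd-0.1.1.tar.gz", "c8547ca8a23b697df7f2c0f2d25f0be771e35905c9cb6c44fe558ecc93090781")` should return `True` for an intact copy of the source distribution.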

See more details on using hashes here.

Provenance

The following attestation bundles were made for officemd-0.1.1.tar.gz:

Publisher: release.yml on ThomAub/officemd

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file officemd-0.1.1-cp312-abi3-win_amd64.whl.

File metadata

  • Download URL: officemd-0.1.1-cp312-abi3-win_amd64.whl
  • Size: 2.2 MB
  • Tags: CPython 3.12+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for officemd-0.1.1-cp312-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 08140ca13ae67646626ae0aaf8e0104e8cabcf59683732f9c6bbe5b712b92dc2
MD5 dadd3c069d435479cb57496029a23d2b
BLAKE2b-256 8e7b02dfc9d2676430e92ee45e97f93d77ae5c56e7ab1f537123e39fcc46c5b8

Provenance

The following attestation bundles were made for officemd-0.1.1-cp312-abi3-win_amd64.whl:

Publisher: release.yml on ThomAub/officemd

File details

Details for the file officemd-0.1.1-cp312-abi3-manylinux_2_34_x86_64.whl.

File hashes

Hashes for officemd-0.1.1-cp312-abi3-manylinux_2_34_x86_64.whl
Algorithm Hash digest
SHA256 48964e02de04137a91abdc28b95fcd1282bf06872ca49c25a31bc2867c2c9c92
MD5 0f197e623ff2917482460a90c7513c46
BLAKE2b-256 5f96e03dc96288890a41176d4558e191230ddc463a57edc263ac91ba2b187c9d

Provenance

The following attestation bundles were made for officemd-0.1.1-cp312-abi3-manylinux_2_34_x86_64.whl:

Publisher: release.yml on ThomAub/officemd

File details

Details for the file officemd-0.1.1-cp312-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for officemd-0.1.1-cp312-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 7bcf1ac2e80566c9a215b7d1e89053d863c92e87602c08c0cfa2f67bae45422e
MD5 950a35bdc87d4ca7d660a066da62d3f3
BLAKE2b-256 090dbfcdc66eec4fa7d0d4ed5cbfb5bda3f129e2e4cadc1f00b32fe35657a25f

Provenance

The following attestation bundles were made for officemd-0.1.1-cp312-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl:

Publisher: release.yml on ThomAub/officemd

File details

Details for the file officemd-0.1.1-cp312-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for officemd-0.1.1-cp312-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 8b21a2075a74931b9a478a658cf775cc508c37a0a00d3e41a9ee721c844704a3
MD5 3f53b758e501802c9afd5feedd3fd925
BLAKE2b-256 b5db3f9c36789aea2dc10c9773330742eb302e96b9cbf9ecfb6e68fdfd4e865a

Provenance

The following attestation bundles were made for officemd-0.1.1-cp312-abi3-macosx_11_0_arm64.whl:

Publisher: release.yml on ThomAub/officemd

File details

Details for the file officemd-0.1.1-cp312-abi3-macosx_10_12_x86_64.whl.

File hashes

Hashes for officemd-0.1.1-cp312-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 287fb5aaf4e30eb486d79b90bdb307735b42473857f05b9833dde3d00fa3e905
MD5 154645bde1f398652086d219089e54a6
BLAKE2b-256 4c978257d1b1d4825f60ab9c03b14bfe237e5fef1a89e35fa7be1d60eb7c345a

Provenance

The following attestation bundles were made for officemd-0.1.1-cp312-abi3-macosx_10_12_x86_64.whl:

Publisher: release.yml on ThomAub/officemd
