Native Python bindings for OfficeMD document extraction

officemd

Fast Office document extraction for LLMs and agents. Converts DOCX, XLSX, CSV, PPTX, and PDF into clean markdown, structured JSON IR, and Docling output.

Install

uv add officemd
# or
pip install officemd

For the CLI without adding to a project:

uvx officemd markdown report.docx

CLI

officemd markdown report.docx
officemd markdown budget.xlsx --sheets "Summary,Q1"
officemd render report.docx
officemd diff old.docx new.docx
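
When the CLI is driven from a script, it can help to build the argument list in one place. The sketch below assembles the `officemd markdown` invocation shown above, using only the subcommand and the `--sheets` option that appear in these examples. `build_markdown_cmd` and `run_markdown` are hypothetical helper names, and reading the converted markdown from standard output is an assumption rather than documented behavior.

```python
import subprocess


def build_markdown_cmd(path, sheets=None):
    """Build the argv list for `officemd markdown` (hypothetical helper)."""
    cmd = ["officemd", "markdown", str(path)]
    if sheets:
        # --sheets takes a comma-separated list of sheet names,
        # as in the budget.xlsx example above.
        cmd += ["--sheets", ",".join(sheets)]
    return cmd


def run_markdown(path, sheets=None):
    """Run the CLI and return its stdout.

    Assumption: the CLI writes the converted markdown to stdout.
    """
    result = subprocess.run(
        build_markdown_cmd(path, sheets),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems with sheet names that contain spaces.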

SDK

from pathlib import Path
from officemd import extract_ir_json, markdown_from_bytes, docling_from_bytes

content = Path("report.docx").read_bytes()

# Markdown
print(markdown_from_bytes(content, format="docx"))

# Structured JSON IR
print(extract_ir_json(content, format="docx"))

# Docling JSON
print(docling_from_bytes(content, format="docx"))
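
Since the SDK functions take raw bytes plus a `format` string, a caller working with file paths needs to derive that string somehow. The following is a minimal sketch of one way to do it. `infer_format` and `file_to_markdown` are hypothetical helpers, and the format strings other than "docx" are assumed to follow the same extension-without-the-dot pattern; the project description only shows "docx".

```python
from pathlib import Path

# Extensions listed under Supported Formats.
# Assumption: the `format` argument is the extension without the dot,
# as it is for "docx" in the example above.
KNOWN_EXTENSIONS = {".docx", ".xlsx", ".csv", ".pptx", ".pdf"}


def infer_format(path):
    """Return the format string for a path, or raise ValueError if unsupported."""
    suffix = Path(path).suffix.lower()
    if suffix not in KNOWN_EXTENSIONS:
        raise ValueError(f"unsupported extension: {suffix or '(none)'}")
    return suffix.lstrip(".")


def file_to_markdown(path):
    """Read a file from disk and convert it with markdown_from_bytes."""
    # Imported lazily so infer_format stays usable without officemd installed.
    from officemd import markdown_from_bytes

    content = Path(path).read_bytes()
    return markdown_from_bytes(content, format=infer_format(path))
```

Rejecting unknown extensions up front gives a clearer error than passing an unrecognized format string through to the extension module.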

Supported Formats

Format      Extension  Markdown  JSON IR  Docling
Word        .docx      yes       yes      yes
Excel       .xlsx      yes       yes      yes
CSV         .csv       yes       yes      -
PowerPoint  .pptx      yes       yes      yes
PDF         .pdf       yes       yes      -
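
A caller that picks an output type at runtime may want to check this table before calling the SDK. The sketch below transcribes the table into a lookup, reading "-" as "not supported", which is an interpretation. The output labels ("markdown", "json_ir", "docling") and the `supports` helper are hypothetical names chosen for illustration, not part of the officemd API.

```python
# Output support per format, transcribed from the Supported Formats table.
# Assumption: "-" in the table means the output is not available.
SUPPORT = {
    "docx": {"markdown", "json_ir", "docling"},
    "xlsx": {"markdown", "json_ir", "docling"},
    "csv": {"markdown", "json_ir"},
    "pptx": {"markdown", "json_ir", "docling"},
    "pdf": {"markdown", "json_ir"},
}


def supports(fmt, output):
    """Return True if the table lists `output` as available for `fmt`."""
    return output in SUPPORT.get(fmt.lower(), set())
```

For example, `supports("pdf", "docling")` is false under this reading, so a pipeline could fall back to markdown for PDF and CSV inputs.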

License

MIT

Download files

Source Distribution

officemd-0.1.3.tar.gz (1.4 MB): Source

Built Distributions

officemd-0.1.3-cp312-abi3-win_amd64.whl (2.2 MB): CPython 3.12+, Windows x86-64
officemd-0.1.3-cp312-abi3-manylinux_2_34_x86_64.whl (2.4 MB): CPython 3.12+, manylinux: glibc 2.34+ x86-64
officemd-0.1.3-cp312-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (2.2 MB): CPython 3.12+, manylinux: glibc 2.17+ ARM64
officemd-0.1.3-cp312-abi3-macosx_11_0_arm64.whl (2.1 MB): CPython 3.12+, macOS 11.0+ ARM64
officemd-0.1.3-cp312-abi3-macosx_10_12_x86_64.whl (2.3 MB): CPython 3.12+, macOS 10.12+ x86-64
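
The wheel names above follow the standard layout `name-version-python_tag-abi_tag-platform_tag.whl`, with an optional build tag after the version. Here `cp312-abi3` indicates a wheel built against the CPython stable ABI, which matches the "CPython 3.12+" label. As a rough illustration, the sketch below splits a filename into those fields. `WheelTags` and `parse_wheel_filename` are hypothetical names, and the parser relies on the hyphens in names and versions having been normalized away, which is true for the files listed here.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    name: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename):
    """Split a wheel filename into its name, version, and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename}")
    parts = filename[: -len(".whl")].split("-")
    # Expected: name-version[-build]-python-abi-platform
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename layout: {filename}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return WheelTags(parts[0], parts[1], python_tag, abi_tag, platform_tag)
```

A platform tag may hold several dot-separated values, as in the aarch64 wheel, so callers that need individual platforms can split `platform_tag` on ".".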

File details

Details for the file officemd-0.1.3.tar.gz.

File metadata

  • Download URL: officemd-0.1.3.tar.gz
  • Size: 1.4 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for officemd-0.1.3.tar.gz
Algorithm Hash digest
SHA256 3c71e2cbd7e07e23c14c090603c98b312767a7150669024522df05790966e5cf
MD5 1f368100d363deb34f80d09d82b677f8
BLAKE2b-256 e31a4d63ccf438020a40f0446951d9b3e7594d1e2f7134b9f0fc6259b2ccc553
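
A downloaded file can be checked against the SHA256 digest listed above using only the standard library. This is a minimal sketch. `sha256_of_file` and `verify_sha256` are hypothetical helper names, and the local path passed in is whatever location the file was saved to.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sha256(path, expected_hex):
    """Compare a file's digest to an expected hex string, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For the source distribution, the expected value would be the SHA256 digest shown in the table above, for example `verify_sha256("officemd-0.1.3.tar.gz", "3c71e2cbd7e07e23c14c090603c98b312767a7150669024522df05790966e5cf")`.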

Provenance

The following attestation bundles were made for officemd-0.1.3.tar.gz:

Publisher: release.yml on ThomAub/officemd

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file officemd-0.1.3-cp312-abi3-win_amd64.whl.

File metadata

  • Download URL: officemd-0.1.3-cp312-abi3-win_amd64.whl
  • Size: 2.2 MB
  • Tags: CPython 3.12+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for officemd-0.1.3-cp312-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 e977d0e2aad6fa2ac5bcc2d09b6d2c6dee022b51d0edb51bc0e6dcd1ec8befe1
MD5 80b3c6d272caa8d7ae4c07aa7143d078
BLAKE2b-256 e67b48387ccb1e72f07b5aa13fd5ef7ed54a4a9bfc80a9ed0dab42494fc534ec

Provenance

The following attestation bundles were made for officemd-0.1.3-cp312-abi3-win_amd64.whl:

Publisher: release.yml on ThomAub/officemd

File details

Details for the file officemd-0.1.3-cp312-abi3-manylinux_2_34_x86_64.whl.

File hashes

Hashes for officemd-0.1.3-cp312-abi3-manylinux_2_34_x86_64.whl
Algorithm Hash digest
SHA256 2f19a7280ac2bcb953fa75ca09fb5367a4405daee05a201d052fe39054533279
MD5 2fc8e1c57f6a97c84192ca076bc1e12a
BLAKE2b-256 b411fbe4088d21ae34acaf61dce5845af3c8b7596167b666ad1311d28f9f3533

Provenance

The following attestation bundles were made for officemd-0.1.3-cp312-abi3-manylinux_2_34_x86_64.whl:

Publisher: release.yml on ThomAub/officemd

File details

Details for the file officemd-0.1.3-cp312-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for officemd-0.1.3-cp312-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 44dfbd0a4ef06d6059758039c26d01a2d3416d10b53f99d72cf8e3228588d5e4
MD5 8cc96c315e7df147e2110f1a68a89965
BLAKE2b-256 7f1dfe1cc89627c40aa8418a81552931005244c350d3bbc32d38df33aca6d96e

Provenance

The following attestation bundles were made for officemd-0.1.3-cp312-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl:

Publisher: release.yml on ThomAub/officemd

File details

Details for the file officemd-0.1.3-cp312-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for officemd-0.1.3-cp312-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 a4af13fff25fc52c6ed912c500f3388f4b102a56bbf0788e0b83f6255d5c9142
MD5 cdf4e4fffac9ec0b57841043465e4709
BLAKE2b-256 0141d3a39a7cf1b36f73f8a6c67bc55c341e9bde73e6d4d90ed4af99c1ddb63c

Provenance

The following attestation bundles were made for officemd-0.1.3-cp312-abi3-macosx_11_0_arm64.whl:

Publisher: release.yml on ThomAub/officemd

File details

Details for the file officemd-0.1.3-cp312-abi3-macosx_10_12_x86_64.whl.

File hashes

Hashes for officemd-0.1.3-cp312-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 2b6aadd84ad5173d1dffb21eb3882a147f4bec68198a47b01723205220236ec2
MD5 9765b701ab7d929df4185fe17a9d5cc7
BLAKE2b-256 39bd2754a86a388df642cb05789a7a2cd951003aa37081fb202e6bd3bc433acc

Provenance

The following attestation bundles were made for officemd-0.1.3-cp312-abi3-macosx_10_12_x86_64.whl:

Publisher: release.yml on ThomAub/officemd
