Native Python bindings for OfficeMD document extraction

Project description

officemd

Fast Office document extraction for LLMs and agents. Converts DOCX, XLSX, CSV, PPTX, and PDF files into clean Markdown and structured JSON IR, and converts DOCX, XLSX, and PPTX files into Docling output.

Install

uv add officemd
# or
pip install officemd

To run the CLI without adding it to a project:

uvx officemd markdown report.docx

CLI

officemd markdown report.docx
officemd markdown budget.xlsx --sheets "Summary,Q1"
officemd render report.docx
officemd diff old.docx new.docx
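
When the CLI is driven from a script, it can help to assemble the argument list in code rather than by string concatenation. The following is a minimal sketch, not part of officemd itself: `build_markdown_command` and `run_markdown` are hypothetical helper names, and the sketch assumes that `--sheets` takes a comma-separated list (as in the example above) and that the `markdown` subcommand writes its result to standard output.

```python
import subprocess


def build_markdown_command(path, sheets=None):
    """Return the argv list for `officemd markdown` (hypothetical helper)."""
    cmd = ["officemd", "markdown", str(path)]
    if sheets:
        # Assumption: --sheets accepts comma-separated sheet names,
        # mirroring the CLI example above.
        cmd += ["--sheets", ",".join(sheets)]
    return cmd


def run_markdown(path, sheets=None):
    """Run the CLI and return its stdout; requires `officemd` on PATH.

    Assumption: the markdown subcommand prints its output to stdout.
    """
    result = subprocess.run(
        build_markdown_command(path, sheets),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Passing a list to `subprocess.run` avoids shell quoting issues with sheet names that contain spaces.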

SDK

from pathlib import Path
from officemd import extract_ir_json, markdown_from_bytes, docling_from_bytes

content = Path("report.docx").read_bytes()

# Markdown
print(markdown_from_bytes(content, format="docx"))

# Structured JSON IR
print(extract_ir_json(content, format="docx"))

# Docling JSON
print(docling_from_bytes(content, format="docx"))
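
Each SDK function above takes the raw bytes plus a `format` string, so a small wrapper can infer the format from the file name. This is a sketch under stated assumptions: `format_for` and `convert_file` are hypothetical helpers, and only `"docx"` is confirmed by the examples above; the other format strings are assumed to follow the same extension-without-dot pattern.

```python
from pathlib import Path

# Assumption: the `format` argument is the file extension without the dot.
# Only "docx" appears in the examples above; the other values are guesses.
KNOWN_FORMATS = {"docx", "xlsx", "csv", "pptx", "pdf"}


def format_for(path):
    """Infer the `format` string from a file name (hypothetical helper)."""
    fmt = Path(path).suffix.lower().lstrip(".")
    if fmt not in KNOWN_FORMATS:
        raise ValueError(f"unsupported file type: {path}")
    return fmt


def convert_file(path, convert):
    """Read `path` and pass its bytes to `convert(content, format=...)`.

    `convert` is any callable with the same shape as the SDK functions,
    which keeps this helper independent of a particular output type.
    """
    content = Path(path).read_bytes()
    return convert(content, format=format_for(path))
```

With officemd installed, a call such as `convert_file("report.docx", markdown_from_bytes)` would then mirror the Markdown example above.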

Supported Formats

Format      Extension  Markdown  JSON IR  Docling
Word        .docx      yes       yes      yes
Excel       .xlsx      yes       yes      yes
CSV         .csv       yes       yes      -
PowerPoint  .pptx      yes       yes      yes
PDF         .pdf       yes       yes      -
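
For callers that pick an output type at runtime, the table above can be encoded as a lookup so that unsupported combinations are rejected before any conversion is attempted. This is an illustrative sketch: the names `SUPPORTED_OUTPUTS` and `supports` are hypothetical, the output labels are arbitrary, and a "-" in the table is read as "not supported".

```python
# Support matrix transcribed from the table above; "-" is read as unsupported.
SUPPORTED_OUTPUTS = {
    "docx": {"markdown", "json_ir", "docling"},
    "xlsx": {"markdown", "json_ir", "docling"},
    "csv": {"markdown", "json_ir"},
    "pptx": {"markdown", "json_ir", "docling"},
    "pdf": {"markdown", "json_ir"},
}


def supports(fmt, output):
    """Return True if the table lists `output` for the format `fmt`."""
    return output in SUPPORTED_OUTPUTS.get(fmt.lower(), set())
```

A caller could check `supports("csv", "docling")` before invoking `docling_from_bytes` and fall back to Markdown or JSON IR when it returns `False`.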

License

MIT

Download files

Source Distribution

officemd-0.1.5.tar.gz (1.4 MB): Source

Built Distributions

officemd-0.1.5-cp312-abi3-win_amd64.whl (2.3 MB): CPython 3.12+, Windows x86-64

officemd-0.1.5-cp312-abi3-manylinux_2_34_x86_64.whl (2.4 MB): CPython 3.12+, manylinux (glibc 2.34+), x86-64

officemd-0.1.5-cp312-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (2.2 MB): CPython 3.12+, manylinux (glibc 2.17+), ARM64

officemd-0.1.5-cp312-abi3-macosx_11_0_arm64.whl (2.1 MB): CPython 3.12+, macOS 11.0+, ARM64

officemd-0.1.5-cp312-abi3-macosx_10_12_x86_64.whl (2.3 MB): CPython 3.12+, macOS 10.12+, x86-64

File details

Details for the file officemd-0.1.5.tar.gz.

File metadata

  • Download URL: officemd-0.1.5.tar.gz
  • Upload date:
  • Size: 1.4 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for officemd-0.1.5.tar.gz
Algorithm Hash digest
SHA256 94756fcecfd23e32cfda3b8466a2c50aa65bf791b38347efdd260f8ccb10e268
MD5 106abbea1e9dd986a3770b4b6ccb0a4f
BLAKE2b-256 d6ddd041f8a50581a9bef28402a5fb58614f75439d4e9c78faea024ec8aa8768

See more details on using hashes here.

Provenance

The following attestation bundles were made for officemd-0.1.5.tar.gz:

Publisher: release.yml on ThomAub/officemd

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file officemd-0.1.5-cp312-abi3-win_amd64.whl.

File metadata

  • Download URL: officemd-0.1.5-cp312-abi3-win_amd64.whl
  • Upload date:
  • Size: 2.3 MB
  • Tags: CPython 3.12+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for officemd-0.1.5-cp312-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 efcf053917a7e67c6fc6337a1d63d4d1eedeebd12984db4fab4a2fcf6026f7e1
MD5 ac6deafb771e563a1b4ebf80a8feefda
BLAKE2b-256 cf80364709b65a893fed0fc5e438f3d6a842e21850a1fa98caaa8a40963ccbbd

Provenance

The following attestation bundles were made for officemd-0.1.5-cp312-abi3-win_amd64.whl:

Publisher: release.yml on ThomAub/officemd

File details

Details for the file officemd-0.1.5-cp312-abi3-manylinux_2_34_x86_64.whl.

File hashes

Hashes for officemd-0.1.5-cp312-abi3-manylinux_2_34_x86_64.whl
Algorithm Hash digest
SHA256 124aed99b791c9c796ff7b84865ae4e176d08406137ca49aa72c7060a7f8367a
MD5 7d3bf99e9423150e57fd5fbe3b585bb2
BLAKE2b-256 25097ebedf0f8dfb349889182abdbb97eb9ea864886188b003eee2f040213dfa

Provenance

The following attestation bundles were made for officemd-0.1.5-cp312-abi3-manylinux_2_34_x86_64.whl:

Publisher: release.yml on ThomAub/officemd

File details

Details for the file officemd-0.1.5-cp312-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for officemd-0.1.5-cp312-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 6b27742df1df37339510db34cc2da69ccf822a4806a8666a514087118379e556
MD5 fc2c6b472677d2c46c8802dcc641e647
BLAKE2b-256 35497df851803fff27ae0d696cc1b905f5e55de8d9ac6d33d95cb184132715fe

Provenance

The following attestation bundles were made for officemd-0.1.5-cp312-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl:

Publisher: release.yml on ThomAub/officemd

File details

Details for the file officemd-0.1.5-cp312-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for officemd-0.1.5-cp312-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 721afe39562c293617fabd930e131136bd412fe51123ebbbe6b4be702c01cfbe
MD5 874e139f6cad036155e9e30e5a6b3c8a
BLAKE2b-256 784b6637118e862d62779c1dbfb9ae88c1445578901662bd1e6120c9acb8bd7f

Provenance

The following attestation bundles were made for officemd-0.1.5-cp312-abi3-macosx_11_0_arm64.whl:

Publisher: release.yml on ThomAub/officemd

File details

Details for the file officemd-0.1.5-cp312-abi3-macosx_10_12_x86_64.whl.

File hashes

Hashes for officemd-0.1.5-cp312-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 a798d1b7c2d81c627a532eca663792306a3e169e6266362d2007c7fc278a76de
MD5 7da9ffec5aa1a55a59ef7f68b4a5f56d
BLAKE2b-256 fc04157c79abaf932b94b6d5ac8026bafa190578f62c6f92e616887f6404a112

Provenance

The following attestation bundles were made for officemd-0.1.5-cp312-abi3-macosx_10_12_x86_64.whl:

Publisher: release.yml on ThomAub/officemd
