Python bindings for the Rust doxx document converter

python-doxx

Minimal Python bindings for the doxx Rust library. Exposes just the pieces needed for scripted document inspection:

  • convert(path, export_format="markdown", page=None, images=False) — export .docx files as Markdown, plain text, JSON, CSV, or ANSI strings. Optional page isolates a single page (1-indexed); set images=True to retain image references when available.
  • search(path, query, page=None) — return match locations for quick content checks, mirroring doxx --search.
  • extract_images(path, output_dir) — dump embedded images, similar to --extract-images in the CLI.

Installation

Once the tagged v0.1.0 release has propagated to PyPI:

pip install python-doxx
# or
uv pip install python-doxx

To build from source locally:

maturin develop

Usage

import doxx

# Export to Markdown
markdown = doxx.convert("report.docx", export_format="markdown", images=True)

# Pull a single page as CSV
csv_page = doxx.convert("data.docx", export_format="csv", page=2)

# Locate text snippets
hits = doxx.search("contract.docx", "payment")
for hit in hits:
    print(hit.page, hit.text)

# Extract embedded images
saved = doxx.extract_images("slides.docx", "./images")
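When a search returns many hits, grouping them by page can make the output easier to scan. The sketch below is a minimal illustration, not part of python-doxx: group_hits_by_page is a hypothetical helper, and it assumes only what the usage above shows, namely that each hit exposes .page and .text, plus the further assumption that .page is an integer.

```python
from collections import defaultdict


def group_hits_by_page(hits):
    """Collect hit texts per page, ordered by page number.

    Hypothetical helper: works with any objects that expose `.page` and
    `.text`, which is all the README usage shows about search results.
    Assumes `.page` is an integer so the pages can be sorted.
    """
    pages = defaultdict(list)
    for hit in hits:
        pages[hit.page].append(hit.text)
    return dict(sorted(pages.items()))
```

With the bindings installed, this could be called as group_hits_by_page(doxx.search("contract.docx", "payment")).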

Supported export formats: markdown, text, json, csv, and ansi.

Image references are disabled by default (images=False); pass images=True when you need real file paths in the exported Markdown or ANSI output. CSV export raises ValueError if the document contains no tables.
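Because CSV export raises ValueError for documents without tables, a script that processes mixed documents may want a fallback format. The following is a rough sketch under stated assumptions: csv_or_fallback is a hypothetical helper, and the converter is passed in as a callable with the same keyword arguments as convert, so the helper itself does not depend on the package being importable.

```python
from typing import Callable, Tuple

# Export formats named in the README.
SUPPORTED_FORMATS = ("markdown", "text", "json", "csv", "ansi")


def csv_or_fallback(
    convert: Callable[..., str],
    path: str,
    fallback: str = "markdown",
) -> Tuple[str, str]:
    """Try CSV export first; use `fallback` if CSV raises ValueError.

    Hypothetical helper. `convert` is expected to accept
    `convert(path, export_format=...)` like the function described above.
    Returns the format that was actually used together with the output.
    """
    if fallback not in SUPPORTED_FORMATS or fallback == "csv":
        raise ValueError(f"unsupported fallback format: {fallback!r}")
    try:
        return "csv", convert(path, export_format="csv")
    except ValueError:
        # Per the README, this is raised when the document has no tables.
        return fallback, convert(path, export_format=fallback)
```

With the bindings installed, a call might look like fmt, body = csv_or_fallback(doxx.convert, "data.docx"). Note that the except clause catches any ValueError from the CSV attempt, not only the no-tables case.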

Releasing

  1. Make sure CHANGELOG.md (if present) and pyproject.toml share the new version.
  2. Tag the release (git tag v0.1.0), then git push --tags.
  3. The GitHub Actions workflow builds wheels for macOS, Linux, and Windows, uploads them as artifacts, and—when PYPI_API_TOKEN is configured in the repository secrets—publishes to PyPI automatically.

Download files

Source Distributions

No source distribution files available for this release.

Built Distributions

python_doxx-0.1.0-cp312-cp312-win_amd64.whl (2.9 MB)

CPython 3.12, Windows x86-64

python_doxx-0.1.0-cp312-cp312-musllinux_1_1_x86_64.whl (3.5 MB)

CPython 3.12, musllinux: musl 1.1+ x86-64

python_doxx-0.1.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.6 MB)

CPython 3.12, manylinux: glibc 2.17+ x86-64

python_doxx-0.1.0-cp312-cp312-macosx_11_0_arm64.whl (3.0 MB)

CPython 3.12, macOS 11.0+ ARM64

python_doxx-0.1.0-cp312-cp312-macosx_10_9_x86_64.whl (3.2 MB)

CPython 3.12, macOS 10.9+ x86-64

File details

Details for the file python_doxx-0.1.0-cp312-cp312-win_amd64.whl.

File hashes

Hashes for python_doxx-0.1.0-cp312-cp312-win_amd64.whl
Algorithm Hash digest
SHA256 4390692effbc0c3a78e6bedbaeac4d09730e712869381e38283609b974f94505
MD5 e818571a21ec5492c524fdcdafebbcb7
BLAKE2b-256 357aa8fdce0ccd7624e22aa61bfaad1301d55ce3e79c47fed04f049788260378

File details

Details for the file python_doxx-0.1.0-cp312-cp312-musllinux_1_1_x86_64.whl.

File hashes

Hashes for python_doxx-0.1.0-cp312-cp312-musllinux_1_1_x86_64.whl
Algorithm Hash digest
SHA256 bce4977aa4f4d31cc0624e593a0c758ca2a7d2905bdbf37df520baa6a63b07f2
MD5 a60c40359f563c8a47a2fcc178a48f45
BLAKE2b-256 f57901e1a77b78a4c3f9711aba05910b84bcbbd54a847e0e63fba91e22d6a11b

File details

Details for the file python_doxx-0.1.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for python_doxx-0.1.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 86688818a112c843b15ab7039bc145f29bdaabf3dfbd954802cd042e33a64094
MD5 b70e94232906426a39f23aa93d8d07dc
BLAKE2b-256 97b185b941366d0f06023da2146872e2ab0ffbd98a6b11880ab338957cb9c344

File details

Details for the file python_doxx-0.1.0-cp312-cp312-macosx_11_0_arm64.whl.

File hashes

Hashes for python_doxx-0.1.0-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 66f5246895b85829dcdf306b1bdae1e2c3ef6b5e8d03855f3d071330121f3b98
MD5 3f3d77d3f8cac3127c58e2fbd788edd4
BLAKE2b-256 516a46e8f815021e7c6ab57a6d643259b2a45183f08aa008c621add82ca489ca

File details

Details for the file python_doxx-0.1.0-cp312-cp312-macosx_10_9_x86_64.whl.

File hashes

Hashes for python_doxx-0.1.0-cp312-cp312-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 c446d33ed64e251f43aebd5b35c5b2e25e5ee9182b98268f4ef46fc002d1ab36
MD5 483c1a6b501104f8829c723f083fe3d6
BLAKE2b-256 619dcab2141b336e738d1a80b43d24a70af2d009bb72bd08cd6e954be3a1c95e
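The SHA256 digests listed above can be checked against a downloaded wheel before installing it. The sketch below uses only the standard library; sha256_of and matches_published are hypothetical helper names, not part of python-doxx or pip.

```python
import hashlib
import hmac


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large wheels are not loaded at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 with a published hex digest.

    Surrounding whitespace and letter case in `expected_hex` are ignored.
    """
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

For example, matches_published("python_doxx-0.1.0-cp312-cp312-win_amd64.whl", digest) should return True when digest is the SHA256 value listed for that file and the download is intact. pip can enforce the same check itself through its hash-checking mode.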
