Document utilities for SRX: extract text from PDF, DOCX, PPTX, XLSX

srx-lib-docs

Small helpers to extract plain text from common office document formats used by SRX services.

What it includes:

  • extract_text(path_or_bytes, mime_type=None): extracts plain text from PDF, DOCX, PPTX, and XLSX files
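Since only four formats are listed, it can help to filter inputs before calling the extractor. The following is a minimal sketch, not part of the library: SUPPORTED_SUFFIXES and supported_files are hypothetical names, and the extension-to-format mapping is an assumption based on the list above.

```python
from pathlib import Path
from typing import Iterator

# Assumed extensions for the four formats named in the README.
SUPPORTED_SUFFIXES = {".pdf", ".docx", ".pptx", ".xlsx"}


def supported_files(root: str) -> Iterator[Path]:
    """Yield files under `root` (recursively) with a supported extension."""
    for path in sorted(Path(root).rglob("*")):
        # Compare case-insensitively so "REPORT.PDF" is also picked up.
        if path.is_file() and path.suffix.lower() in SUPPORTED_SUFFIXES:
            yield path
```

Each yielded path could then be passed on as extract_text(str(path)).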

Install

PyPI (public):

  • pip install srx-lib-docs

uv (pyproject):

[project]
dependencies = ["srx-lib-docs>=0.1.0"]

Usage

from srx_lib_docs import extract_text
text = extract_text("/path/to/file.pdf")
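The signature also accepts bytes together with a mime_type. The sketch below shows one way this might be used when a file's content is already in memory. It assumes that extract_text accepts the standard MIME type strings for these formats, which the README does not confirm; MIME_BY_SUFFIX, load_with_mime, and extract_from_upload are hypothetical helper names.

```python
from pathlib import Path

# Standard MIME types for the four formats (assumed to be what the library expects).
MIME_BY_SUFFIX = {
    ".pdf": "application/pdf",
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ".pptx": "application/vnd.openxmlformats-officedocument.presentationml.presentation",
    ".xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
}


def load_with_mime(path: str) -> tuple[bytes, str]:
    """Read a file and return its raw bytes plus a MIME type chosen by extension."""
    suffix = Path(path).suffix.lower()
    if suffix not in MIME_BY_SUFFIX:
        raise ValueError(f"unsupported file type: {suffix or path}")
    return Path(path).read_bytes(), MIME_BY_SUFFIX[suffix]


def extract_from_upload(path: str) -> str:
    """Pass bytes and an explicit MIME type to extract_text (assumed usage)."""
    # Imported lazily so this sketch loads even without the package installed.
    from srx_lib_docs import extract_text

    data, mime = load_with_mime(path)
    return extract_text(data, mime_type=mime)
```

Passing the MIME type explicitly avoids relying on any content sniffing the library may or may not do for raw bytes.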

Notes

  • For XLSX, only the first 20 rows of each sheet are read to keep extraction lightweight; change the limit in the source code if you need more.
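To illustrate what a per-sheet row cap looks like in practice, here is a minimal sketch using itertools.islice. It is not the library's actual code: rows_to_text and max_rows are hypothetical names, and the tab-separated output format is an assumption.

```python
from itertools import islice
from typing import Any, Iterable


def rows_to_text(rows: Iterable[Iterable[Any]], max_rows: int = 20) -> str:
    """Join at most `max_rows` rows into text, one line per non-empty row."""
    lines = []
    # islice stops after max_rows rows without reading the rest of the sheet.
    for row in islice(rows, max_rows):
        cells = [str(cell) for cell in row if cell is not None]
        if cells:
            lines.append("\t".join(cells))
    return "\n".join(lines)
```

Because islice consumes the row iterator lazily, a large sheet costs no more than its first max_rows rows.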

License

Proprietary © SRX

Download files

Source Distribution

srx_lib_docs-0.1.3.tar.gz (2.7 kB)

Uploaded Source

Built Distribution

srx_lib_docs-0.1.3-py3-none-any.whl (2.7 kB)

Uploaded Python 3

File details

Details for the file srx_lib_docs-0.1.3.tar.gz.

File metadata

  • Download URL: srx_lib_docs-0.1.3.tar.gz
  • Upload date:
  • Size: 2.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.11

File hashes

Hashes for srx_lib_docs-0.1.3.tar.gz
Algorithm    Hash digest
SHA256       7909145f7a4ab62e45f45e1261a5ea7d5a3e873acdac57b1a2d690a32366cf04
MD5          8c4a46895cf5cb8c7bb2583bb4e98ad8
BLAKE2b-256  a4441809fa5c0ed0eed3acb23d3f7be882e14e9c9cfd5b7793824ed0564c5dc2

File details

Details for the file srx_lib_docs-0.1.3-py3-none-any.whl.

File metadata

  • Download URL: srx_lib_docs-0.1.3-py3-none-any.whl
  • Upload date:
  • Size: 2.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.11

File hashes

Hashes for srx_lib_docs-0.1.3-py3-none-any.whl
Algorithm    Hash digest
SHA256       48733d2ef93b907e385b1a2eacbb1485f37988c6d632d687cf146860e869ed9e
MD5          9d8f41caa495f8007fae4f1b9ebe195f
BLAKE2b-256  423dbd26b9588ab1482a9b4eb6a35a2f96e5c2fa3ce4b79dd7483c3207de1910
