Convert PDF to structured text using MinerU

pyconverters_mineru

Installation

Install the package from PyPI:

pip install pyconverters_mineru
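
To confirm that the installation succeeded, you can query the installed distribution's metadata. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of this package.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # The distribution is not installed in the current environment.
        return None


# Prints the version string if the package is installed, otherwise None.
print(installed_version("pyconverters_mineru"))
```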

Developing

Pre-requisites

You will need to install uv (for package management and building):

pip install uv

Clone the repository:

git clone https://github.com/oterrier/pyconverters_mineru

Install dependencies

uv sync --extra test

Running the test suite

uv run pytest

Linting

uv run ruff check .
uv run ruff format --check .

Building the documentation

uv run --extra docs sphinx-build docs docs/_build

The built documentation is available at docs/_build/index.html.

SBOM & vulnerability check

Install the SBOM dependencies:

uv sync --extra sbom

Generate a CycloneDX SBOM from the current environment:

uv run cyclonedx-py environment -o sbom.cdx.json --output-format json
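
Once sbom.cdx.json exists, it can be inspected with a few lines of Python. The sketch below assumes the standard CycloneDX JSON layout, in which a top-level `components` array holds objects with `name` and `version` fields; the helpers `load_sbom` and `list_components` are hypothetical names used only for illustration.

```python
from __future__ import annotations

import json
from pathlib import Path


def load_sbom(path: str = "sbom.cdx.json") -> dict:
    """Read and parse a CycloneDX JSON document from disk."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def list_components(sbom: dict) -> list[tuple[str, str]]:
    """Return sorted (name, version) pairs from a parsed CycloneDX document."""
    components = sbom.get("components", [])
    # Missing fields are shown as "?" rather than raising an error.
    return sorted((c.get("name", "?"), c.get("version", "?")) for c in components)
```

For example, `list_components(load_sbom())` returns the packages recorded in the SBOM generated by the command above.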

Audit dependencies for known vulnerabilities:

uv run pip-audit --format json --output audit-report.json
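
The JSON report can then be summarized programmatically, for instance to post a short list of affected packages in a CI log. This sketch assumes the report layout produced by recent pip-audit releases (a top-level `dependencies` array whose entries carry `name`, `version`, and a `vulns` list of objects with an `id`); check the layout against your pip-audit version. The function names are hypothetical.

```python
from __future__ import annotations

import json
from pathlib import Path


def load_report(path: str = "audit-report.json") -> dict:
    """Read and parse a pip-audit JSON report from disk."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def vulnerable_packages(report: dict) -> dict[str, list[str]]:
    """Map 'name==version' to the advisory IDs reported for that package."""
    findings: dict[str, list[str]] = {}
    for dep in report.get("dependencies", []):
        # Skipped dependencies may have no "vulns" key; treat them as clean here.
        ids = [vuln["id"] for vuln in dep.get("vulns", [])]
        if ids:
            findings[f"{dep['name']}=={dep['version']}"] = ids
    return findings
```

Calling `vulnerable_packages(load_report())` returns an empty dictionary when no known vulnerabilities were reported.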

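pip-audit exits with a non-zero status whenever it finds a known vulnerability. In CI, add --strict so that the audit also fails if any dependency cannot be collected and checked: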

uv run pip-audit --strict
