Practical Python client for MinerU Precision and Agent parsing APIs
MinerU Python client
A practical wrapper around MinerU's asynchronous APIs, built for production-style usage.
What it handles for you:
- local file upload via MinerU signed URLs
- remote URL submission
- async polling until completion
- Agent Lightweight API and Precision API
- HTML routing to MinerU-HTML for precision mode
- optional markdown download for Agent tasks
- precision result zip download + auto-unzip
- easy access to full.md, full.html, layout.json, content/model JSON paths
- callback checksum generation and verification helpers
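The "async polling until completion" item can be pictured with a minimal sketch like the one below. It is not the package's implementation: `poll_until_finished`, the `fetch_status` callable, and the `state` key are hypothetical names chosen for illustration; only the terminal values `done` and `failed` and the `poll_interval` / `timeout` parameters come from this README.

```python
import time
from typing import Callable, Dict


def poll_until_finished(
    fetch_status: Callable[[], Dict[str, str]],
    poll_interval: float = 5.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, str]:
    """Call fetch_status() until its 'state' is terminal or the timeout elapses.

    Assumption: the status payload is a dict with a 'state' key. The real
    response shape of the MinerU API is not described in this README.
    """
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        # 'done' and 'failed' are the terminal states named in the API notes.
        if status.get("state") in ("done", "failed"):
            return status
        if clock() >= deadline:
            raise TimeoutError(f"task not finished after {timeout} seconds")
        sleep(poll_interval)
```

Passing `sleep` and `clock` as parameters is a design choice for the sketch: it lets a test drive the loop without real waiting.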
Installation
pip install mineru-python-client
Or from source:
git clone https://github.com/JimEverest/mineru-python-client.git
cd mineru-python-client
pip install -e .
Files
- mineru_client.py: main client implementation
- run_mineru_demo.py: CLI-style example runner
- tests/test_mineru_client.py: unit tests using a fake HTTP session
Quick start
from mineru_client import MinerUClient

client = MinerUClient(token='YOUR_TOKEN', poll_interval=5, timeout=600, request_timeout=60)
result = client.precision_parse_local_files(
    ['/path/to/document.pdf'],
    extra_formats=['html'],
)
print(result[0].full_zip_url)
Production bundle example
This is the easiest production-style path for local files because it:
- uploads
- waits for completion
- downloads the zip
- extracts it
- gives you direct file paths
from mineru_client import MinerUClient

client = MinerUClient(token='YOUR_TOKEN', poll_interval=5, timeout=600)
bundle = client.precision_parse_local_bundle(
    '/path/to/document.pdf',
    output_dir='./mineru_output',
    extra_formats=['html'],
)
print(bundle.zip_path)
print(bundle.extract_dir)
print(bundle.markdown_path)
print(bundle.html_path)
print(bundle.layout_path)
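To make the "extracts it" and "gives you direct file paths" steps concrete, here is a rough sketch of how a downloaded result zip could be unpacked and searched. `extract_bundle` is a hypothetical helper, not part of `mineru_client`; the file names `full.md`, `full.html`, and `layout.json` come from the feature list, while the archive layout (files possibly nested in subfolders) is an assumption.

```python
import zipfile
from pathlib import Path
from typing import Dict, Optional


def extract_bundle(zip_path: str, extract_dir: str) -> Dict[str, Optional[Path]]:
    """Extract a result zip and locate the well-known files inside it."""
    target = Path(extract_dir).resolve()
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.namelist():
            # Refuse entries that would land outside the output directory.
            destination = (target / member).resolve()
            if destination != target and target not in destination.parents:
                raise ValueError(f"unsafe path in archive: {member}")
        archive.extractall(target)

    def find(name: str) -> Optional[Path]:
        # First match anywhere under the extraction directory, or None.
        return next(iter(sorted(target.rglob(name))), None)

    return {
        "markdown_path": find("full.md"),
        "html_path": find("full.html"),
        "layout_path": find("layout.json"),
    }
```

A missing file is reported as `None` rather than raising, since `full.html` presumably exists only when HTML output was requested.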
Callback signature verification
from mineru_client import build_callback_checksum, verify_callback_signature

# uid, seed, and content are the signing inputs for the callback; define them before running this.
checksum = build_callback_checksum(uid, seed, content)
assert verify_callback_signature(uid, seed, content, checksum)
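For readers who want to see what such helpers might do internally, the sketch below assumes the checksum is the SHA-256 hex digest of `uid + seed + content` concatenated as UTF-8 text. That formula is an assumption and should be checked against the MinerU callback documentation; the `sketch_*` function names are hypothetical and are not the package's functions.

```python
import hashlib
import hmac


def sketch_build_checksum(uid: str, seed: str, content: str) -> str:
    # Assumption: checksum = SHA-256 hex digest of uid + seed + content.
    joined = (uid + seed + content).encode("utf-8")
    return hashlib.sha256(joined).hexdigest()


def sketch_verify_signature(uid: str, seed: str, content: str, checksum: str) -> bool:
    expected = sketch_build_checksum(uid, seed, content)
    received = checksum.strip().lower()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))
```

In a callback handler, `content` should be the exact string received, since re-serializing parsed JSON can change whitespace or key order and so change the digest.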
CLI examples
Precision local file:
MINERU_TOKEN=*** python3 run_mineru_demo.py \
  --mode precision-local \
  --input '/path/to/document.pdf' \
  --poll-interval 5 \
  --timeout 600 \
  --request-timeout 60 \
  --extra-format html
Precision local bundle download + unzip:
MINERU_TOKEN=*** python3 run_mineru_demo.py \
  --mode precision-local-bundle \
  --input '/path/to/document.pdf' \
  --bundle-output-dir './mineru_output' \
  --poll-interval 5 \
  --timeout 600 \
  --extra-format html
Agent local file:
python3 run_mineru_demo.py \
  --mode agent-local \
  --input '/path/to/small.pdf' \
  --download-markdown
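The flags used in the commands above could be parsed with something like the following `argparse` sketch. It is a hypothetical reconstruction and not the contents of `run_mineru_demo.py`: the flag names and the `MINERU_TOKEN` variable come from the examples, while the defaults, the repeatable `--extra-format`, and the rule that only `precision-*` modes need a token are assumptions.

```python
import argparse
from typing import Mapping, Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="MinerU demo runner (sketch)")
    parser.add_argument("--mode", required=True)
    parser.add_argument("--input", required=True)
    # Defaults below are illustrative assumptions, not the real runner's values.
    parser.add_argument("--poll-interval", type=float, default=5)
    parser.add_argument("--timeout", type=float, default=600)
    parser.add_argument("--request-timeout", type=float, default=60)
    parser.add_argument("--extra-format", action="append", default=[])
    parser.add_argument("--bundle-output-dir", default="./mineru_output")
    parser.add_argument("--download-markdown", action="store_true")
    return parser


def resolve_token(mode: str, env: Mapping[str, str]) -> Optional[str]:
    """Return the token from MINERU_TOKEN, insisting on one for precision modes."""
    token = env.get("MINERU_TOKEN")
    if mode.startswith("precision") and not token:
        raise SystemExit("MINERU_TOKEN is required for precision modes")
    return token
```

A real entry point would call `build_parser().parse_args()` and pass `os.environ` to `resolve_token`.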
API notes
- Precision API requires a token.
- Agent API does not require a token, but is limited to small single files and does not support HTML.
- MinerU parsing is asynchronous; this wrapper uploads/submits first, then polls until done or failed.
- Precision local uploads use /api/v4/file-urls/batch even for a single file.
- Agent local uploads use /api/v1/agent/parse/file and then PUT the file to the returned signed URL.
- The wrapper validates local files before creating remote tasks.
- The wrapper requires HTTPS for signed upload URLs and result URLs.
- Duplicate local basenames are automatically assigned unique data_id values.
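Two of the notes above, the HTTPS requirement and the unique data_id assignment, can be illustrated with a short sketch. Both functions are hypothetical and only show one plausible approach; in particular, the `-2`, `-3` suffix scheme is an assumption, since this README does not say how the wrapper actually derives data_id values.

```python
import os
from typing import List, Set, Tuple
from urllib.parse import urlparse


def require_https(url: str) -> str:
    """Return the URL unchanged, or raise if it is not an https:// URL."""
    parsed = urlparse(url)
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError(f"refusing non-HTTPS URL: {url!r}")
    return url


def assign_data_ids(paths: List[str]) -> List[Tuple[str, str]]:
    """Pair each path with a data_id that is unique within the batch."""
    used: Set[str] = set()
    assigned: List[Tuple[str, str]] = []
    for path in paths:
        # Start from the basename without its extension.
        stem = os.path.splitext(os.path.basename(path))[0]
        candidate, n = stem, 1
        # Append -2, -3, ... until the id is not already taken.
        while candidate in used:
            n += 1
            candidate = f"{stem}-{n}"
        used.add(candidate)
        assigned.append((path, candidate))
    return assigned
```

Checking against the set of ids already handed out, rather than counting basenames, also covers the case where a file is literally named like a generated id (for example `report-2.pdf` next to two `report.pdf` files).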
File details
Details for the file mineru_python_client-0.1.0.tar.gz.
File metadata
- Download URL: mineru_python_client-0.1.0.tar.gz
- Upload date:
- Size: 11.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 72ea8b737b04f899d7810ff2b9968bef65a9b69952549bb34ba96c9ca2b31dba |
| MD5 | 7fc2168bab558d141aa5ce41bccb0d46 |
| BLAKE2b-256 | a7611dfc0cbd75cd128fa6b4e6ad638efbd9809589703c06923f4b77507a69c2 |
File details
Details for the file mineru_python_client-0.1.0-py3-none-any.whl.
File metadata
- Download URL: mineru_python_client-0.1.0-py3-none-any.whl
- Upload date:
- Size: 9.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 164e46588480ec00dfada2190c5b97af26df63a8916e95e5a7c7e0d1ea44a982 |
| MD5 | c85f886341a6cd0ff24e5727d7e4e445 |
| BLAKE2b-256 | a4b5cd35423b8d549f55ca9fc68c2f1ed6e0c9f9ce7d0dedeb46f9b6c197bc9a |