Fast Inference Architecture for MinerU

Maintainers

Sunnyhaze

Flash-MinerU ⚡️📄

Accelerating the VLM inference pipeline of MinerU with Ray, turning PDF parsing into a scalable data infrastructure component

Flash-MinerU is a lightweight, low-intrusion acceleration layer for MinerU. Beyond speeding up VLM inference, it upgrades PDF parsing into a high-throughput, distributed data pipeline: a useful building block for modern AI systems.

PDFs are one of the most important high-quality knowledge sources for AI workflows, including papers, reports, and manuals. Converting them into structured, model-ready data such as Markdown and JSON is a foundational step for:

📊 Data governance and curation
🧪 Synthetic data generation pipelines
🧠 LLM / MLLM training and evaluation

Flash-MinerU focuses on making this stage scalable, efficient, and production-ready:

- Minimal dependencies, lightweight installation
  - One-line install via `pip install flash-mineru`
  - Works in constrained or domestic environments such as METAX
- System-level acceleration, not reimplementation
  - Fully reuses MinerU’s logic and data structures
  - Preserves output consistency
- Designed for scale
  - Multi-GPU / multi-process / multi-node ready
  - Built on Ray as a unified execution layer

✨ Features

🚀 Ray-powered distributed execution
Turns PDF parsing into a scalable data pipeline, from single-node multi-GPU setups to clusters
🧠 High-throughput VLM inference
Focuses on the bottleneck stage and currently defaults to vLLM
🔄 Pipeline-parallel execution (core improvement)
Uses an asynchronous pipeline with cross-stage overlap for sustained high utilization
🧩 Low-intrusion, composable design
Retains MinerU’s middle_json and downstream logic for easy integration

🎯 How pipeline parallelism helps

Flash-MinerU turns MinerU’s sequential pipeline into an asynchronous pipelined system:

🟢 Much higher GPU utilization
Keeps GPUs busy more than 90% of the time, while vanilla MinerU is often around 40-50% because stages block each other
🔄 Cross-stage overlap (key speedup)
Different batches run in different stages at the same time, such as render / VLM / Markdown, instead of waiting for full completion
⚡ Result: much higher throughput
Less idle time plus more overlap leads to significantly faster end-to-end processing

[Figure, left: bubble schedule (before). Timeline of batched sequential execution with visible GPU idle gaps.]

[Figure, right: pipelined schedule (Flash-MinerU). Timeline of asynchronous pipelined execution with high GPU utilization.]
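
The cross-stage overlap described above can be illustrated with a small, self-contained sketch. This is not Flash-MinerU's implementation (which is built on Ray and vLLM); it only mimics the idea with Python's `asyncio` and bounded queues. The stage durations, the `stage_worker` and `run_pipeline` helpers, and the use of `asyncio.sleep` as a stand-in for real work are all assumptions made for illustration.

```python
import asyncio
import time

# Hypothetical per-batch stage durations in seconds (illustrative only).
STAGES = [("render", 0.03), ("vlm", 0.05), ("markdown", 0.02)]


async def stage_worker(name, seconds, inbox, outbox, log):
    """Pull batches from inbox, 'process' them, and pass them downstream."""
    while True:
        batch = await inbox.get()
        if batch is None:             # sentinel: propagate shutdown and exit
            await outbox.put(None)
            return
        start = time.perf_counter()
        await asyncio.sleep(seconds)  # stand-in for real stage work
        log.append((name, batch, start, time.perf_counter()))
        await outbox.put(batch)


async def run_pipeline(num_batches, inflight=2):
    """Push batches through all stages; bounded queues cap in-flight work."""
    log = []
    queues = [asyncio.Queue(maxsize=inflight) for _ in range(len(STAGES) + 1)]
    workers = [
        asyncio.create_task(stage_worker(name, secs, queues[i], queues[i + 1], log))
        for i, (name, secs) in enumerate(STAGES)
    ]

    async def feed():
        for batch in range(num_batches):
            await queues[0].put(batch)  # blocks when the first queue is full
        await queues[0].put(None)

    feeder = asyncio.create_task(feed())
    done = []
    while (item := await queues[-1].get()) is not None:
        done.append(item)
    await asyncio.gather(feeder, *workers)
    return done, log


def sequential_estimate(num_batches):
    """Time if every batch finished all stages before the next one started."""
    return num_batches * sum(secs for _, secs in STAGES)


if __name__ == "__main__":
    t0 = time.perf_counter()
    asyncio.run(run_pipeline(num_batches=8))
    elapsed = time.perf_counter() - t0
    print(f"pipelined: {elapsed:.2f}s, sequential estimate: {sequential_estimate(8):.2f}s")
```

In this toy model the pipelined wall time approaches the duration of the slowest stage times the number of batches, while the sequential estimate is the sum of all stage durations times the number of batches. The bounded queues play a role loosely analogous to a pipeline-depth limit: they keep upstream stages from running arbitrarily far ahead.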

📦 Installation

Basic installation (lightweight mode)

Suitable if you have already installed the inference backend manually (e.g., vLLM), or are using an image with a prebuilt environment:

pip install flash-mineru

Install with vLLM backend enabled (optional)

If you want Flash-MinerU to install vLLM as the inference backend for you:

pip install "flash-mineru[vllm]"

🚀 Quickstart

Minimal Python API example

from flash_mineru import MineruEngine

# Paths to the PDFs to parse
pdfs = [
    "resnet.pdf",
    "yolo.pdf",
    "text2sql.pdf",
]

engine = MineruEngine(
    model="<path_to_local>/MinerU2.5-2509-1.2B",
    # Model can be downloaded from https://huggingface.co/opendatalab/MinerU2.5-2509-1.2B
    batch_size=16,             # PDFs per logical batch; often choose a multiple of GPU count
    replicas=8,                # Parallel vLLM / model instances; often match GPU count
    num_gpus_per_replica=0.9,  # GPU memory fraction for vLLM KV cache per instance; 1.0 uses full VRAM headroom
    save_dir="outputs_mineru", # Output directory for parsed results
    inflight=4,                # Pipeline depth (v1.0.0 path); can raise on high-memory hosts with diminishing returns
)

# Legacy v0.0.4 sequential batching (deprecated): from flash_mineru import MineruEngineLegacy

results = engine.run(pdfs)
print(results)  # list[list[str]]: directory names of the output files
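
The comments in the example suggest two rules of thumb: `replicas` often matches the GPU count, and `batch_size` is often a multiple of it. A minimal sketch of deriving those values is shown below. The helper `suggest_engine_kwargs` and its `pdfs_per_gpu` parameter are hypothetical and not part of Flash-MinerU; only the keyword names come from the example above.

```python
def suggest_engine_kwargs(num_gpus, pdfs_per_gpu=2, gpu_memory_fraction=0.9, inflight=4):
    """Return starting-point keyword arguments for MineruEngine.

    Hypothetical helper: it only encodes the rules of thumb from the
    quickstart comments and does not inspect the actual hardware.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    if not 0 < gpu_memory_fraction <= 1.0:
        raise ValueError("gpu_memory_fraction must be in (0, 1]")
    return {
        "batch_size": num_gpus * pdfs_per_gpu,        # a multiple of the GPU count
        "replicas": num_gpus,                         # one model instance per GPU
        "num_gpus_per_replica": gpu_memory_fraction,  # as described in the quickstart
        "inflight": inflight,                         # pipeline depth
    }


if __name__ == "__main__":
    kwargs = suggest_engine_kwargs(num_gpus=8)
    print(kwargs)
    # These could then be passed on, for example:
    # engine = MineruEngine(model="<path_to_local>/MinerU2.5-2509-1.2B",
    #                       save_dir="outputs_mineru", **kwargs)
```

With eight GPUs and the defaults above, the helper reproduces the values used in the quickstart (`batch_size=16`, `replicas=8`). Treat the result as a starting point to tune, not a recommendation from the project.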

Output structure

Each PDF’s parsing results will be generated under:
```
<save_dir>/<pdf_name>/
```
The Markdown file is located by default at:
```
<save_dir>/<pdf_name>/vlm/<pdf_name>.md
```
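
Given this layout, the expected Markdown location for each input can be computed and checked after a run. The sketch below assumes that `<pdf_name>` is the input file name without its `.pdf` extension; that is an assumption, not something the layout above states. The helper names are hypothetical.

```python
from pathlib import Path


def expected_markdown_path(save_dir, pdf_path):
    """Build <save_dir>/<pdf_name>/vlm/<pdf_name>.md for one input PDF.

    Assumption: <pdf_name> is the file name without the .pdf extension.
    """
    pdf_name = Path(pdf_path).stem
    return Path(save_dir) / pdf_name / "vlm" / f"{pdf_name}.md"


def find_missing_outputs(save_dir, pdf_paths):
    """Return the inputs whose expected Markdown file does not exist."""
    return [p for p in pdf_paths if not expected_markdown_path(save_dir, p).is_file()]


if __name__ == "__main__":
    pdfs = ["resnet.pdf", "yolo.pdf", "text2sql.pdf"]
    for pdf in pdfs:
        print(expected_markdown_path("outputs_mineru", pdf))
    print("missing:", find_missing_outputs("outputs_mineru", pdfs))
```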

📊 Benchmark

Scripts: English · Simplified Chinese

Results (368 PDFs, single-node ~8× A100 class)

| Method | Inference configuration | Total time |
| --- | --- | --- |
| Flash-MinerU v1.0.0 | `MineruEngine`, 8 replicas, `inflight=8`, pipeline parallelism | ~8.5 min |
| MinerU (vanilla) | Hand-spawned pool of 8 `mineru` processes (Benchmark-mineru.py parallel mode, one GPU per process, `vlm-auto-engine`) | ~14 min |
| Flash-MinerU v0.0.4 | `MineruEngineLegacy`, 8 replicas × 1 GPU, `batch_size=16`, batch-sequential | ~23 min |
| MinerU (vanilla) | vLLM, single GPU | ~65 min |

Commands: docs/BENCHMARK.md.

Summary

v1.0.0 is roughly 1.7× faster in wall-clock time than the eight-process baseline (~8.5 min vs ~14 min).
v0.0.4 (MineruEngineLegacy) is slower than that baseline (~23 min), which shows what pipeline parallelism adds over simply running many full stacks in parallel.
The ~65 min single-GPU run is the reference baseline on the same corpus.
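
The ratios behind this summary can be recomputed from the table. The sketch below uses the rounded times as reported, so the results are approximate: the ratio to the eight-process baseline comes out at about 1.65, in line with the stated ~1.7× given the rounding of the inputs. The dictionary keys are labels chosen for illustration.

```python
# Rounded wall-clock times in minutes, taken from the results table.
TIMES_MIN = {
    "flash_v1.0.0": 8.5,
    "vanilla_8_processes": 14.0,
    "flash_v0.0.4": 23.0,
    "vanilla_single_gpu": 65.0,
}


def speedup(baseline_min, candidate_min):
    """How many times faster the candidate is than the baseline."""
    return baseline_min / candidate_min


if __name__ == "__main__":
    fastest = TIMES_MIN["flash_v1.0.0"]
    for name, minutes in TIMES_MIN.items():
        print(f"v1.0.0 vs {name}: {speedup(minutes, fastest):.2f}x")
```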

Experimental setup

Dataset: 23 paper PDFs (≈9–37 pages each) × 16 copies → 368 files; default folder test/sample_pdfs
Versions: MinerU v2.7.5; Flash-MinerU v0.0.4 = MineruEngineLegacy (sequential stages per batch); v1.0.0 = MineruEngine (pipeline parallelism, default API)
Hardware: single host, 8 × NVIDIA A100
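
A corpus of this shape (every source PDF duplicated a fixed number of times) could be assembled along the following lines. This is a sketch, not the project's benchmark script: the `replicate_pdfs` helper, the destination folder, and the `_copyNN` naming scheme are assumptions.

```python
import shutil
from pathlib import Path


def replicate_pdfs(src_dir, dst_dir, copies=16):
    """Copy every PDF in src_dir into dst_dir `copies` times.

    The `<stem>_copyNN.pdf` naming scheme is hypothetical.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    created = []
    for pdf in sorted(src.glob("*.pdf")):
        for i in range(copies):
            target = dst / f"{pdf.stem}_copy{i:02d}.pdf"
            shutil.copyfile(pdf, target)
            created.append(target)
    return created


if __name__ == "__main__":
    # With 23 source PDFs and 16 copies each, this yields 23 * 16 = 368 files.
    files = replicate_pdfs("test/sample_pdfs", "benchmark_corpus", copies=16)
    print(len(files), "files created")
```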

Note: This benchmark measures throughput only. The output format matches MinerU's. Upstream MinerU does not ship a polished, official "one click" multi-GPU path, so the eight-process row comes from our benchmark script, which shards the corpus across eight separate mineru runs.

🗺️ Roadmap

Benchmark scripts & docs — docs/BENCHMARK.md
Support for more inference backends (e.g., sglang)
Service-oriented deployment (HTTP API / task queue)
Sample datasets and more comprehensive documentation

🤝 Acknowledgements

- MinerU: This project builds on MinerU’s overall algorithm design and engineering practices, and parallelizes its VLM inference pipeline. The mineru_core/ directory contains code copied and adapted from the MinerU project. We extend our sincere respect and gratitude to the original authors and all contributors of MinerU. 🔗 Official repository / homepage: https://github.com/opendatalab/MinerU
- Ray: Provides powerful abstractions for distributed and parallel computing, making multi-GPU and multi-process orchestration simpler and more reliable. 🔗 Official website: https://www.ray.io/ 🔗 Official GitHub: https://github.com/ray-project/ray
- vLLM: Provides a high-throughput, production-ready inference engine (currently the default backend). 🔗 Official website: https://vllm.ai/ 🔗 Official GitHub: https://github.com/vllm-project/vllm

📜 License

AGPL-3.0

Notes: The mineru_core/ directory in this project contains derivative code based on MinerU (AGPL-3.0). In accordance with the AGPL-3.0 license requirements, this repository as a whole is released under AGPL-3.0 as a derivative work. For details, please refer to the root LICENSE file and mineru_core/README.md.

Release history

- 1.0.1 (this version): Apr 14, 2026
- 1.0.0: Apr 3, 2026
- 0.0.4: Mar 10, 2026
- 0.0.2: Feb 12, 2026
- 0.0.1: Feb 4, 2026

Download files

Source Distribution

flash_mineru-1.0.1.tar.gz (333.2 kB)

Uploaded Apr 14, 2026 Source

Built Distribution

flash_mineru-1.0.1-py3-none-any.whl (395.7 kB)

Uploaded Apr 14, 2026 Python 3

File details

Details for the file flash_mineru-1.0.1.tar.gz.

File metadata

Download URL: flash_mineru-1.0.1.tar.gz
Upload date: Apr 14, 2026
Size: 333.2 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for flash_mineru-1.0.1.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `721c39c793ac771310face4369bd81cb3d05d8d4b61434fee9c32fd335c01840` |
| MD5 | `d578c765cd9ef2b8a10fe6d300e66b38` |
| BLAKE2b-256 | `447af332f5b3b6f0f44dfd9310aa64c7eeed4b20f1e4c9d8b91af405b8e1fda0` |
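
A downloaded file can be checked against the SHA256 digest listed above with Python's standard `hashlib` module. The sketch below is one way to do it; the helper names and the assumption that the file sits in the current directory are illustrative.

```python
import hashlib
from pathlib import Path

# SHA256 digest of flash_mineru-1.0.1.tar.gz as listed above.
EXPECTED_SDIST_SHA256 = "721c39c793ac771310face4369bd81cb3d05d8d4b61434fee9c32fd335c01840"


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large downloads are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_hex):
    """Return True if the file's SHA256 matches the expected hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Assumes the sdist was downloaded into the current directory.
    sdist = Path("flash_mineru-1.0.1.tar.gz")
    if sdist.is_file():
        print("match" if verify_download(sdist, EXPECTED_SDIST_SHA256) else "MISMATCH")
    else:
        print(f"{sdist} not found")
```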

Provenance

The following attestation bundles were made for flash_mineru-1.0.1.tar.gz:

Publisher: python-publish.yml on OpenDCAI/Flash-MinerU

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: flash_mineru-1.0.1.tar.gz
- Subject digest: 721c39c793ac771310face4369bd81cb3d05d8d4b61434fee9c32fd335c01840
- Sigstore transparency entry: 1294375087
- Sigstore integration time: Apr 14, 2026
Source repository:
- Permalink: OpenDCAI/Flash-MinerU@57f5015ce7834d64b93c2446b2dc3bd594a54482
- Branch / Tag: refs/tags/v1.0.1
- Owner: https://github.com/OpenDCAI
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@57f5015ce7834d64b93c2446b2dc3bd594a54482
- Trigger Event: release

File details

Details for the file flash_mineru-1.0.1-py3-none-any.whl.

File metadata

Download URL: flash_mineru-1.0.1-py3-none-any.whl
Upload date: Apr 14, 2026
Size: 395.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for flash_mineru-1.0.1-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `f2f7fd4e3a7f7502dfb02cf31ba5445cc0c453b617f3c7faa20c6fa6e187dd33` |
| MD5 | `6bc24e01bc92b821995b5578259864d4` |
| BLAKE2b-256 | `141d8848736d8b8326bd8bfdd68e0956faaecd6dce390f0e907f70908d3f8738` |

Provenance

The following attestation bundles were made for flash_mineru-1.0.1-py3-none-any.whl:

Publisher: python-publish.yml on OpenDCAI/Flash-MinerU

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: flash_mineru-1.0.1-py3-none-any.whl
- Subject digest: f2f7fd4e3a7f7502dfb02cf31ba5445cc0c453b617f3c7faa20c6fa6e187dd33
- Sigstore transparency entry: 1294375166
- Sigstore integration time: Apr 14, 2026
Source repository:
- Permalink: OpenDCAI/Flash-MinerU@57f5015ce7834d64b93c2446b2dc3bd594a54482
- Branch / Tag: refs/tags/v1.0.1
- Owner: https://github.com/OpenDCAI
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@57f5015ce7834d64b93c2446b2dc3bd594a54482
- Trigger Event: release
