docling-ocr-onnxtr

Onnx Text Recognition (OnnxTR) OCR plugin for docling

The docling-OCR-OnnxTR repository provides a plugin that integrates the OnnxTR OCR engine into the Docling framework, enhancing document processing capabilities with efficient and accurate text recognition.

Key Features:

Seamless Integration: Easily incorporate OnnxTR's OCR functionalities into your Docling workflows for improved document parsing and analysis.
Optimized Performance: Leverages OnnxTR's lightweight architecture to deliver faster inference times and reduced resource consumption compared to traditional OCR engines.
Flexible Deployment: Supports various hardware configurations, including CPU, GPU, and OpenVINO, allowing you to choose the best setup for your needs.

Installation:

For GPU support, please consult the ONNX Runtime documentation.

Prerequisites: CUDA and cuDNN need to be installed beforehand; check the ONNX Runtime version table for compatible versions.

To install the plugin, use one of the following commands based on your hardware:

# For CPU
pip install "docling-ocr-onnxtr[cpu]"
# For Nvidia GPU
pip install "docling-ocr-onnxtr[gpu]"
# For Intel GPU / Integrated Graphics
pip install "docling-ocr-onnxtr[openvino]"

# Headless mode (no GUI)
# For CPU
pip install "docling-ocr-onnxtr[cpu-headless]"
# For Nvidia GPU
pip install "docling-ocr-onnxtr[gpu-headless]"
# For Intel GPU / Integrated Graphics
pip install "docling-ocr-onnxtr[openvino-headless]"
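
The extras above follow a simple pattern: a hardware name (`cpu`, `gpu` or `openvino`) with an optional `-headless` suffix. As a minimal sketch, the helper below builds the matching command string for use in a setup script. `pip_install_command` is a hypothetical name used for illustration and is not part of the plugin.

```python
VALID_HARDWARE = ("cpu", "gpu", "openvino")


def pip_install_command(hardware: str = "cpu", headless: bool = False) -> str:
    """Return the pip command for the requested extra (hypothetical helper)."""
    if hardware not in VALID_HARDWARE:
        raise ValueError(f"hardware must be one of {VALID_HARDWARE}, got {hardware!r}")
    # Headless variants append "-headless" to the hardware extra.
    extra = f"{hardware}-headless" if headless else hardware
    return f'pip install "docling-ocr-onnxtr[{extra}]"'


print(pip_install_command("gpu", headless=True))
# prints: pip install "docling-ocr-onnxtr[gpu-headless]"
```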

By integrating OnnxTR with Docling, users can achieve more efficient and accurate OCR results, enhancing the overall document processing experience.

Usage

from docling.datamodel.pipeline_options import PdfPipelineOptions
from docling.document_converter import (
    ConversionResult,
    DocumentConverter,
    InputFormat,
    PdfFormatOption,
)
from docling_ocr_onnxtr import OnnxtrOcrOptions


def main():
    # Source document to convert
    source = "https://arxiv.org/pdf/2408.09869v4"

    # Available detection & recognition models can be found at
    # https://github.com/felixdittrich92/OnnxTR

    # Alternatively, choose a model from the Hugging Face Hub
    # Collection: https://huggingface.co/collections/Felix92/onnxtr-66bf213a9f88f7346c90e842

    ocr_options = OnnxtrOcrOptions(
        # Text detection model
        det_arch="db_mobilenet_v3_large",
        # Text recognition model - from Hugging Face Hub
        reco_arch="Felix92/onnxtr-parseq-multilingual-v1",
        # This can be set to `True` to auto-correct the orientation of the pages
        auto_correct_orientation=False,
    )

    pipeline_options = PdfPipelineOptions(
        ocr_options=ocr_options,
    )
    pipeline_options.allow_external_plugins = True  # <-- enable external plugins

    # Convert the document
    converter = DocumentConverter(
        format_options={
            InputFormat.PDF: PdfFormatOption(
                pipeline_options=pipeline_options,
            ),
        },
    )

    conversion_result: ConversionResult = converter.convert(source=source)
    doc = conversion_result.document
    md = doc.export_to_markdown()
    print(md)


if __name__ == "__main__":
    main()

It is also possible to load the models from local files instead of using the Hugging Face Hub or downloading them from the repo:

from docling.datamodel.pipeline_options import PdfPipelineOptions
from docling.document_converter import (
    ConversionResult,
    DocumentConverter,
    InputFormat,
    PdfFormatOption,
)
from docling_ocr_onnxtr import OnnxtrOcrOptions
from onnxtr.models import db_mobilenet_v3_large, parseq


def main():
    # Source document to convert
    source = "https://arxiv.org/pdf/2408.09869v4"

    # Load models from local files
    # NOTE: You need to download the models first and then adjust the paths accordingly.
    det_model = db_mobilenet_v3_large("/home/felix/.cache/onnxtr/models/db_mobilenet_v3_large-1866973f.onnx")
    reco_model = parseq("/home/felix/.cache/onnxtr/models/parseq-00b40714.onnx")

    ocr_options = OnnxtrOcrOptions(
        # Text detection model
        det_arch=det_model,
        # Text recognition model
        reco_arch=reco_model,
        # This can be set to `True` to auto-correct the orientation of the pages
        auto_correct_orientation=False,
    )

    pipeline_options = PdfPipelineOptions(
        ocr_options=ocr_options,
    )
    pipeline_options.allow_external_plugins = True  # <-- enable external plugins

    # Convert the document
    converter = DocumentConverter(
        format_options={
            InputFormat.PDF: PdfFormatOption(
                pipeline_options=pipeline_options,
            ),
        },
    )

    conversion_result: ConversionResult = converter.convert(source=source)
    doc = conversion_result.document
    md = doc.export_to_markdown()
    print(md)


if __name__ == "__main__":
    main()
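
Because the local-file variant fails if a model has not been downloaded yet, it can help to check the paths before building the options. The snippet below is a minimal sketch using only the standard library. `missing_model_files` is a hypothetical helper, and the paths are examples that must be adjusted to your own cache location.

```python
from pathlib import Path


def missing_model_files(*paths: str) -> list[str]:
    """Return the given paths that do not point to an existing file."""
    return [p for p in paths if not Path(p).expanduser().is_file()]


# Example paths only; adjust them to where you stored the downloaded models.
missing = missing_model_files(
    "~/.cache/onnxtr/models/db_mobilenet_v3_large-1866973f.onnx",
    "~/.cache/onnxtr/models/parseq-00b40714.onnx",
)
if missing:
    print("Download these models first:", ", ".join(missing))
```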

Configuration

The configuration of the OCR engine is done via the OnnxtrOcrOptions class. The following options are available:

lang: List of languages to use for OCR. Default is ["en", "fr"].
confidence_score: Word confidence threshold for the recognition model. Default is 0.5.
objectness_score: Detection model objectness score threshold. Default is 0.3.
det_arch: Detection model architecture. Default is "fast_base".
reco_arch: Recognition model architecture. Default is "crnn_vgg16_bn".
reco_bs: Batch size for the recognition model. Default is 512.
auto_correct_orientation: Whether to auto-correct the orientation of the pages. Default is False.
preserve_aspect_ratio: Whether to preserve the aspect ratio of the images. Default is True.
symmetric_pad: Whether to use symmetric padding. Default is True.
paragraph_break: Paragraph break threshold. Default is 0.035.
load_in_8_bit: Whether to load the model in 8-bit. Default is False. (Not yet supported for models loaded from Hugging Face.)
providers: List of execution providers to use for ONNX Runtime. Default is None, which means auto-select.
session_options: Session options for ONNX Runtime. Default is None, which means the default OnnxTR session options are used.
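
If you prefer an explicit `providers` list over auto-selection, one option is to filter the providers that ONNX Runtime reports as available. The sketch below is illustrative only: `order_providers` is a hypothetical helper, and the preference order (NVIDIA GPU, then OpenVINO, then CPU) is an assumption, not something the plugin prescribes.

```python
# Assumed preference order for illustration: CUDA, then OpenVINO, then CPU.
PREFERRED_ORDER = (
    "CUDAExecutionProvider",
    "OpenVINOExecutionProvider",
    "CPUExecutionProvider",
)


def order_providers(available: list[str]) -> list[str]:
    """Keep only the available providers, sorted by the preference order above."""
    return [name for name in PREFERRED_ORDER if name in available]


# In practice `available` would come from onnxruntime.get_available_providers().
available = ["CPUExecutionProvider", "CUDAExecutionProvider"]
print(order_providers(available))
# prints: ['CUDAExecutionProvider', 'CPUExecutionProvider']
```

The resulting list can then be passed as the `providers` option of `OnnxtrOcrOptions`. Leaving `providers` at None keeps the auto-selection described above.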

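Available Hugging Face models are listed in the OnnxTR collection: https://huggingface.co/collections/Felix92/onnxtr-66bf213a9f88f7346c90e842

For further information, see the OnnxTR repository: https://github.com/felixdittrich92/OnnxTR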

Contributing

Contributions are welcome!

Before opening a pull request, please ensure that your code passes the tests and adheres to the project's coding standards.

You can run the tests and checks using:

make style
make quality
make test

License

Distributed under the Apache 2.0 License. See LICENSE for more information.

Release history

0.2.1: Feb 5, 2026
0.2.0: Jun 30, 2025 (this version)
0.1.3: Jun 19, 2025
0.1.2: Apr 9, 2025
0.1.1: Apr 9, 2025
0.1.0: Apr 9, 2025

Download files

Source Distribution

docling_ocr_onnxtr-0.2.0.tar.gz (22.7 kB)

Uploaded Jun 30, 2025 Source

Built Distribution

docling_ocr_onnxtr-0.2.0-py3-none-any.whl (17.7 kB)

Uploaded Jun 30, 2025 Python 3

File details

Details for the file docling_ocr_onnxtr-0.2.0.tar.gz.

File metadata

Download URL: docling_ocr_onnxtr-0.2.0.tar.gz
Upload date: Jun 30, 2025
Size: 22.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.10.18

File hashes

Hashes for docling_ocr_onnxtr-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`37cfa3a02e56dd040d9e03572de867a890a76b4683b917cde9d682c1e4aa43cc`
MD5	`7b1ce26c6543f449b9e6139ed23b54f4`
BLAKE2b-256	`828d260f4a07f1026f1b40cfc1a165a2fa044633e4a5e906539c6d0d5cd3d20c`
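
To check a downloaded file against the published digest, the SHA256 value can be recomputed locally. This is a minimal sketch using Python's standard `hashlib` module. The function names are hypothetical, and the file path is whatever location you downloaded the archive to.

```python
import hashlib
from pathlib import Path


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, expected_hex: str) -> bool:
    """Compare the file's digest with the published one (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Calling `matches_published_hash` with the path of the downloaded archive and the SHA256 value from the table returns True only if the file is byte-for-byte identical to the published one.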

File details

Details for the file docling_ocr_onnxtr-0.2.0-py3-none-any.whl.

File metadata

Download URL: docling_ocr_onnxtr-0.2.0-py3-none-any.whl
Upload date: Jun 30, 2025
Size: 17.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.10.18

File hashes

Hashes for docling_ocr_onnxtr-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`5e2cad585d8e0c584fce1d5bddd36e1d7c53994ae127da2cb1568f39fcbff84f`
MD5	`b1e9eb015021b37ddfd21a93c5f862fa`
BLAKE2b-256	`7dbdd9a1ded10bb5563684960458639bb0d6439b7aea0d0e81c57ace974eabab`
