
This package contains the AI models used by the Docling PDF conversion package


Docling IBM models

AI modules to support the Docling PDF document conversion project.

  • TableFormer is an AI module that recognizes the structure of a table and the bounding boxes of the table content.
  • Layout model is an AI model that, among other things, detects tables on the page. This package contains the inference code for the Layout model.

Install

The package provides two install variants that let you switch seamlessly between opencv-python and opencv-python-headless.

# Option 1: with opencv-python-headless
pip install "docling-ibm-models[opencv-python-headless]"

# Option 2: with opencv-python
pip install "docling-ibm-models[opencv-python]"
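
After installing, it can be useful to confirm which OpenCV distribution ended up in the environment. The following is a minimal sketch using only the standard library; the helper name `installed_opencv_variants` is made up for illustration and is not part of this package.

```python
from importlib import metadata

# Distribution names of the two OpenCV variants named in the install options.
OPENCV_VARIANTS = ("opencv-python", "opencv-python-headless")


def installed_opencv_variants(candidates=OPENCV_VARIANTS):
    """Return the subset of `candidates` that is installed in this environment.

    Hypothetical helper for illustration; not provided by docling-ibm-models.
    """
    found = []
    for name in candidates:
        try:
            metadata.version(name)  # raises if the distribution is absent
        except metadata.PackageNotFoundError:
            continue
        found.append(name)
    return found


if __name__ == "__main__":
    variants = installed_opencv_variants()
    print(variants or "no OpenCV distribution found")
```

If both names are reported, the two distributions are installed side by side, which is usually worth cleaning up before debugging import problems.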

Pipeline Overview

Architecture

Datasets

Below we list the datasets used, with their description, source, and "TableFormer Format". The TableFormer Format is our processed version of the original format: it works with the dataloader out of the box and, where necessary, augments the dataset with missing ground truth (bounding boxes for empty cells).

Name | Description | URL
PubTabNet | PubTabNet contains heterogeneous tables in both image and HTML format, 516k+ tables in the PubMed Central Open Access Subset | PubTabNet
FinTabNet | A dataset for Financial Report Tables with corresponding ground truth location and structure. 112k+ tables included. | FinTabNet
TableBank | TableBank is a new image-based table detection and recognition dataset built with novel weak supervision from Word and Latex documents on the internet, contains 417K high-quality labeled tables. | TableBank

Models

TableModel04:

TableModel04rs (OTSL) is our state-of-the-art method. It uses transformers to predict the table structure and the bounding boxes.

Configuration file

An example configuration can be found in the test tests/test_tf_predictor.py. These are the main sections of the configuration file:

  • dataset: The directory for prepared data and the parameters used during the data loading.
  • model: The type, name and hyperparameters of the model. Also the directory to save/load the trained checkpoint files.
  • train: Parameters for the training of the model.
  • predict: Parameters for the evaluation of the model.
  • dataset_wordmap: The token maps. This section is essential and must be present.
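
The sections listed above can be sketched as a top-level dictionary with a small completeness check. Only the five section names come from this README; the empty section bodies and the helper `missing_sections` are assumptions for illustration. See tests/test_tf_predictor.py for a real configuration.

```python
# Top-level sections named in this README.
REQUIRED_SECTIONS = ("dataset", "model", "train", "predict", "dataset_wordmap")


def missing_sections(config):
    """Return the required top-level sections absent from `config`, in order.

    Hypothetical helper for illustration; not part of docling-ibm-models.
    """
    return [name for name in REQUIRED_SECTIONS if name not in config]


# Hypothetical skeleton: the section names are real, the contents are left empty
# because the actual keys are defined by the example in tests/test_tf_predictor.py.
example_config = {name: {} for name in REQUIRED_SECTIONS}

if __name__ == "__main__":
    print(missing_sections(example_config))
    print(missing_sections({"dataset": {}, "model": {}}))
```

A check like this catches a missing `dataset_wordmap` section before the model is loaded, rather than at prediction time.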

Model weights

You can download the model weights and config files from the links:

Inference Tests

You can run the inference tests for the models with:

python -m pytest tests/

This also generates prediction and matching visualizations, which can be found in tests/test_data/viz/

Visualization outlines:

  • Light Pink: border of recognized table
  • Grey: OCR cells
  • Green: prediction bboxes
  • Red: OCR cells matched with prediction
  • Blue: post-processed match
  • Bold Blue: column header
  • Bold Magenta: row header
  • Bold Brown: section row (if the table has one)

Demo

A demo application applies the LayoutPredictor to a directory <input_dir> that contains PNG images and writes visualizations of the predictions to another directory <viz_dir>.

First download the model weights (see above), then run:

python -m demo.demo_layout_predictor -i <input_dir> -v <viz_dir>

e.g.

python -m demo.demo_layout_predictor -i tests/test_data/samples -v viz/
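
To preview which images such a run would pick up, the input and output directories can be paired up with the standard library alone. This is a sketch under stated assumptions, not the demo's actual implementation: the helper `plan_visualizations` is hypothetical, and reusing the input file name for the output file is an assumption.

```python
from pathlib import Path


def plan_visualizations(input_dir, viz_dir):
    """Pair each PNG in `input_dir` with an output path inside `viz_dir`.

    Hypothetical helper: the demo may name its output files differently.
    No prediction is run here; the function only lists files.
    """
    input_dir, viz_dir = Path(input_dir), Path(viz_dir)
    return [(png, viz_dir / png.name) for png in sorted(input_dir.glob("*.png"))]


if __name__ == "__main__":
    for source, target in plan_visualizations("tests/test_data/samples", "viz"):
        print(f"{source} -> {target}")
```

Files with other extensions are ignored, which matches the demo's stated expectation of PNG input.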
