This package contains the AI models used by the Docling PDF conversion package
Docling IBM models
AI modules to support the Docling PDF document conversion project.
- TableFormer is an AI module that recognizes the structure of a table and the bounding boxes of the table content.
- Layout model is an AI model that, among other things, detects tables on a page. This package contains the inference code for the Layout model.
Install
The package provides two install variants that let you switch seamlessly between opencv-python and opencv-python-headless.
# Option 1: with opencv-python-headless
pip install "docling-ibm-models[opencv-python-headless]"
# Option 2: with opencv-python
pip install "docling-ibm-models[opencv-python]"
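To confirm which OpenCV variant ended up in an environment, a small check along the following lines may help. This is a minimal sketch using only the standard library; the helper name `installed_opencv_variants` is hypothetical and not part of this package.

```python
from importlib.metadata import PackageNotFoundError, version

# The two OpenCV distributions named by the install options above.
OPENCV_VARIANTS = ("opencv-python", "opencv-python-headless")


def installed_opencv_variants(candidates=OPENCV_VARIANTS):
    """Return {distribution name: version} for each candidate that is installed.

    Hypothetical helper for illustration; it only inspects package metadata
    and does not import cv2.
    """
    found = {}
    for name in candidates:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # This variant is not installed in the current environment.
            continue
    return found


if __name__ == "__main__":
    print(installed_opencv_variants())
```

If both variants show up, the environment likely has two OpenCV builds installed side by side, which is usually worth cleaning up before debugging import problems.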
Pipeline Overview
Datasets
Below we list the datasets used, with their description, source, and "TableFormer Format". The TableFormer Format is our processed version of the original format; it works with the dataloader out of the box and, where necessary, augments the dataset with missing ground truth (bounding boxes for empty cells).
| Name | Description | URL |
|---|---|---|
| PubTabNet | PubTabNet contains heterogeneous tables in both image and HTML format: 516k+ tables from the PubMed Central Open Access Subset. | PubTabNet |
| FinTabNet | A dataset for Financial Report Tables with corresponding ground truth location and structure. 112k+ tables included. | FinTabNet |
| TableBank | TableBank is a new image-based table detection and recognition dataset built with novel weak supervision from Word and LaTeX documents on the internet. It contains 417K high-quality labeled tables. | TableBank |
Models
TableModel04:
TableModel04rs (OTSL) is our SOTA method that uses transformers to predict table structure and bounding boxes.
Configuration file
An example configuration can be found in the test tests/test_tf_predictor.py.
These are the main sections of the configuration file:
- dataset: The directory for prepared data and the parameters used during the data loading.
- model: The type, name and hyperparameters of the model. Also the directory to save/load the trained checkpoint files.
- train: Parameters for the training of the model.
- predict: Parameters for the evaluation of the model.
- dataset_wordmap: Very important part that contains token maps.
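The section names above can be checked before a configuration is handed to a predictor. The following is a minimal sketch, assuming the configuration is available as a Python dictionary keyed by section name; the helper `missing_sections` and the empty placeholder values are hypothetical, and the actual keys inside each section should be taken from tests/test_tf_predictor.py.

```python
# Top-level sections listed in this README.
REQUIRED_SECTIONS = ("dataset", "model", "train", "predict", "dataset_wordmap")


def missing_sections(config):
    """Return the required top-level sections that are absent from config.

    Hypothetical helper for illustration; it checks section names only,
    not the contents of each section.
    """
    return [name for name in REQUIRED_SECTIONS if name not in config]


# Placeholder configuration: the inner dictionaries are left empty on purpose,
# because the real parameters are defined in the example configuration.
example_config = {
    "dataset": {},
    "model": {},
    "train": {},
    "predict": {},
    # "dataset_wordmap" is omitted here to show what the check reports.
}

if __name__ == "__main__":
    print(missing_sections(example_config))
```

Checking for `dataset_wordmap` up front is useful because the token maps are described as a very important part of the configuration.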
Model weights
You can download the model weights and config files from the links:
Inference Tests
You can run the inference tests for the models with:
python -m pytest tests/
This will also generate prediction and matching visualizations that can be found here:
tests\test_data\viz\
Visualization outlines:
- Light Pink: border of recognized table
- Grey: OCR cells
- Green: prediction bboxes
- Red: OCR cells matched with prediction
- Blue: Post processed, match
- Bold Blue: column header
- Bold Magenta: row header
- Bold Brown: section row (if the table has one)
Demo
A demo application lets you apply the LayoutPredictor to a directory <input_dir> that contains
PNG images and visualize the predictions in another directory <viz_dir>.
First download the model weights (see above), then run:
python -m demo.demo_layout_predictor -i <input_dir> -v <viz_dir>
e.g.
python -m demo.demo_layout_predictor -i tests/test_data/samples -v viz/
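When the demo is launched from a script, it can help to check the input directory for PNG files first. The sketch below wraps the command shown above; the helper names `list_png_images` and `build_demo_command` are hypothetical, and the sketch assumes it is run from the repository root with the model weights already downloaded.

```python
import subprocess
import sys
from pathlib import Path


def list_png_images(input_dir):
    """Return the sorted PNG files found directly inside input_dir."""
    return sorted(Path(input_dir).glob("*.png"))


def build_demo_command(input_dir, viz_dir):
    """Build the demo command from this README as an argument list."""
    return [
        sys.executable, "-m", "demo.demo_layout_predictor",
        "-i", str(input_dir),
        "-v", str(viz_dir),
    ]


if __name__ == "__main__":
    input_dir, viz_dir = "tests/test_data/samples", "viz/"
    images = list_png_images(input_dir)
    if not images:
        raise SystemExit(f"No PNG images found in {input_dir}")
    # Run the demo module with the same flags as the command above.
    subprocess.run(build_demo_command(input_dir, viz_dir), check=True)
```

Passing the command as a list avoids shell quoting issues when directory names contain spaces.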