Doc-UFCN

This Python 3 library contains a public implementation of Doc-UFCN, a fully convolutional network presented in the paper "Multiple Document Datasets Pre-training Improves Text Line Detection With Deep Neural Networks". The library was developed by the paper's original authors at Teklia.

The model is designed to run various Document Layout Analysis (DLA) tasks such as text line detection or page segmentation.

Model schema

This library can be used by anyone who already has a trained Doc-UFCN model and wants to apply it easily to document images. With only a few lines of code, the trained model is loaded and applied to an image, and the detected objects are returned along with some visualizations.

Getting started

To use Doc-UFCN in your own scripts, install it using pip:

pip install doc-ufcn

Usage

To apply Doc-UFCN to an image, one first needs to add a few imports (optionally setting the logging configuration so that logs appear on stdout) and load an image. Note that the image must be in RGB.

import cv2
import logging
import sys
from doc_ufcn.main import DocUFCN

logging.basicConfig(
    format="[%(levelname)s] %(message)s",
    stream=sys.stdout,
    level=logging.INFO
)

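# IMAGE_PATH is a placeholder: set it to the path of your own image file.
# OpenCV reads images in BGR order, so convert to RGB before prediction.
image = cv2.cvtColor(cv2.imread(IMAGE_PATH), cv2.COLOR_BGR2RGB)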

Then one can initialize and load the trained model with the parameters used during training. The number of classes must include the background class, which must have been the first channel during training. By default, the model is loaded in evaluation mode. To load it in training mode, use mode="train".

nb_of_classes = 2
mean = [0, 0, 0]
std = [1, 1, 1]
input_size = 768
model_path = "trained_model.pth"

model = DocUFCN(nb_of_classes, input_size, 'cpu')
model.load(model_path, mean, std, mode="eval")

To run inference on a GPU, one can replace 'cpu' with the name of the GPU. Finally, one can run the prediction:

detected_polygons = model.predict(image)
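
As a hedged sketch, the device name can be chosen automatically rather than hard-coded. This assumes the device argument accepts PyTorch-style names such as "cuda"; the pick_device helper is hypothetical and not part of doc-ufcn.

```python
def pick_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is no GPU support to query.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(device)
```

The resulting string can then be passed in place of 'cpu' when constructing the model.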

Output

When running inference on an image, the detected objects are returned as a dictionary keyed by class index, as in the following example. For each class except the background, the value is a list containing the confidence score and the polygon coordinates of each detected object.

{
  1: [
    {
      'confidence': 0.99,
      'polygon': [(490, 140), (490, 1596), (2866, 1598), (2870, 140)]
    },
    ...
  ],
  ...
}
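
A minimal sketch of post-processing this output, assuming it is a plain dictionary mapping class indices to lists of objects with the 'confidence' and 'polygon' keys shown above. The helpers filter_by_confidence and bounding_box, the 0.5 threshold, and the sample data are illustrative and not part of the library.

```python
def filter_by_confidence(detections, min_confidence=0.5):
    """Keep only the objects whose confidence is at least min_confidence."""
    return {
        class_index: [obj for obj in objects if obj["confidence"] >= min_confidence]
        for class_index, objects in detections.items()
    }


def bounding_box(polygon):
    """Return (x_min, y_min, x_max, y_max) for a list of (x, y) points."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


# Hand-written sample mimicking the structure shown above (not real model output).
sample = {
    1: [
        {"confidence": 0.99, "polygon": [(490, 140), (490, 1596), (2866, 1598), (2870, 140)]},
        {"confidence": 0.30, "polygon": [(10, 10), (10, 20), (30, 20), (30, 10)]},
    ]
}

kept = filter_by_confidence(sample, min_confidence=0.5)
boxes = [bounding_box(obj["polygon"]) for obj in kept[1]]
print(boxes)  # [(490, 140, 2870, 1598)]
```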

In addition, one can directly retrieve the raw probabilities output by the model using model.predict(image, raw_output=True). A tensor of size (nb_of_classes, height, width) is then returned along with the polygons and can be used for further processing.
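
To illustrate how such a probability tensor can be turned into a per-pixel class map, here is a small sketch that uses nested Python lists in place of a real tensor. The values and the class_map helper are made up for illustration; with an actual tensor or array, an argmax over the first (class) axis does the same job.

```python
def class_map(probabilities):
    """Return, for each pixel, the index of the most probable class.

    `probabilities` is indexed as [class][row][column].
    """
    nb_classes = len(probabilities)
    height = len(probabilities[0])
    width = len(probabilities[0][0])
    return [
        [
            max(range(nb_classes), key=lambda c: probabilities[c][y][x])
            for x in range(width)
        ]
        for y in range(height)
    ]


# Toy 2-class, 2x3 example (channel 0 is the background).
toy = [
    [[0.9, 0.2, 0.6], [0.1, 0.8, 0.4]],  # background probabilities
    [[0.1, 0.8, 0.4], [0.9, 0.2, 0.6]],  # class 1 probabilities
]
print(class_map(toy))  # [[0, 1, 0], [1, 0, 1]]
```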

Lastly, two visualizations can be returned by the model:

A mask of the detected objects, with mask_output=True;
An overlap of the detected objects on the input image, with overlap_output=True.

By default, only the detected polygons are returned. To return all four outputs, one can use:

detected_polygons, probabilities, mask, overlap = model.predict(
    image, raw_output=True, mask_output=True, overlap_output=True
)

Figures: mask of the detected objects; overlap of the detected objects on the input image.

Models

We provide an open-source model for the page detection task. To download and load the model, one can use:

from doc_ufcn import models
from doc_ufcn.main import DocUFCN

model_path, parameters = models.download_model('generic_page_detection')

model = DocUFCN(len(parameters['classes']), parameters['input_size'], 'cpu')
model.load(model_path, parameters['mean'], parameters['std'])

By default, the most recent version of the model is downloaded. To download a specific version instead, one can use the following line:

model_path, parameters = models.download_model('generic_page_detection', version="0.0.2")

Cite us!

If you want to cite us in one of your works, please use the following citation.

@inproceedings{boillet2020,
    author = {Boillet, Mélodie and Kermorvant, Christopher and Paquet, Thierry},
    title = {{Multiple Document Datasets Pre-training Improves Text Line Detection With
              Deep Neural Networks}},
    booktitle = {2020 25th International Conference on Pattern Recognition (ICPR)},
    year = {2021},
    month = Jan,
    pages = {2134-2141},
    doi = {10.1109/ICPR48806.2021.9412447}
}

License

This library is released under the 3-Clause BSD License.
