A table reconstruction package

Table Reconstruction

table-reconstruction is a tool that detects tables in images and reconstructs the information they contain using deep learning (DL) models.

To provide this feature, table-reconstruction relies on the following components:

  • A table detection model based on YOLOv5
  • A line segmentation model based on U-Net
  • Additional modules used during information extraction; in particular, a directed graph is used to extract information about merged cells.
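
To make the role of the directed graph more concrete, here is a minimal sketch of how merged cells could be grouped. It is not the package's actual implementation: the grid representation, the `missing_borders` input, and the function name `group_merged_cells` are all assumptions made for this illustration. Each grid cell is a node, an edge points from a cell to its right or lower neighbour when no separating line was detected between them, and every cell reachable from a starting cell joins the same merged cell. The sketch assumes merged regions are rectangular, so the top-left cell of each region is visited first.

```python
from collections import defaultdict


def group_merged_cells(n_rows, n_cols, missing_borders):
    """Group grid cells into merged cells (illustrative sketch only).

    missing_borders is a hypothetical input: a set of ((r1, c1), (r2, c2))
    pairs meaning that no line separates cell 1 from its right or lower
    neighbour, cell 2.
    """
    # Directed edges point right/down across every missing border.
    graph = defaultdict(list)
    for src, dst in missing_borders:
        graph[src].append(dst)

    visited = set()
    groups = []
    # Row-major order reaches the top-left cell of each region first.
    for r in range(n_rows):
        for c in range(n_cols):
            if (r, c) in visited:
                continue
            stack = [(r, c)]
            visited.add((r, c))
            group = []
            # Depth-first traversal along the directed edges.
            while stack:
                cell = stack.pop()
                group.append(cell)
                for nxt in graph[cell]:
                    if nxt not in visited:
                        visited.add(nxt)
                        stack.append(nxt)
            groups.append(sorted(group))
    return groups


# A 2x3 grid in which the first two cells of the top row are merged.
groups = group_merged_cells(2, 3, {((0, 0), (0, 1))})
```

Here `groups` contains one two-cell group for the merged cell and one single-cell group for each remaining cell.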

Before you start

Because of the requirements of the libraries it depends on, table-reconstruction requires Python 3.7 or higher.
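
A quick way to check this requirement before installing is sketched below. It assumes the version in question is the Python interpreter version; the helper name `python_is_supported` is made up for this example and is not part of the package.

```python
import sys

MIN_PYTHON = (3, 7)  # minimum version stated above


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the given interpreter version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


if not python_is_supported():
    raise SystemExit(
        "table-reconstruction needs Python %d.%d or higher" % MIN_PYTHON
    )
```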

Currently, this package works well on the most popular operating systems, including Windows, GNU/Linux and macOS. Its system requirements are mainly those of PyTorch version 1.9.1; see the PyTorch documentation for details.

Although not measured precisely, the library uses about 235.9 MiB of RAM when running on CPU (for the example provided here) and about 1000 MiB of VRAM when running on GPU. This resource usage is still fairly high and will be reduced gradually as the models are optimized in upcoming versions.

Finally, the library does not require much computing power, so most devices can run it on CPU without problems. For the example mentioned above, the measured processing time was 13.4 s (wall time).
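
If you want to take similar measurements on your own machine, the following sketch shows one possible approach using only the standard library. The `measure` helper is an assumption of this example, not something the package provides. Note that `tracemalloc` only tracks memory allocated through Python's allocator, so it will not capture native buffers allocated by PyTorch or GPU VRAM, and it slows execution somewhat; the numbers it reports are therefore not directly comparable to the figures quoted above.

```python
import time
import tracemalloc


def measure(func, *args, **kwargs):
    """Run func once and return (result, wall_seconds, peak_mib)."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = func(*args, **kwargs)
    finally:
        # Record timing and peak traced memory even if func raises.
        elapsed = time.perf_counter() - start
        _, peak_bytes = tracemalloc.get_traced_memory()
        tracemalloc.stop()
    return result, elapsed, peak_bytes / (1024 * 1024)


# Stand-in workload; with the package installed you could instead call
# measure(extraction.extract, image).
result, seconds, peak_mib = measure(sum, range(1000))
print(f"wall time: {seconds:.3f} s, peak traced memory: {peak_mib:.3f} MiB")
```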

Installation

Table Reconstruction is published on PyPI and can be installed from there:

pip install table-reconstruction

You can also install the package manually by running the following command from the root of the source tree:

python setup.py install

Basic usage

The following statements show the basic workflow:

import torch
from table_reconstruction import TableExtraction

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
extraction = TableExtraction(device=device)


image = ...  # accepts a NumPy ndarray or a PIL image
tables = extraction.extract(image)

We also provide a simple Jupyter notebook that illustrates the results obtained after processing; please check it out here.

Documentation

Documentation will be available soon.

Get in touch

  • Report bugs, suggest features or view the source code on GitHub.

