A table reconstruction package

Table Reconstruction

table-reconstruction is a tool that detects table regions in images and reconstructs the information they contain using deep learning models.

To do this, table-reconstruction combines several components:

  • A table detection model based on YOLOv5
  • A line segmentation model based on U-Net
  • Additional modules for the information extraction step; in particular, a directed graph is used to extract information about merged cells.
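To make the directed-graph idea more concrete, here is a minimal sketch. It is an illustration only and not the package's actual algorithm: the function names, the bounding-box format (x1, y1, x2, y2) and the tolerance parameter are all assumptions made for this example. Each cell gets an edge to every cell that touches its right border; a cell with more than one such neighbour is a candidate for a cell merged across several rows.

```python
from collections import defaultdict


def overlaps(a_start, a_end, b_start, b_end):
    """Return True if two 1-D intervals share more than a single point."""
    return min(a_end, b_end) - max(a_start, b_start) > 0


def build_right_graph(cells, tol=1):
    """Build a directed graph: cell -> cells touching its right border.

    `cells` maps a (hypothetical) cell name to a box (x1, y1, x2, y2).
    """
    graph = defaultdict(list)
    for name_a, (ax1, ay1, ax2, ay2) in cells.items():
        for name_b, (bx1, by1, bx2, by2) in cells.items():
            if name_a == name_b:
                continue
            # B starts where A ends horizontally, and they share vertical extent
            touches = abs(bx1 - ax2) <= tol
            if touches and overlaps(ay1, ay2, by1, by2):
                graph[name_a].append(name_b)
    return dict(graph)


# A two-row table whose left column is one merged cell "A"
cells = {
    "A": (0, 0, 10, 20),    # spans both rows
    "B": (10, 0, 20, 10),   # top-right cell
    "C": (10, 10, 20, 20),  # bottom-right cell
}
graph = build_right_graph(cells)

# Cells with several right-hand neighbours are merged-cell candidates
merged = [name for name, neighbours in graph.items() if len(neighbours) > 1]
print(graph)
print(merged)
```

In this toy layout only "A" has outgoing edges, and it has two of them, which is what marks it as spanning two rows.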

Before you start

Due to the requirements of the libraries it depends on, table-reconstruction requires Python 3.7 or higher.

Currently, this package works on most popular operating systems, including Windows, GNU/Linux and macOS. Its system requirements mainly follow those of PyTorch 1.9.1; see the PyTorch documentation for details.

Note that, although not measured precisely, the library uses about 235.9 MiB of RAM (for the example provided here) when running on the CPU and about 1000 MiB of VRAM when running on a GPU. This resource usage is still fairly high, and it will be reduced gradually in future versions by optimizing the models.

Finally, the library is not very demanding on the CPU, so most devices can run this package without problems. For the example mentioned above, the measured processing time was 13.4 s (wall time).
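If you want to verify the interpreter version before installing, a small check like the following can help. This is only a sketch; the helper name `meets_minimum` is made up for this example and is not part of the package.

```python
import sys


def meets_minimum(version_info, minimum=(3, 7)):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


if not meets_minimum(sys.version_info):
    print("table-reconstruction needs Python 3.7 or higher")
else:
    print("Python version is recent enough")
```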
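To reproduce a wall-time measurement like the one above on your own machine, something along these lines could be used. It is a minimal sketch: the `timed` helper is hypothetical, and the workload shown is a stand-in so that the snippet runs without the package installed.

```python
import time


def timed(fn, *args, **kwargs):
    """Run `fn` and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


# Stand-in workload; with the library set up as in "Basic usage",
# this would instead be: timed(extraction.extract, image)
result, seconds = timed(sum, range(1000))
print(f"wall time: {seconds:.3f} s")
```

Timings depend heavily on hardware and on whether a GPU is used, so expect values different from the figure quoted above.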

Installation

Table Reconstruction is published on PyPI and can be installed from there:

pip install table-reconstruction

You can also install the package manually from a copy of the source code by running the following command in the project root:

python setup.py install
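After either installation method, you may want to confirm that the package can be imported. The sketch below uses only the standard library; `is_importable` is a hypothetical helper name, while the module name `table_reconstruction` is the one used in the import statement in this README.

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


# Prints True once the installation has succeeded
print(is_importable("table_reconstruction"))
```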

Basic usage

The library can be used as follows:

import torch
from table_reconstruction import TableExtraction

# Use the first GPU if one is available, otherwise fall back to the CPU
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
extraction = TableExtraction(device=device)

image = ...  # accepts a NumPy ndarray or a PIL image
tables = extraction.extract(image)

We also provide a simple Jupyter notebook that illustrates the results obtained after processing. Please check it out here.

Documentation

Documentation will be available soon.

Get in touch

  • Report bugs, suggest features or view the source code on GitHub.
