Streamlined Scanned Image Table Extraction and Excel Conversion

Project description

TabulaScan

TabulaScan: Streamlined Scanned Image Table Extraction and Excel Conversion

Project Overview

TabulaScan automates table detection, recognition, and extraction from scanned images and writes the results to Excel files. It turns paper-based tables into structured, editable spreadsheets that can be integrated into your existing data management processes.

Key Features

Precise Table Identification: The algorithm locates tables within scanned images, including images with complex layouts and diverse fonts.

Robust Image Quality Handling: It handles varying levels of image quality, so it performs reliably across different scanned documents.

Data Extraction: Beyond detecting tables, the algorithm extracts the data they contain, so the output is ready for data analysis.

Output to Excel: Recognized tables are converted into Excel files that preserve the data structure and format.

Installation

To install TabulaScan, you can use pip:

pip install TabulaScan
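
To confirm that the install succeeded, you can query the installed distribution's metadata from Python. The following is a minimal sketch using only the standard library; the helper name `installed_version` is illustrative and is not part of TabulaScan.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the version string if TabulaScan is installed, otherwise None.
    print(installed_version("TabulaScan"))
```

If the release listed on this page is the one installed, the reported version should match the version in the download file names.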

Usage

Here’s a simple example demonstrating how to use TabulaScan to convert an image of a table into an Excel file and then read it using Pandas:

import TabulaScan as ts
import cv2
import pandas as pd

# Load your image (cv2.imread returns None if the file cannot be read)
img_path = 'path_to_your_image.jpg'
image = cv2.imread(img_path)
if image is None:
    raise FileNotFoundError(f"Could not read image: {img_path}")

# Convert the image table to an Excel table
result = ts.ImgTable2ExcelTable(image)

# Load the resulting Excel file into a Pandas DataFrame
excel_table = pd.read_excel(result)

# Display the extracted table (the Excel file is downloaded automatically)
print(excel_table)
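
When several scans need converting, the single-image example can be wrapped in a small loop. The sketch below is one possible way to do that, not part of TabulaScan: `find_scans`, `convert_all`, and the list of image suffixes are hypothetical names chosen for illustration. The conversion step is passed in as a function, so the helper makes no assumptions about TabulaScan beyond the call shown in the example.

```python
from pathlib import Path
from typing import Any, Callable, Dict, Iterable, List, Tuple, Union

# Assumed set of scan formats; adjust to whatever your scanner produces.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif", ".tiff", ".bmp"}


def find_scans(folder: Union[str, Path]) -> List[Path]:
    """Return the image files directly inside `folder`, sorted by path."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def convert_all(
    paths: Iterable[Path],
    convert: Callable[[Path], Any],
) -> Tuple[Dict[Path, Any], Dict[Path, Exception]]:
    """Apply `convert` to each path, keeping successes and failures apart."""
    results: Dict[Path, Any] = {}
    failures: Dict[Path, Exception] = {}
    for path in paths:
        try:
            results[path] = convert(path)
        except Exception as exc:  # one bad scan should not stop the batch
            failures[path] = exc
    return results, failures


# Possible use with the call from the example (requires TabulaScan and cv2):
#   convert = lambda p: ts.ImgTable2ExcelTable(cv2.imread(str(p)))
#   results, failures = convert_all(find_scans("scans"), convert)
```

Collecting failures separately makes it easy to report which scans could not be processed without losing the tables that were extracted.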

Contributing

Contributions are welcome! Please feel free to submit a Pull Request on GitHub.

Support

If you encounter any issues or have any questions, please feel free to open an issue on the GitHub repository.

Download files

Source Distribution

TabulaScan-0.2.0.tar.gz (16.6 kB)

Uploaded Source

Built Distribution

TabulaScan-0.2.0-py3-none-any.whl (16.7 kB)

Uploaded Python 3

File details

Details for the file TabulaScan-0.2.0.tar.gz.

File metadata

  • Download URL: TabulaScan-0.2.0.tar.gz
  • Size: 16.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.11.2

File hashes

Hashes for TabulaScan-0.2.0.tar.gz

Algorithm    Hash digest
SHA256       5f2de2b5979390618d8a6d844bdfe4fe118f267c94c356a718fd68b872971145
MD5          875888effed86b35ac5e5fa10e102262
BLAKE2b-256  e4459a48141942ac31457d82e95a86d54eae30193882de16549cbc6cebdff7cb

File details

Details for the file TabulaScan-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: TabulaScan-0.2.0-py3-none-any.whl
  • Size: 16.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.11.2

File hashes

Hashes for TabulaScan-0.2.0-py3-none-any.whl

Algorithm    Hash digest
SHA256       be4941813dd77e66de58268541af1235753f525be195c135c5b224fc0724d269
MD5          f05cade217374cde5307f8b1807fc162
BLAKE2b-256  6fec05261ee5cb647535a8c4eff1898877336f2ffe34c8340329680650d07c87
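
The SHA256 digests listed above can be compared against a file you have downloaded. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `sha256_of` and `matches_published_hash` are illustrative, and the local file path in the usage comment is an assumption about where the download was saved.

```python
import hashlib
from pathlib import Path
from typing import Union


def sha256_of(path: Union[str, Path], chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: Union[str, Path], expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()


# Possible use, assuming the wheel was saved to the current directory:
#   matches_published_hash(
#       "TabulaScan-0.2.0-py3-none-any.whl",
#       "be4941813dd77e66de58268541af1235753f525be195c135c5b224fc0724d269",
#   )
```

A result of False means the file differs from the one described here and should not be installed.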
