Extract tabular data from images and scanned PDFs

Overview

ExtractTable - API to extract tabular data from images and scanned PDFs

The motivation is to make it easy for developers to extract tabular data from images or scanned PDF files without worrying about the table area, column coordinates, rotation, and so on.

Prerequisite

Before we talk/boast about the service: a developer MUST have an API key to use the ExtractTable service. FREE credits here - check data privacy in FAQ.
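
One common way to keep the key out of source code is to read it from an environment variable. The sketch below assumes a variable named EXTRACTTABLE_API_KEY; that name and the load_api_key helper are hypothetical choices made for this example, not something the library defines.

```python
import os


def load_api_key(env_var="EXTRACTTABLE_API_KEY"):
    """Return the API key stored in the environment, or raise if it is missing.

    The variable name is an assumption made for this example; ExtractTable
    itself only needs the key passed as api_key=...
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} to your ExtractTable API key")
    return key
```

The returned value can then be passed along as ExtractTable(api_key=load_api_key()).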

Installation

pip install -U ExtractTable

Basic Usage

Ok, enough selling. Let the ease of coding do the talking, and let the output encourage you to buy credits. Put that timer on and count the LOC.

from ExtractTable import ExtractTable

et_sess = ExtractTable(api_key=YOUR_API_KEY)        # Replace with your VALID API key
print(et_sess.check_usage())        # Checks the API key validity and shows the associated plan usage
table_data = et_sess.process_file(filepath=Location_of_Image_with_Tables, output_format="df")

# To process a PDF, pass the pages param ("1", "1,3-4", "all") to process_file
table_data = et_sess.process_file(filepath=Location_of_PDF_with_Tables, output_format="df", pages="all")

Detailed code here
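
The pages values shown above ("1", "1,3-4", "all") combine single pages, ranges, and a catch-all. The helper below is only an illustration of how such a specification reads, assuming pages are 1-indexed and ranges are inclusive. expand_pages is a hypothetical name, and this is not the library's own parsing code.

```python
def expand_pages(spec, total_pages):
    """Expand a pages string such as "1", "1,3-4" or "all" into page numbers.

    Assumptions for this sketch: pages are 1-indexed, ranges are inclusive,
    and "all" means every page from 1 to total_pages.
    """
    spec = spec.strip().lower()
    if spec == "all":
        return list(range(1, total_pages + 1))

    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = (int(p) for p in part.split("-", 1))
        else:
            start = end = int(part)
        if start < 1 or end > total_pages or start > end:
            raise ValueError(f"page range {part!r} is outside 1..{total_pages}")
        pages.update(range(start, end + 1))
    return sorted(pages)


print(expand_pages("1,3-4", total_pages=5))  # [1, 3, 4]
```

Checking a specification locally like this can catch a typo in the page range before a request is sent.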

Whoa, as simple as that?!

Certainly. Did you know that current ExtractTable users use it on:

  • Bank Statement
  • Medical Records
  • Invoice Details
  • Tax forms

It's up to you now to explore the ways.

Explore

What else is in store:

  • ExtractTable._OUTPUT - check the list of available output formats
  • et_sess.ServerResponse.json() - check the latest actual server response attached to the session
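
A small guard like the one below can catch an unsupported output_format before a file is sent for processing. It is written against a plain collection of format names so that it runs on its own. The assumption is that ExtractTable._OUTPUT can be iterated as such a collection, and check_output_format is a hypothetical helper, not part of the library.

```python
def check_output_format(requested, available):
    """Return `requested` if it is listed in `available`, else raise ValueError."""
    names = [str(name) for name in available]
    if requested not in names:
        raise ValueError(f"{requested!r} is not one of {names}")
    return requested


# Stand-in values for this example only; in practice pass ExtractTable._OUTPUT
# (assuming it is an iterable of format names).
example_formats = ["df", "json"]
print(check_output_format("df", example_formats))  # df
```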

Pull Requests & Rewards

Pull requests are most welcome and are rewarded with API credits.

License

This project is licensed under the Apache License 2.0, see the LICENSE file for details.

Social Media

Follow us on social media for library updates and free credits.

Download files

Source Distribution

ExtractTable-1.2.1.2.tar.gz (8.2 kB)

Built Distribution

ExtractTable-1.2.1.2-py3-none-any.whl (14.4 kB, Python 3)
