PDF Table to JSON Converter

Project description

pdf-table-extract

Extract table data from PDF files to JSON

  • Locates each table with OpenCV and reads its contents with a text reader (the table must be enclosed by a border)

  • (If your table has no border, add one before converting)

  • Currently, only basic tables are supported, that is, tables with horizontal headers

    Header 1 Header 2 Header 3
    cel1 cel2 cel3
    cel1 cel2 cel3
    cel1 cel2 cel3
  • The PDF must be readable by a text reader. To check, drag across the PDF and see whether the text can be selected
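
To illustrate the supported layout, here is a minimal sketch of how a table with one horizontal header row could map to JSON records. The helper name `rows_to_records` and the output shape (a list of objects keyed by header text) are assumptions for illustration only; this page does not document the exact JSON that pdf-table2json produces.

```python
import json


def rows_to_records(rows):
    """Map a table with one horizontal header row to a list of dicts.

    Hypothetical helper for illustration; not part of pdf_table2json.
    """
    if not rows:
        return []
    header, *body = rows
    # Pair each cell with the header text in the same column.
    return [dict(zip(header, row)) for row in body]


table = [
    ["Header 1", "Header 2", "Header 3"],
    ["cel1", "cel2", "cel3"],
    ["cel1", "cel2", "cel3"],
]

print(json.dumps(rows_to_records(table), indent=2))
```

Each body row becomes one JSON object, which is why a horizontal header is needed: without it there are no keys to attach to the cells.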

Installation

  • Requires Python >= 3.8
  • Install with pip
pip install pdf-table2json

Example

import

import pdf_table2json

CLI

python a.py -i "pdf_path/pdf_name.pdf" -o "output_path/"
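
This page does not show what `a.py` contains. As a rough sketch only, a script accepting the same `-i` and `-o` flags could parse them as follows; the function name `parse_args` and the attribute names `input_pdf` and `output_dir` are hypothetical, and the actual conversion call is deliberately left out because the pdf_table2json API is not documented here.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Parse the -i / -o flags used in the CLI example (hypothetical script)."""
    parser = argparse.ArgumentParser(description="Convert tables in a PDF to JSON")
    parser.add_argument("-i", dest="input_pdf", type=Path, required=True,
                        help="path to the input PDF")
    parser.add_argument("-o", dest="output_dir", type=Path, required=True,
                        help="directory that receives the JSON output")
    return parser.parse_args(argv)


# Same arguments as the command line above, passed explicitly.
args = parse_args(["-i", "pdf_path/pdf_name.pdf", "-o", "output_path/"])
print(args.input_pdf.name)
```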

License

  • GPL-3.0 license

Download files

Source Distributions

No source distribution files are available for this release.

Built Distribution

pdf_table2json-0.0.6-py3-none-any.whl (24.0 kB)

Uploaded Python 3

File details

Details for the file pdf_table2json-0.0.6-py3-none-any.whl.

File metadata

  • Download URL: pdf_table2json-0.0.6-py3-none-any.whl
  • Upload date:
  • Size: 24.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.8.17

File hashes

Hashes for pdf_table2json-0.0.6-py3-none-any.whl
Algorithm Hash digest
SHA256 c6a4c0c8acef0a434a27f7311708069e3312ce1e846faceba42254670efe49c8
MD5 33d18cf5259b1a26079b853be54de8db
BLAKE2b-256 7409a99c1a6c04e28d9686e4b247d7498352aba6187355551e09890c68eb1062
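
A downloaded wheel can be checked against the SHA256 digest listed above. The sketch below uses only Python's standard `hashlib`; the helper names `sha256_of_file` and `matches_expected` are illustrative, and the file path in the usage note assumes the wheel was saved to the current directory.

```python
import hashlib

# SHA256 digest listed above for pdf_table2json-0.0.6-py3-none-any.whl
EXPECTED_SHA256 = "c6a4c0c8acef0a434a27f7311708069e3312ce1e846faceba42254670efe49c8"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of_file(path) == expected.lower()
```

For example, `matches_expected("pdf_table2json-0.0.6-py3-none-any.whl")` should return True for an unmodified download.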
