PDF Table to JSON Converter

Project description

pdf-table-extract

Extract table data from PDF files to JSON.

  • Locate the table with OpenCV and read its contents with a text reader. (The table must be enclosed by a border.)

  • (If your table has no border, add one by adjusting the document first.)

  • Currently, only basic tables are supported, that is, tables with horizontal headers:

    Header 1 Header 2 Header 3
    cel1 cel2 cel3
    cel1 cel2 cel3
    cel1 cel2 cel3
  • The PDF must be readable by a text reader, i.e. it must contain selectable text. To check, drag the cursor over the PDF and see whether the text gets selected.
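
To make the "horizontal headers" constraint concrete, here is a minimal sketch of how such a table could map to JSON records. It is not the library's code, and the actual output schema of pdf-table2json may differ; the helper name `rows_to_json` is made up for illustration.

```python
import json


def rows_to_json(rows):
    """Turn a header row plus data rows into a JSON array of objects.

    Hypothetical helper for illustration; not part of pdf-table2json.
    """
    if not rows:
        return "[]"
    header, *body = rows  # first row is the horizontal header
    records = []
    for row in body:
        if len(row) != len(header):
            raise ValueError(f"expected {len(header)} cells, got {len(row)}")
        # Pair each cell with the header of its column
        records.append(dict(zip(header, row)))
    return json.dumps(records, ensure_ascii=False, indent=2)


table = [
    ["Header 1", "Header 2", "Header 3"],
    ["cel1", "cel2", "cel3"],
    ["cel1", "cel2", "cel3"],
]
print(rows_to_json(table))
```

Each data row becomes one object keyed by the header cells, which is why a table without a single horizontal header row has no obvious mapping in this scheme.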

Installation

  • Requires Python >= 3.8
  • Install with pip:
pip install pdf-table2json
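
If you want to confirm the setup from Python, a small check along these lines may help. It is only a sketch using the standard library; the function name `check_environment` is hypothetical, and the import name `pdf_table2json` is taken from the example in this README.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 8)  # minimum version stated in this README


def check_environment(package="pdf_table2json"):
    """Return (python_ok, package_installed) without importing the package."""
    python_ok = sys.version_info >= MIN_PYTHON
    # find_spec returns None when a top-level module cannot be found
    package_installed = importlib.util.find_spec(package) is not None
    return python_ok, package_installed


python_ok, installed = check_environment()
print(f"Python >= 3.8: {python_ok}; pdf_table2json importable: {installed}")
```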

Example

Import

import pdf_table2json.converter as converter

path = "PATH/PDF_NAME.pdf"
result = converter.main(path)
print(result)
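
In a script you may want to validate the path and save the result to a file. The sketch below is one way to do that under stated assumptions: `convert_and_save` is a hypothetical wrapper, not part of the package, and it assumes that `converter.main` returns either a JSON string or a JSON-serialisable object, which this README does not specify. The converter is passed in as a callable so the wrapper itself has no dependency on the package.

```python
import json
from pathlib import Path


def convert_and_save(pdf_path, out_path, convert):
    """Validate pdf_path, call convert(pdf_path), and write the result to out_path.

    Hypothetical wrapper; `convert` is expected to be converter.main.
    """
    pdf = Path(pdf_path)
    if pdf.suffix.lower() != ".pdf":
        raise ValueError(f"not a PDF file: {pdf}")
    if not pdf.is_file():
        raise FileNotFoundError(pdf)

    result = convert(str(pdf))

    # Assumption: the result is a JSON string or a JSON-serialisable object.
    if isinstance(result, str):
        text = result
    else:
        text = json.dumps(result, ensure_ascii=False, indent=2)
    Path(out_path).write_text(text, encoding="utf-8")
    return result


# Usage with the package installed (assumed, not verified here):
# import pdf_table2json.converter as converter
# convert_and_save("PATH/PDF_NAME.pdf", "result.json", converter.main)
```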

CLI

python a.py -i "pdf_path/pdf_name.pdf" -o "output_path/" -j "" -p ""

License

  • GPL-3.0 license

Download files

Source Distribution

pdf_table2json-0.0.10.tar.gz (19.1 kB)

Uploaded Source

Built Distribution

pdf_table2json-0.0.10-py3-none-any.whl (25.6 kB)

Uploaded Python 3
