A package that enables extraction of text, images, and tables.

Project description

# LexStruct_PDF

This is an efficient Python library built to extract the contents (text, images, and tables) of a PDF.

It accepts one argument: the path to your PDF.

How to use:

from LexStruct_PDF import ContentExtractor

obj = ContentExtractor("path to your pdf")
extracted_text = obj.extract_content()
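
The call above can be wrapped in a small helper that checks the path before handing it to the library. This is only a sketch: `extract_pdf_content` and its `extractor_cls` parameter are hypothetical names introduced here, and the return type of `extract_content()` is not documented on this page, so the helper passes the result through unchanged.

```python
from pathlib import Path


def extract_pdf_content(pdf_path, extractor_cls=None):
    """Validate pdf_path, then run the extractor on it.

    extractor_cls is a hypothetical hook (not part of LexStruct_PDF) that
    lets a different class be supplied, for example in tests. By default
    the library's ContentExtractor is used.
    """
    path = Path(pdf_path)

    # Fail early with a clear message instead of relying on the library.
    if path.suffix.lower() != ".pdf":
        raise ValueError(f"expected a .pdf file, got: {path.name}")
    if not path.is_file():
        raise FileNotFoundError(f"no such file: {path}")

    if extractor_cls is None:
        # Imported lazily so the helper can be defined without the package.
        from LexStruct_PDF import ContentExtractor
        extractor_cls = ContentExtractor

    # The constructor takes a single argument: the path of the PDF.
    extractor = extractor_cls(str(path))
    # Returned as-is; its structure is an unknown here.
    return extractor.extract_content()
```

With the package installed, `extract_pdf_content("report.pdf")` would behave like the two-line example, where `report.pdf` is a placeholder file name.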

Download files

Source Distribution

lexstruct_pdf-0.0.1.tar.gz (4.2 kB)

Uploaded Source

Built Distribution

lexstruct_pdf-0.0.1-py3-none-any.whl (4.6 kB)

Uploaded Python 3

File details

Details for the file lexstruct_pdf-0.0.1.tar.gz.

File metadata

  • Download URL: lexstruct_pdf-0.0.1.tar.gz
  • Upload date:
  • Size: 4.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.4

File hashes

Hashes for lexstruct_pdf-0.0.1.tar.gz
Algorithm Hash digest
SHA256 f4f05ae9c1e6047ad04f75dafea69d9c538eac5a75d037a7699d35263f75e2d6
MD5 6d2e1ff4fa39941be388ebd688a3adae
BLAKE2b-256 27d87d2b9fae9c4416e457771486c65726710136fcaeaece68fea70694a7d6bc
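
A downloaded copy of the source distribution can be checked against the SHA256 digest listed above. The following is a minimal sketch using only the standard library; the helper names `sha256_of_file` and `matches_expected` are illustrative, and the local file path is whatever location the archive was saved to.

```python
import hashlib

# SHA256 digest listed above for lexstruct_pdf-0.0.1.tar.gz.
EXPECTED_SHA256 = "f4f05ae9c1e6047ad04f75dafea69d9c538eac5a75d037a7699d35263f75e2d6"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of_file(path) == expected.lower()
```

Calling `matches_expected("lexstruct_pdf-0.0.1.tar.gz")` on an intact download should return `True`; a `False` result suggests the file is corrupted or is not the listed release.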

File details

Details for the file lexstruct_pdf-0.0.1-py3-none-any.whl.

File hashes

Hashes for lexstruct_pdf-0.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 664a3c8fbfa32a50241dfb3c85cf6835d4ae93f9e6160eca1a443737066289f1
MD5 e80898b27fbe24f5dcc522d09a9bf75d
BLAKE2b-256 294e7a8159990afa357e4fdb6f67ee9a75080c1238617380f1209bdc125590e3
