
Any2Text - any format to text using Unstructured.io

Project description

any2text-parser

initialize

python3 -m venv ./venv
source venv/bin/activate
pip install -r requirements.txt

test

python3 test_pdf2text.py

usage

from pdf2text.pdf2text import extract_pdf_file_to_text

# Path to the PDF on disk.
file_path = "/Users/user/Downloads/AUDIT_MATERIALS/budget_materials/personal/2021/2021 03 remarks 2.pdf"

# Open the PDF in binary mode and pass the open file object to the extractor.
with open(file_path, "rb") as file:
    text_data, text = extract_pdf_file_to_text(
        filename="abc.pdf",
        file=file,
        meta_data_mapping={
            "document_category": "DEF",
        },
    )

    print(text_data, text)
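
To run the same call over several files, a small wrapper can help. The sketch below is not part of this package: `extract_many` is a hypothetical helper, and it takes the extractor as a parameter so that it only assumes the keyword arguments shown in the usage example (`filename`, `file`, `meta_data_mapping`). Whatever the extractor returns is stored unchanged.

```python
from pathlib import Path
from typing import Any, Callable, Dict, Iterable, Optional


def extract_many(
    paths: Iterable[str],
    extractor: Callable[..., Any],
    meta_data_mapping: Optional[Dict[str, str]] = None,
) -> Dict[str, Any]:
    """Call `extractor` once per path and collect results keyed by file name.

    Hypothetical helper: `extractor` is expected to accept the keyword
    arguments used in the usage example (filename, file, meta_data_mapping).
    """
    results: Dict[str, Any] = {}
    for raw_path in paths:
        path = Path(raw_path)
        # Open in binary mode, matching the usage example.
        with path.open("rb") as file:
            results[path.name] = extractor(
                filename=path.name,
                file=file,
                # Pass a fresh copy so one call cannot mutate another's metadata.
                meta_data_mapping=dict(meta_data_mapping or {}),
            )
    return results
```

With the package installed, this would presumably be called as `extract_many(["a.pdf", "b.pdf"], extract_pdf_file_to_text, {"document_category": "DEF"})`. Note that results are keyed by file name, so two inputs with the same name in different directories would overwrite each other.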


Download files


Source Distribution

py_any2text_parser-1.0.0.tar.gz (7.7 kB)

Uploaded Source

Built Distribution


py_any2text_parser-1.0.0-py3-none-any.whl (8.6 kB)

Uploaded Python 3

File details

Details for the file py_any2text_parser-1.0.0.tar.gz.

File metadata

  • Download URL: py_any2text_parser-1.0.0.tar.gz
  • Size: 7.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.12.3

File hashes

Hashes for py_any2text_parser-1.0.0.tar.gz
Algorithm    Hash digest
SHA256       4881498e70d467f965b3f13b73aedcd42e7e2849a301e6cf9cfac74cae2cc53a
MD5          192d6e1fe0f69c62fd3f5fb3bf0f196d
BLAKE2b-256  234e72fc7fe446db1d2602d725160ebf6d626a8a9b115d6f90838db38b3b8389
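
A downloaded file can be checked against the SHA256 digest listed above. The following is a minimal sketch using only Python's standard `hashlib`; the function names `sha256_of_file` and `matches_published_sha256` are illustrative and not part of this package.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_sha256(path: str, expected_hex: str) -> bool:
    """Compare the file's digest with a published hex digest, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_published_sha256("py_any2text_parser-1.0.0.tar.gz", "4881498e70d467f965b3f13b73aedcd42e7e2849a301e6cf9cfac74cae2cc53a")` should return `True` for an intact download of the source distribution.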


File details

Details for the file py_any2text_parser-1.0.0-py3-none-any.whl.

File hashes

Hashes for py_any2text_parser-1.0.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       1662644e6c3658797966bd3ec0f75b373ae749de62536b9453019a441b929576
MD5          e4b26a08aeb74699770a9a25e9560729
BLAKE2b-256  4fae5c9b92fbf45e7b53ecc5fd0452f52f9edb3874b3a263021296bbc04e0b8a

