llama-index readers paddle_ocr integration

Paddle OCR loader

pip install llama-index-readers-paddle-ocr

This loader extracts the equations, symbols, and tables contained in a PDF.

Pass it the path of the academic PDF document you want to parse. The OCR recognizes LaTeX math and tables.

Usage

Here is an example of using PDFPaddleOCR.

from pathlib import Path

from llama_index.readers.paddle_ocr import PDFPaddleOCR

reader = PDFPaddleOCR()

# Path to the PDF you want to parse
pdf_path = Path("/path/to/pdf")

documents = reader.load_data(pdf_path)
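
To parse several PDFs in one go, the same load_data call can be wrapped in a small loop. The following is only a minimal sketch: the helper name load_pdf_folder is hypothetical and not part of the package, and it assumes that load_data returns a list of documents for each PDF path it is given.

```python
from pathlib import Path
from typing import Any, List


def load_pdf_folder(reader: Any, folder: Path) -> List[Any]:
    """Call reader.load_data(pdf_path) for every *.pdf file in `folder`.

    Hypothetical helper, not part of the package. `reader` is any object
    with a load_data(path) method, such as a PDFPaddleOCR instance.
    """
    documents: List[Any] = []
    # Sort the paths so the documents come back in a stable, name-based order.
    for pdf_path in sorted(folder.glob("*.pdf")):
        # Assumption: load_data returns a list of documents for one PDF.
        documents.extend(reader.load_data(pdf_path))
    return documents
```

With the reader from the example above, this would be called as load_pdf_folder(reader, Path("/path/to/folder")).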

Miscellaneous

An output folder is created, named after the PDF and given a .mmd extension.
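
The naming rule can be illustrated with pathlib. This sketch only computes the expected name; the helper expected_output_name is hypothetical, and where the loader actually creates the folder is not stated here, so no location is assumed.

```python
from pathlib import Path


def expected_output_name(pdf_path: Path) -> str:
    """Return the PDF's file name with its extension swapped for .mmd.

    Hypothetical helper: it mirrors the naming rule described above and
    makes no claim about the directory the loader writes to.
    """
    return pdf_path.with_suffix(".mmd").name


print(expected_output_name(Path("/path/to/paper.pdf")))  # paper.mmd
```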
