llama-index readers paddle_ocr integration

Project description

Paddle OCR loader

pip install llama-index-readers-paddle-ocr

This loader reads the equations, symbols, and tables contained in a PDF.

Pass the path of the academic PDF document you want to parse. The OCR recognizes LaTeX math and tables.

Usage

Here is an example of using PDFPaddleOCR.

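from pathlib import Path

from llama_index.readers.paddle_ocr import PDFPaddleOCR

# Create the reader
reader = PDFPaddleOCR()

# Path to the PDF to parse
pdf_path = Path("/path/to/pdf")

# Parse the PDF and collect the returned documents
documents = reader.load_data(pdf_path)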
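To parse several PDFs in one go, the call above can be wrapped in a small helper. The following is a minimal sketch and not part of the package: `load_pdfs` is a hypothetical name, and it assumes only what the usage example shows, namely that the reader exposes `load_data(path)` and that the result is a list of documents.

```python
from pathlib import Path


def load_pdfs(directory, reader):
    """Load every PDF directly under `directory` with the given reader.

    Hypothetical helper: `reader` is any object with a load_data(path)
    method that returns a list, as in the usage example above.
    """
    documents = []
    # sorted() keeps the processing order deterministic across runs
    for pdf_path in sorted(Path(directory).glob("*.pdf")):
        documents.extend(reader.load_data(pdf_path))
    return documents
```

With the package installed, it could be called as `load_pdfs("/path/to/pdfs", PDFPaddleOCR())`. Passing the reader in as an argument keeps the helper independent of any one reader class.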

Miscellaneous

An output folder will be created with the same name as the PDF and a .mmd extension.

Download files

Source Distribution

llama_index_readers_paddle_ocr-0.2.0.tar.gz (5.2 kB)

Uploaded Source

Built Distribution

llama_index_readers_paddle_ocr-0.2.0-py3-none-any.whl (4.8 kB)

Uploaded Python 3

File details

Details for the file llama_index_readers_paddle_ocr-0.2.0.tar.gz.

File metadata

  • Download URL: llama_index_readers_paddle_ocr-0.2.0.tar.gz
  • Size: 5.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.9

File hashes

Hashes for llama_index_readers_paddle_ocr-0.2.0.tar.gz
Algorithm Hash digest
SHA256 5410c418e2d34d9cca2526d8420fdcb4c6d0b3d48920db03002ed9733de6b0b4
MD5 8682e5f943db2cf7e428fcb8c1fe6059
BLAKE2b-256 9a8d44758e9edc7f1ad75d4aeeb99af64d4de827d4f05011fe109f68047dda9b
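
A downloaded file can be checked against the SHA256 digest listed above before installing it. The following is a minimal sketch using only the Python standard library; the function names `sha256_of` and `matches_published_hash` are illustrative and not part of the package.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's digest with a published one, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For the source distribution, the call would be `matches_published_hash("llama_index_readers_paddle_ocr-0.2.0.tar.gz", "5410c418e2d34d9cca2526d8420fdcb4c6d0b3d48920db03002ed9733de6b0b4")`, assuming the file sits in the current directory.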

File details

Details for the file llama_index_readers_paddle_ocr-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: llama_index_readers_paddle_ocr-0.2.0-py3-none-any.whl
  • Size: 4.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.9

File hashes

Hashes for llama_index_readers_paddle_ocr-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 e5d7638a3cac431a0370daa40540ce449e70423b0c340edf9368c1900fe749cd
MD5 b7f46f47c009c4a32c00eeca1d41a147
BLAKE2b-256 8ea25149ae36954394c9b8b81deb2137c5ad943aa0abea5f87ece98b46754309
