llama-index readers nougat_ocr integration

Nougat OCR loader

pip install llama-index-readers-nougat-ocr

This loader reads the equations, symbols, and tables included in the PDF.

Pass the path of the academic PDF document you want to parse. The underlying OCR understands LaTeX math and tables.

Usage

Here's an example usage of the PDFNougatOCR.

from pathlib import Path

from llama_index.readers.nougat_ocr import PDFNougatOCR

reader = PDFNougatOCR()

# Path to the academic PDF to parse
pdf_path = Path("/path/to/pdf")

documents = reader.load_data(pdf_path)
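
To parse several files, one option is a small wrapper that filters paths before handing them to the loader. The sketch below is only an illustration: `load_pdfs` and its `load_one` parameter are hypothetical names, not part of this package, and the helper assumes that whatever callable you pass in returns a list for each PDF.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def load_pdfs(paths: Iterable, load_one: Callable[[Path], list]) -> List:
    """Call load_one on every existing .pdf path and collect the results.

    Hypothetical helper: load_one is any callable that takes a Path and
    returns a list (for example, a reader's load_data method).
    """
    documents = []
    for path in map(Path, paths):
        # Skip anything that is not an existing file with a .pdf suffix.
        if path.suffix.lower() != ".pdf" or not path.is_file():
            continue
        documents.extend(load_one(path))
    return documents
```

With the reader from the example above, a call might look like `load_pdfs(Path("/path/to/folder").glob("*.pdf"), reader.load_data)`. Passing the loader in as a callable keeps the helper independent of the OCR backend, so it can be tried out with a stub function first.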

Miscellaneous

An output folder will be created, containing a file with the same name as the PDF and an .mmd extension.
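
If you want to inspect the .mmd result directly, a minimal sketch follows. It rests on assumptions: the result is taken to be a file named after the PDF stem, and the folder name `output` is a guess used as a default, not something this page confirms. `expected_mmd_path` and `read_mmd` are hypothetical helper names.

```python
from pathlib import Path
from typing import Optional


def expected_mmd_path(pdf_path, output_dir="output") -> Path:
    """Build the assumed location of the .mmd file for a given PDF.

    output_dir is an assumption; adjust it to wherever the file appears.
    """
    return Path(output_dir) / (Path(pdf_path).stem + ".mmd")


def read_mmd(pdf_path, output_dir="output") -> Optional[str]:
    """Return the .mmd text if the file exists, otherwise None."""
    path = expected_mmd_path(pdf_path, output_dir)
    if not path.is_file():
        return None
    return path.read_text(encoding="utf-8")
```

Returning None instead of raising lets a caller tell "not parsed yet" apart from a real read error.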
