
llama-index readers nougat_ocr integration


Nougat OCR loader

pip install llama-index-readers-nougat-ocr

This loader uses Nougat OCR to read the equations, symbols, and tables contained in a PDF.

Pass the path of the academic PDF document you want to parse. The OCR understands LaTeX math and tables.

Usage

Here is an example of using PDFNougatOCR:

from pathlib import Path

from llama_index.readers.nougat_ocr import PDFNougatOCR

reader = PDFNougatOCR()

# Path to the PDF to parse
pdf_path = Path("/path/to/pdf")

# Run the OCR and load the result as documents
documents = reader.load_data(pdf_path)
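
Since the loader takes a file path, it can help to check that path before starting the OCR run. The following is a minimal sketch, not part of the package: the helper names validate_pdf_path and load_pdf are hypothetical, and the only loader calls used are PDFNougatOCR() and load_data(pdf_path) from the example above.

```python
from pathlib import Path


def validate_pdf_path(path) -> Path:
    """Return the path as a Path object, or raise if it is not an existing PDF file.

    Hypothetical helper for illustration; it is not provided by the loader.
    """
    pdf_path = Path(path)
    if pdf_path.suffix.lower() != ".pdf":
        raise ValueError(f"Expected a .pdf file, got: {pdf_path.name}")
    if not pdf_path.is_file():
        raise FileNotFoundError(f"No such file: {pdf_path}")
    return pdf_path


def load_pdf(path):
    """Validate the path, then hand it to the loader as in the usage example."""
    pdf_path = validate_pdf_path(path)
    # Imported here so the validation helper works even when the
    # package is not installed.
    from llama_index.readers.nougat_ocr import PDFNougatOCR

    reader = PDFNougatOCR()
    return reader.load_data(pdf_path)


# Example call (requires the package and a real PDF):
# documents = load_pdf("/path/to/pdf")
```

Checking the path first means a typo surfaces as a clear error before the OCR run begins.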

Miscellaneous

An output folder will be created, containing a file with the same name as the PDF and a .mmd extension.
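
To inspect the generated file directly, its name can be derived from the PDF name. The sketch below is illustrative only: the helper names are hypothetical, and the location of the output folder is an assumption that must be supplied by the caller, since it is not stated here.

```python
from pathlib import Path


def expected_mmd_path(pdf_path, output_dir) -> Path:
    """Build the path of the .mmd file that shares the PDF's base name.

    output_dir is an assumption supplied by the caller; this page does
    not state where the output folder is created.
    """
    mmd_name = Path(pdf_path).with_suffix(".mmd").name
    return Path(output_dir) / mmd_name


def read_mmd_if_present(pdf_path, output_dir):
    """Return the .mmd text if the file exists, otherwise None."""
    mmd_path = expected_mmd_path(pdf_path, output_dir)
    if not mmd_path.is_file():
        return None
    return mmd_path.read_text(encoding="utf-8")
```

For example, expected_mmd_path("/path/to/paper.pdf", "some_folder") yields some_folder/paper.mmd, where some_folder stands in for wherever the output folder actually appears.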


Source Distribution

llama_index_readers_nougat_ocr-0.5.0.tar.gz (10.1 kB view details)
