llama-index readers llama-parse integration

LlamaParse

LlamaParse is an API created by LlamaIndex to parse and represent files for efficient retrieval and context augmentation using LlamaIndex frameworks.

LlamaParse directly integrates with LlamaIndex.

Currently available for free. Try it out today!

NOTE: Currently, only PDF files are supported.

Getting Started

First, log in and get an API key from https://cloud.llamaindex.ai.
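
Rather than hard-coding the key in source files, it can be read from the environment. The following is a minimal sketch: `resolve_api_key` is a hypothetical helper, not part of llama-parse, and the only detail taken from this README is the `LLAMA_CLOUD_API_KEY` variable name that appears in the example further down.

```python
import os


def resolve_api_key(explicit_key=None, env_var="LLAMA_CLOUD_API_KEY"):
    """Return the explicit key if given, else the value of env_var.

    Hypothetical helper for illustration; not part of llama-parse.
    """
    key = explicit_key or os.environ.get(env_var)
    if not key:
        raise RuntimeError(
            f"No API key found: pass one explicitly or set {env_var}."
        )
    return key
```

The returned value could then be passed as `api_key=` when constructing the parser.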

Then, make sure you have the latest LlamaIndex version installed.

NOTE: If you are upgrading from v0.9.X, we recommend following our migration guide, as well as uninstalling your previous version first.

pip uninstall llama-index  # run this if upgrading from v0.9.x or older
pip install -U llama-index --no-cache-dir --force-reinstall

Lastly, install the package:

pip install llama-parse
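
To confirm what ended up installed, the standard library's `importlib.metadata` can report package versions. This is a small sketch; `installed_version` is a hypothetical helper name, and the distribution names are the ones used in the pip commands above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


for name in ("llama-index", "llama-parse"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```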

Now you can run the following to parse your first PDF file:

import nest_asyncio

# Allow nested event loops (useful when a loop is already running, e.g. in a notebook)
nest_asyncio.apply()

from llama_parse import LlamaParse

parser = LlamaParse(
    api_key="llx-...",  # can also be set in your env as LLAMA_CLOUD_API_KEY
    result_type="markdown",  # "markdown" and "text" are available
    verbose=True,
)

# sync
documents = parser.load_data("./my_file.pdf")

# sync batch
documents = parser.load_data(["./my_file1.pdf", "./my_file2.pdf"])

# async (await needs an async context, e.g. a notebook cell or an async function)
documents = await parser.aload_data("./my_file.pdf")

# async batch
documents = await parser.aload_data(["./my_file1.pdf", "./my_file2.pdf"])
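
The bare `await` calls above work where top-level `await` is allowed, such as a notebook cell. In a plain script, one option is to wrap the call in a coroutine and drive it with `asyncio.run`. The sketch below assumes only that the parser object exposes the `aload_data` coroutine shown above; `parse_files` and `main` are hypothetical names.

```python
import asyncio


async def parse_files(parser, paths):
    """Await the parser's async batch call and return its documents."""
    return await parser.aload_data(list(paths))


def main(parser, paths):
    # asyncio.run creates an event loop, runs the coroutine, then closes the loop.
    return asyncio.run(parse_files(parser, paths))
```

With the `parser` constructed as above, this could be called as `documents = main(parser, ["./my_file1.pdf", "./my_file2.pdf"])`.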

Using with SimpleDirectoryReader

You can also integrate the parser as the default PDF loader in SimpleDirectoryReader:

import nest_asyncio

nest_asyncio.apply()

from llama_parse import LlamaParse
from llama_index.core import SimpleDirectoryReader

parser = LlamaParse(
    api_key="llx-...",  # can also be set in your env as LLAMA_CLOUD_API_KEY
    result_type="markdown",  # "markdown" and "text" are available
    verbose=True,
)

file_extractor = {".pdf": parser}
documents = SimpleDirectoryReader(
    "./data", file_extractor=file_extractor
).load_data()

Full documentation for SimpleDirectoryReader can be found in the LlamaIndex documentation.
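
Because only PDF files are supported at present, it can help to check which files in a directory would match the `".pdf"` key before loading. This sketch is independent of SimpleDirectoryReader and does not reproduce its matching logic: `pdf_files` is a hypothetical helper, it looks only at the top level of the directory, and lowercasing the suffix is an assumption made here.

```python
from pathlib import Path


def pdf_files(directory):
    """Return top-level files in directory whose suffix is .pdf (case-insensitive)."""
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )
```

For example, `pdf_files("./data")` lists the candidate files in the directory used above.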

Examples

Several end-to-end indexing examples can be found in the examples folder.

Terms of Service

See the Terms of Service here.
