llama-index packs multidoc_autoretrieval integration
Multi-Document AutoRetrieval (with Weaviate) Pack
This LlamaPack implements structured hierarchical retrieval over multiple documents, using multiple Weaviate collections.
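To make the idea concrete, here is a toy sketch of two-stage hierarchical retrieval (this is not the pack's implementation, and the scoring is a naive keyword overlap standing in for real embedding search): stage one retrieves over a metadata collection of per-document summaries, and stage two searches chunks only within the matching documents.

```python
# Toy illustration of two-stage hierarchical retrieval.
# Stage 1: pick documents via their summaries (the "metadata" collection).
# Stage 2: search chunks only inside the selected documents.

metadata_index = {
    "doc_issues": "Summary of GitHub issues about retrieval bugs",
    "doc_music": "Summary of articles about music celebrities",
}
chunk_collections = {
    "doc_issues": ["issue #12: retriever crash", "issue #34: slow queries"],
    "doc_music": ["profile of a pop star", "interview with a guitarist"],
}


def retrieve(query: str, top_docs: int = 1):
    # Naive keyword-overlap score, standing in for vector similarity.
    def score(text: str) -> int:
        return len(set(query.lower().split()) & set(text.lower().split()))

    # Stage 1: rank documents by how well their summary matches the query.
    ranked = sorted(
        metadata_index, key=lambda d: score(metadata_index[d]), reverse=True
    )

    # Stage 2: search chunks only within the top-ranked documents.
    hits = []
    for doc_id in ranked[:top_docs]:
        hits += [c for c in chunk_collections[doc_id] if score(c) > 0]
    return hits
```

In the real pack, both stages are vector searches over separate Weaviate collections, but the control flow is the same: narrow to documents first, then to chunks.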
CLI Usage
You can download llamapacks directly using `llamaindex-cli`, which comes installed with the `llama-index` Python package:

```bash
llamaindex-cli download-llamapack MultiDocAutoRetrieverPack --download-dir ./multidoc_autoretrieval_pack
```
You can then inspect the files at `./multidoc_autoretrieval_pack` and use them as a template for your own project!
Code Usage
You can download the pack to the `./multidoc_autoretrieval_pack` directory:
```python
from llama_index.core.llama_pack import download_llama_pack

# download and install dependencies
MultiDocAutoRetrieverPack = download_llama_pack(
    "MultiDocAutoRetrieverPack", "./multidoc_autoretrieval_pack"
)
```
From here, you can use the pack. To initialize it, you need to define a few arguments (see below).
Then, you can set up the pack like so:
```python
# setup pack arguments
from llama_index.core.schema import Document, TextNode
from llama_index.core.vector_stores.types import MetadataInfo, VectorStoreInfo

import weaviate

# cloud
auth_config = weaviate.AuthApiKey(api_key="<api_key>")
client = weaviate.Client(
    "https://<cluster>.weaviate.network",
    auth_client_secret=auth_config,
)

vector_store_info = VectorStoreInfo(
    content_info="Github Issues",
    metadata_info=[
        MetadataInfo(
            name="state",
            description="Whether the issue is `open` or `closed`",
            type="string",
        ),
        ...,
    ],
)

# metadata_nodes is a set of nodes with metadata representing each document
# docs is the source docs
# metadata_nodes and docs must be the same length
metadata_nodes = [TextNode(..., metadata={...}), ...]
docs = [Document(...), ...]

pack = MultiDocAutoRetrieverPack(
    client,
    "<metadata_index_name>",
    "<doc_chunks_index_name>",
    metadata_nodes,
    docs,
    vector_store_info,
    auto_retriever_kwargs={
        # any kwargs for the auto-retriever
        ...
    },
)
```
The `run()` function is a light wrapper around `query_engine.query()`:

```python
response = pack.run("Tell me about a music celebrity.")
```
You can also use modules individually.
```python
# use the retriever
retriever = pack.retriever
nodes = retriever.retrieve("query_str")

# use the query engine
query_engine = pack.query_engine
response = query_engine.query("query_str")
```
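The retriever here is an auto-retriever: it asks an LLM, guided by the field descriptions in `VectorStoreInfo`, to turn a natural-language query into structured metadata filters before running the vector search. The following toy sketch (not the pack's code; the "LLM" is faked with a substring rule) illustrates that idea for the `state` field defined above:

```python
# Toy illustration of auto-retrieval: infer metadata filters from the
# query, then apply them before (here: instead of) vector search.

nodes = [
    {"text": "crash when indexing", "metadata": {"state": "open"}},
    {"text": "old parsing bug", "metadata": {"state": "closed"}},
]


def infer_filters(query: str) -> dict:
    # A real auto-retriever asks an LLM, guided by VectorStoreInfo,
    # which metadata fields the query constrains. We fake it with a rule.
    filters = {}
    for state in ("open", "closed"):
        if state in query.lower():
            filters["state"] = state
    return filters


def auto_retrieve(query: str):
    filters = infer_filters(query)
    return [
        n["text"]
        for n in nodes
        if all(n["metadata"].get(k) == v for k, v in filters.items())
    ]
```

This is why good `description` strings in `MetadataInfo` matter: they are the only hint the LLM has about which filters a query should map to.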
Hashes for llama_index_packs_multidoc_autoretrieval-0.1.0.tar.gz

| Algorithm | Hash digest |
|---|---|
| SHA256 | 60766af4bac121f0fee766af9110d924a3b7845b49ca8db2dc5c9561db28163c |
| MD5 | af603c1cb2154d9b40e8931c486ddfc9 |
| BLAKE2b-256 | 82dafa7893798b38550ed4ec18dc888426bcfe1f9aa35a6448adddfa40d49e01 |

Hashes for llama_index_packs_multidoc_autoretrieval-0.1.0-py3-none-any.whl

| Algorithm | Hash digest |
|---|---|
| SHA256 | 0f1d3e41edd679aaa26bec380be8c6d979b41de5f16bdb9f065b80c3900a639c |
| MD5 | 024f0750d9425f3e1257d46b6fcf3351 |
| BLAKE2b-256 | 12244ef8aa42813ccb23bcd7c84e827161cdc9304b586be5d6f4e0175ae950e0 |