Multi-Document AutoRetrieval (with Weaviate) Pack
This LlamaPack implements structured hierarchical retrieval over multiple documents, using multiple Weaviate collections.
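The core idea can be sketched in plain Python. The toy below is an illustration of the two-stage pattern, not the pack's actual implementation: stage one matches document-level metadata, stage two searches only the chunks belonging to the matching documents.

```python
# Toy sketch of structured hierarchical retrieval.
# Stage 1: filter document-level metadata records.
# Stage 2: search only chunks belonging to the surviving documents.

def hierarchical_retrieve(metadata_records, chunks, metadata_filter, query_terms):
    # Stage 1: documents whose metadata satisfies every filter condition
    matching_doc_ids = {
        rec["doc_id"]
        for rec in metadata_records
        if all(rec["metadata"].get(k) == v for k, v in metadata_filter.items())
    }
    # Stage 2: naive keyword match over chunks of the matching documents
    return [
        chunk
        for chunk in chunks
        if chunk["doc_id"] in matching_doc_ids
        and any(term in chunk["text"].lower() for term in query_terms)
    ]

metadata_records = [
    {"doc_id": "1", "metadata": {"state": "open"}},
    {"doc_id": "2", "metadata": {"state": "closed"}},
]
chunks = [
    {"doc_id": "1", "text": "Crash when parsing large files"},
    {"doc_id": "2", "text": "Crash on startup (fixed in v0.2)"},
]
results = hierarchical_retrieve(metadata_records, chunks, {"state": "open"}, ["crash"])
# Only the chunk from the "open" issue survives both stages.
```

In the real pack, stage one is an auto-retriever over a Weaviate metadata collection and stage two is a filtered vector search over a document-chunk collection.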
CLI Usage
You can download llamapacks directly using llamaindex-cli, which comes installed with the llama-index Python package:
llamaindex-cli download-llamapack MultiDocAutoRetrieverPack --download-dir ./multidoc_autoretrieval_pack
You can then inspect the files at ./multidoc_autoretrieval_pack
and use them as a template for your own project!
Code Usage
You can download the pack to the ./multidoc_autoretrieval_pack directory:
from llama_index.core.llama_pack import download_llama_pack
# download and install dependencies
MultiDocAutoRetrieverPack = download_llama_pack(
"MultiDocAutoRetrieverPack", "./multidoc_autoretrieval_pack"
)
From here, you can use the pack. To initialize it, you need to define a few arguments (see below). Then, you can set up the pack like so:
# setup pack arguments
from llama_index.core.vector_stores.types import MetadataInfo, VectorStoreInfo
import weaviate
# cloud
auth_config = weaviate.AuthApiKey(api_key="<api_key>")
client = weaviate.Client(
"https://<cluster>.weaviate.network",
auth_client_secret=auth_config,
)
vector_store_info = VectorStoreInfo(
content_info="Github Issues",
metadata_info=[
MetadataInfo(
name="state",
description="Whether the issue is `open` or `closed`",
type="string",
),
...,
],
)
# metadata_nodes is a set of nodes with metadata representing each document
# docs is the source docs
# metadata_nodes and docs must be the same length
metadata_nodes = [TextNode(..., metadata={...}), ...]
docs = [Document(...), ...]
pack = MultiDocAutoRetrieverPack(
client,
"<metadata_index_name>",
"<doc_chunks_index_name>",
metadata_nodes,
docs,
vector_store_info,
auto_retriever_kwargs={
# any kwargs for the auto-retriever
...
},
)
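One way to build the two parallel lists expected by the constructor is to derive a metadata record per source item alongside its full text. The hypothetical example below uses GitHub issues (matching the vector_store_info above) and plain dicts as stand-ins for TextNode and Document so it stays self-contained; with llama-index installed you would construct the real classes with the same fields.

```python
# Hypothetical source data: a list of GitHub issues.
issues = [
    {"title": "Crash on startup", "body": "Stack trace ...", "state": "open"},
    {"title": "Docs typo", "body": "Fix spelling ...", "state": "closed"},
]

# Build one metadata node per source document, in the same order,
# so the i-th metadata node describes the i-th doc.
metadata_nodes = [
    {"text": issue["title"], "metadata": {"state": issue["state"]}}
    for issue in issues
]
docs = [{"text": issue["title"] + "\n" + issue["body"]} for issue in issues]

# The pack requires the two lists to be the same length.
assert len(metadata_nodes) == len(docs)
```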
The run() function is a light wrapper around query_engine.query().
response = pack.run("Tell me about a music celebrity.")
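Conceptually, run() just forwards to the query engine. A minimal sketch of that wrapper pattern, with toy stand-in classes rather than the pack's real ones:

```python
class PackSketch:
    # Toy stand-in: run() is a thin wrapper over the query engine.
    def __init__(self, query_engine):
        self.query_engine = query_engine

    def run(self, *args, **kwargs):
        return self.query_engine.query(*args, **kwargs)

class EchoEngine:
    # Toy query engine that just echoes the query string.
    def query(self, query_str):
        return f"answer to: {query_str}"

pack = PackSketch(EchoEngine())
response = pack.run("Tell me about a music celebrity.")
# → "answer to: Tell me about a music celebrity."
```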
You can also use modules individually.
# use the retriever
retriever = pack.retriever
nodes = retriever.retrieve("query_str")
# use the query engine
query_engine = pack.query_engine
response = query_engine.query("query_str")
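The difference between the two modules, in a toy sketch (plain Python stand-ins, not the llama-index classes): the retriever returns raw nodes, while the query engine retrieves and then synthesizes a single response from them.

```python
class ToyRetriever:
    def __init__(self, nodes):
        self.nodes = nodes

    def retrieve(self, query_str):
        # Return the raw matching nodes (here: naive substring match).
        return [n for n in self.nodes if query_str.lower() in n.lower()]

class ToyQueryEngine:
    def __init__(self, retriever):
        self.retriever = retriever

    def query(self, query_str):
        # Retrieve, then synthesize one response from the retrieved nodes.
        nodes = self.retriever.retrieve(query_str)
        return "Found: " + "; ".join(nodes) if nodes else "No results."

retriever = ToyRetriever(["Issue: crash on startup", "Issue: docs typo"])
nodes = retriever.retrieve("crash")   # list of raw nodes
engine = ToyQueryEngine(retriever)
response = engine.query("crash")      # single synthesized string
```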
Hashes for llama_index_packs_multidoc_autoretrieval-0.1.2.tar.gz

Algorithm | Hash digest
---|---
SHA256 | 3b8ab9e1cd10af994611bbc68629253450282aa62133d1e9f2ce1eb504f6b87c
MD5 | 786f158e9c599df9beb2cc9a41c3777d
BLAKE2b-256 | bd5f81ea376519d5c1a2aa936f6f40062cec5ffc96c70f729585e4e7544d3709
Hashes for llama_index_packs_multidoc_autoretrieval-0.1.2-py3-none-any.whl

Algorithm | Hash digest
---|---
SHA256 | 59bf7a3043207bf36d254c173702706a2aa0a01204214d0b395d2eeea2b126bd
MD5 | 9921905a6fc4d09653e938dbe588836f
BLAKE2b-256 | e85efbf89e417773dcb69924383aca9006c1de0c1947e012cdb8994b6f067e37