# Multi-Document AutoRetrieval (with Weaviate) Pack
This LlamaPack implements structured hierarchical retrieval over multiple documents, using multiple Weaviate collections.
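The core two-stage idea behind this pack can be sketched without any dependencies: a document-level "metadata" collection is filtered first, and only the chunks belonging to the matching documents are then searched. This is a minimal, library-free illustration of that structure (the dictionaries and function here are hypothetical stand-ins, not the pack's API; the real pack uses Weaviate collections and an auto-retriever):

```python
# Stage-1 "collection": one metadata record per document.
doc_metadata = {
    "doc_a": {"state": "open"},
    "doc_b": {"state": "closed"},
}

# Stage-2 "collection": chunks keyed by their parent document.
doc_chunks = {
    "doc_a": ["chunk a1", "chunk a2"],
    "doc_b": ["chunk b1"],
}


def hierarchical_retrieve(metadata_filter, keyword):
    # Stage 1: pick documents whose metadata matches the filter.
    matching_docs = [
        doc_id
        for doc_id, meta in doc_metadata.items()
        if all(meta.get(k) == v for k, v in metadata_filter.items())
    ]
    # Stage 2: search chunks only within the matching documents.
    return [
        chunk
        for doc_id in matching_docs
        for chunk in doc_chunks[doc_id]
        if keyword in chunk
    ]


print(hierarchical_retrieve({"state": "open"}, "chunk"))
# → ['chunk a1', 'chunk a2']
```

The benefit over a single flat index is that stage 1 narrows the candidate set cheaply before any chunk-level search runs.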
## CLI Usage
You can download llamapacks directly using `llamaindex-cli`, which comes installed with the `llama-index` Python package:
```bash
llamaindex-cli download-llamapack MultiDocAutoRetrieverPack --download-dir ./multidoc_autoretrieval_pack
```
You can then inspect the files at `./multidoc_autoretrieval_pack` and use them as a template for your own project!
## Code Usage
You can download the pack to the `./multidoc_autoretrieval_pack` directory:
```python
from llama_index.core.llama_pack import download_llama_pack

# download and install dependencies
MultiDocAutoRetrieverPack = download_llama_pack(
    "MultiDocAutoRetrieverPack", "./multidoc_autoretrieval_pack"
)
```
From here, you can use the pack. To initialize it, you need to define a few arguments (see below). Then, you can set up the pack like so:
```python
# setup pack arguments
from llama_index.core.schema import Document, TextNode
from llama_index.core.vector_stores.types import MetadataInfo, VectorStoreInfo

import weaviate

# cloud
auth_config = weaviate.AuthApiKey(api_key="<api_key>")
client = weaviate.Client(
    "https://<cluster>.weaviate.network",
    auth_client_secret=auth_config,
)

vector_store_info = VectorStoreInfo(
    content_info="Github Issues",
    metadata_info=[
        MetadataInfo(
            name="state",
            description="Whether the issue is `open` or `closed`",
            type="string",
        ),
        ...,
    ],
)

# metadata_nodes is a set of nodes with metadata representing each document
# docs is the source docs
# metadata_nodes and docs must be the same length
metadata_nodes = [TextNode(..., metadata={...}), ...]
docs = [Document(...), ...]

pack = MultiDocAutoRetrieverPack(
    client,
    "<metadata_index_name>",
    "<doc_chunks_index_name>",
    metadata_nodes,
    docs,
    vector_store_info,
    auto_retriever_kwargs={
        # any kwargs for the auto-retriever
        ...
    },
)
```
The `run()` function is a light wrapper around `query_engine.query()`:

```python
response = pack.run("Tell me about a music celebrity.")
```
You can also use the modules individually:

```python
# use the retriever
retriever = pack.retriever
nodes = retriever.retrieve("query_str")

# use the query engine
query_engine = pack.query_engine
response = query_engine.query("query_str")
```
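Before searching, the auto-retriever uses the declared `VectorStoreInfo` schema to infer metadata filters from the query text itself. The real pack does this with an LLM; as a rough, library-free illustration of the idea (all names here are hypothetical, not the pack's API), a toy version might just match declared metadata values against the query:

```python
# Toy illustration of auto-retrieval filter inference: scan the query
# for values declared in the metadata schema. (Hypothetical sketch; the
# real auto-retriever prompts an LLM with the VectorStoreInfo schema.)

metadata_schema = {
    "state": ["open", "closed"],  # allowed values, as described in VectorStoreInfo
}


def infer_filters(query):
    # naive "inference": pick any schema value mentioned in the query
    return {
        name: value
        for name, values in metadata_schema.items()
        for value in values
        if value in query.lower()
    }


print(infer_filters("show me open issues about retrieval"))
# → {'state': 'open'}
```

The inferred filters are then applied at stage 1 (the metadata index), so only matching documents' chunks are searched.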
## Hashes for llama_index_packs_multidoc_autoretrieval-0.0.1.tar.gz

| Algorithm | Hash digest |
|---|---|
| SHA256 | 81faf11a95355fd556d72a36fb6c20c1284292eae870fe75127b5ff60e6e7b99 |
| MD5 | 9d885ec94f721426b3edb2a84dd54301 |
| BLAKE2b-256 | 3cba6da12fdf3f4f229ecca8863b5c523117446432f43cb914c9dddb0a8e8306 |

## Hashes for llama_index_packs_multidoc_autoretrieval-0.0.1-py3-none-any.whl

| Algorithm | Hash digest |
|---|---|
| SHA256 | 95b91d32beb713923ee0ec87b4bb5246f0e1a586e95f951925d8ca7a14cb932d |
| MD5 | 4c5b943956d7a17fb623246bd9ac6ded |
| BLAKE2b-256 | 45f65bc7edb5f9d52a4eb2ecf299f646b309889d142a825f75b3b5f0a28fa05a |