Multi-Document AutoRetrieval (with Weaviate) Pack
This LlamaPack implements structured hierarchical retrieval over multiple documents, using multiple Weaviate collections.
CLI Usage
You can download llamapacks directly using `llamaindex-cli`, which comes installed with the `llama-index` Python package:
llamaindex-cli download-llamapack MultiDocAutoRetrieverPack --download-dir ./multidoc_autoretrieval_pack
You can then inspect the files at `./multidoc_autoretrieval_pack` and use them as a template for your own project!
Code Usage
You can download the pack to the `./multidoc_autoretrieval_pack` directory:
from llama_index.core.llama_pack import download_llama_pack
# download and install dependencies
MultiDocAutoRetrieverPack = download_llama_pack(
"MultiDocAutoRetrieverPack", "./multidoc_autoretrieval_pack"
)
From here, you can use the pack. To initialize it, you need to define a few arguments (see below). Then, you can set up the pack like so:
# setup pack arguments
from llama_index.core.vector_stores.types import MetadataInfo, VectorStoreInfo
import weaviate
# cloud
auth_config = weaviate.AuthApiKey(api_key="<api_key>")
client = weaviate.Client(
"https://<cluster>.weaviate.network",
auth_client_secret=auth_config,
)
vector_store_info = VectorStoreInfo(
content_info="GitHub Issues",
metadata_info=[
MetadataInfo(
name="state",
description="Whether the issue is `open` or `closed`",
type="string",
),
...,
],
)
from llama_index.core.schema import Document, TextNode

# metadata_nodes is a set of nodes with metadata representing each document;
# docs is the list of source documents.
# metadata_nodes and docs must be the same length.
metadata_nodes = [TextNode(..., metadata={...}), ...]
docs = [Document(...), ...]
pack = MultiDocAutoRetrieverPack(
client,
"<metadata_index_name>",
"<doc_chunks_index_name>",
metadata_nodes,
docs,
vector_store_info,
auto_retriever_kwargs={
# any kwargs for the auto-retriever
...
},
)
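Since `metadata_nodes` and `docs` must stay aligned one-to-one, it can help to build both lists in a single pass from the same raw records. The sketch below is purely illustrative: `raw_issues` and its `title`/`state`/`body` keys are hypothetical, and plain dicts stand in for the `TextNode` and `Document` objects you would actually construct.

```python
# Hypothetical raw records -- in practice these might come from the GitHub API.
raw_issues = [
    {"title": "Retriever returns stale results", "state": "open", "body": "..."},
    {"title": "Docs typo in README", "state": "closed", "body": "..."},
]

# Build the two parallel lists in one pass so they cannot drift out of sync.
# (Plain dicts stand in for TextNode and Document here.)
metadata_payloads = [
    {"text": issue["title"], "metadata": {"state": issue["state"]}}
    for issue in raw_issues
]
doc_texts = [issue["title"] + "\n" + issue["body"] for issue in raw_issues]

# The pack requires the two lists to be the same length.
assert len(metadata_payloads) == len(doc_texts)
```

Building both lists from the same iteration is one simple way to satisfy the same-length requirement without a separate alignment check.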
The `run()` function is a light wrapper around `query_engine.query()`.
response = pack.run("Tell me about a music celebrity.")
You can also use modules individually.
# use the retriever
retriever = pack.retriever
nodes = retriever.retrieve("query_str")
# use the query engine
query_engine = pack.query_engine
response = query_engine.query("query_str")
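Conceptually, an auto-retriever uses an LLM to infer structured metadata filters (against the fields declared in `vector_store_info`) from the natural-language query, and applies them alongside the semantic search. The toy function below only illustrates the shape of that mapping; `infer_state_filter` is a hypothetical name, not part of the pack's API.

```python
def infer_state_filter(query: str) -> dict:
    """Toy stand-in for the LLM step that maps a query to metadata filters."""
    filters = {}
    lowered = query.lower()
    if "open" in lowered:
        filters["state"] = "open"
    elif "closed" in lowered:
        filters["state"] = "closed"
    return filters

# The real auto-retriever would pass filters like this to the vector store
# together with the semantic query string.
print(infer_state_filter("Find open issues about retrieval"))
```

In the actual pack, this inference is driven by the descriptions you supply in `MetadataInfo`, which is why clear `description` strings matter.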
Hashes for llama_index_packs_multidoc_autoretrieval-0.1.3.tar.gz

| Algorithm | Hash digest |
|---|---|
| SHA256 | e715972dc806160a5c6980007c6af55f6ed5d5674d1c2d37d268672b2c5ac2e1 |
| MD5 | 25f13df09d0cebb7911140c5d56c21fa |
| BLAKE2b-256 | be3f18388859f55d27284c7da7d362ba88b607f6ecef7245f21cc982ec08fdab |

Hashes for llama_index_packs_multidoc_autoretrieval-0.1.3-py3-none-any.whl

| Algorithm | Hash digest |
|---|---|
| SHA256 | c9e6707beacfd48bb5cfd1b3c438da0747d180562dbe57af247c166545e15bd5 |
| MD5 | 1e3c5a95ad099a8ec2cfc70b7fcddb29 |
| BLAKE2b-256 | ef7bd98cb99c14687702da91080c7b1fa2280fc1e357017a3c8dc7e0bbaf0680 |