llama-index packs multidoc_autoretrieval integration
Project description
Multi-Document AutoRetrieval (with Weaviate) Pack
This LlamaPack implements structured hierarchical retrieval over multiple documents, using multiple Weaviate collections.
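The two-stage pattern can be sketched in plain Python (an illustration of the idea only, not the pack's actual implementation): first match document-level metadata nodes against the query, then retrieve chunks only from the matched documents.

```python
# Simplified sketch of hierarchical (two-stage) retrieval.
# Stage 1: match per-document metadata nodes against the query.
# Stage 2: search chunks only within the matched documents.

metadata_nodes = [
    {"doc_id": "a", "summary": "open github issue about retrieval bugs"},
    {"doc_id": "b", "summary": "closed issue about indexing speed"},
]
chunks = [
    {"doc_id": "a", "text": "stack trace for the retrieval bug"},
    {"doc_id": "a", "text": "proposed fix for the retrieval bug"},
    {"doc_id": "b", "text": "indexing benchmark numbers"},
]

def retrieve(query: str):
    # Stage 1: naive keyword match stands in for vector search here.
    matched_ids = {m["doc_id"] for m in metadata_nodes if query in m["summary"]}
    # Stage 2: restrict chunk retrieval to the matched documents.
    return [c for c in chunks if c["doc_id"] in matched_ids]

result = retrieve("retrieval")
```

In the real pack, both stages are vector searches over Weaviate collections and the first stage can additionally apply inferred metadata filters.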
CLI Usage
You can download llamapacks directly using llamaindex-cli, which comes installed with the llama-index Python package:
llamaindex-cli download-llamapack MultiDocAutoRetrieverPack --download-dir ./multidoc_autoretrieval_pack
You can then inspect the files at ./multidoc_autoretrieval_pack and use them as a template for your own project!
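Alternatively, the pack can be installed directly from PyPI (package name inferred from the distribution filenames on this page):

```shell
pip install llama-index-packs-multidoc-autoretrieval
```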
Code Usage
You can download the pack to the ./multidoc_autoretrieval_pack directory:
from llama_index.core.llama_pack import download_llama_pack
# download and install dependencies
MultiDocAutoRetrieverPack = download_llama_pack(
"MultiDocAutoRetrieverPack", "./multidoc_autoretrieval_pack"
)
From here, you can use the pack. To initialize it, you need to define a few arguments (see below).
Then, you can set up the pack like so:
# setup pack arguments
from llama_index.core.vector_stores import MetadataInfo, VectorStoreInfo
import weaviate
# cloud
auth_config = weaviate.AuthApiKey(api_key="<api_key>")
client = weaviate.Client(
"https://<cluster>.weaviate.network",
auth_client_secret=auth_config,
)
vector_store_info = VectorStoreInfo(
content_info="Github Issues",
metadata_info=[
MetadataInfo(
name="state",
description="Whether the issue is `open` or `closed`",
type="string",
),
...,
],
)
# metadata_nodes is a list of nodes whose metadata summarizes each document
# docs is the list of source documents
# metadata_nodes and docs must be the same length (aligned one-to-one)
metadata_nodes = [TextNode(..., metadata={...}), ...]
docs = [Document(...), ...]
pack = MultiDocAutoRetrieverPack(
client,
"<metadata_index_name>",
"<doc_chunks_index_name>",
metadata_nodes,
docs,
vector_store_info,
auto_retriever_kwargs={
# any kwargs for the auto-retriever
...
},
)
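To make the auto-retrieval step concrete, here is a simplified plain-Python sketch (illustration only; the real pack delegates filter inference to an LLM and search to Weaviate) of what the auto-retriever does with the metadata schema: infer structured filters from the natural-language query, then apply them before vector search.

```python
# Simplified sketch of metadata filtering in auto-retrieval (illustration only).

metadata_nodes = [
    {"text": "Issue: crash on startup", "metadata": {"state": "open"}},
    {"text": "Issue: typo in docs", "metadata": {"state": "closed"}},
]

def apply_filters(nodes, filters):
    """Keep only nodes whose metadata matches every inferred filter."""
    return [
        n for n in nodes
        if all(n["metadata"].get(key) == value for key, value in filters.items())
    ]

# Suppose the LLM inferred {"state": "open"} from "show me open issues".
open_issues = apply_filters(metadata_nodes, {"state": "open"})
```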
The run() function is a light wrapper around query_engine.query().
response = pack.run("Tell me about a music celebrity.")
You can also use modules individually.
# use the retriever
retriever = pack.retriever
nodes = retriever.retrieve("query_str")
# use the query engine
query_engine = pack.query_engine
response = query_engine.query("query_str")
File details
Details for the file llama_index_packs_multidoc_autoretrieval-0.3.1.tar.gz.
File metadata
- Download URL: llama_index_packs_multidoc_autoretrieval-0.3.1.tar.gz
- Upload date:
- Size: 5.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.6.17
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `d70a9300cb6daab282cf319ed8962cf49dcc311086ed089db6ad2a486364a053` |
| MD5 | `bf16d308e3cc408b20bd20ad7f3325ab` |
| BLAKE2b-256 | `6a17f6ee94dd3c9e39ad60a883ddfbd3b7e7c6d28b5172768d1b1b2208d495bb` |
File details
Details for the file llama_index_packs_multidoc_autoretrieval-0.3.1-py3-none-any.whl.
File metadata
- Download URL: llama_index_packs_multidoc_autoretrieval-0.3.1-py3-none-any.whl
- Upload date:
- Size: 5.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.6.17
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `1a1b4c6f908b08b7da82b619ba352a5e47638b640942520e5a4eb920e712c710` |
| MD5 | `dd0ff87e91f71a93957720b19b598da3` |
| BLAKE2b-256 | `2da59bb99c8e47aadc7209e85f6ed215d8de274e40f7338f2c36ca947ca1fca3` |