llama-index packs subdoc-summary implementation
LlamaIndex Packs Integration: Subdoc-Summary
This LlamaPack provides an advanced technique for injecting each chunk with "sub-document" metadata. This context augmentation technique is helpful for both retrieving relevant context and for synthesizing correct answers.
It is a step beyond simply adding a summary of the document as the metadata to each chunk. Within a long document, there can be multiple distinct themes, and we want each chunk to be grounded in global but relevant context.
This technique was inspired by our "Practical Tips and Tricks" video: https://www.youtube.com/watch?v=ZP1F9z-S7T0.
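The idea described above can be illustrated without any LLM or vector store. The following is a minimal sketch, not the pack's actual implementation: `Chunk`, `split_by_words`, `toy_summarize`, `build_child_chunks`, and the `context_summary` metadata key are all hypothetical names, word counts stand in for token counts, and the "summary" is just the first few words of each parent chunk where a real pipeline would call an LLM.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    text: str
    metadata: dict = field(default_factory=dict)


def split_by_words(text: str, chunk_size: int) -> list[str]:
    """Split text into pieces of at most `chunk_size` words (toy stand-in for a token splitter)."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def toy_summarize(text: str, max_words: int = 8) -> str:
    """Hypothetical summarizer: keeps the first few words. A real pipeline would call an LLM here."""
    return " ".join(text.split()[:max_words])


def build_child_chunks(document: str, parent_chunk_size: int, child_chunk_size: int) -> list[Chunk]:
    """Split into parent chunks, summarize each parent, and attach that summary to its children."""
    children = []
    for parent_text in split_by_words(document, parent_chunk_size):
        summary = toy_summarize(parent_text)
        for child_text in split_by_words(parent_text, child_chunk_size):
            # Every child carries the summary of its own parent chunk,
            # not a single summary of the whole document.
            children.append(Chunk(child_text, {"context_summary": summary}))
    return children


document = " ".join(f"word{i}" for i in range(40))
chunks = build_child_chunks(document, parent_chunk_size=20, child_chunk_size=5)
print(len(chunks))
print(chunks[0].metadata["context_summary"])
print(chunks[4].metadata["context_summary"])
```

Children of the same parent share one summary, while children of different parents get different ones. That is what distinguishes this approach from attaching a single document-level summary to every chunk.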
Installation
pip install llama-index llama-index-packs-subdoc-summary
CLI Usage
You can download llamapacks directly using llamaindex-cli, which comes installed with the llama-index Python package:
llamaindex-cli download-llamapack SubDocSummaryPack --download-dir ./subdoc_summary_pack
You can then inspect the files at ./subdoc_summary_pack and use them as a template for your own project.
Code Usage
You can download the pack to the ./subdoc_summary_pack directory:
from llama_index.core.llama_pack import download_llama_pack
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI

# download and install dependencies
SubDocSummaryPack = download_llama_pack(
    "SubDocSummaryPack", "./subdoc_summary_pack"
)

# `documents` is a list of Document objects that you have loaded beforehand.
# You can use any llama-hub loader to get documents!
subdoc_summary_pack = SubDocSummaryPack(
    documents,
    parent_chunk_size=8192,  # default
    child_chunk_size=512,  # default
    llm=OpenAI(model="gpt-3.5-turbo"),
    embed_model=OpenAIEmbedding(),
)
Initializing the pack will split documents into parent chunks and child chunks. It will inject parent chunk summaries into child chunks, and index the child chunks.
Running the pack will run the query engine over the vectorized child chunks.
response = subdoc_summary_pack.run("<query>", similarity_top_k=2)
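To see why injected summaries can help at query time, here is a rough sketch of top-k retrieval over child chunks. It is not how the pack's query engine works internally: bag-of-words cosine similarity stands in for a real embedding model, and `child_chunks`, `bag_of_words`, `cosine`, and `retrieve` are made-up names and data for illustration only.

```python
import math
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Each child chunk carries the summary of its parent chunk (made-up data).
child_chunks = [
    {"text": "the fee is waived in the first year", "summary": "credit card pricing"},
    {"text": "the fee is charged per transfer", "summary": "wire transfer pricing"},
    {"text": "branches open at nine", "summary": "opening hours"},
]


def retrieve(query: str, similarity_top_k: int = 2) -> list[dict]:
    """Rank child chunks by similarity of the query to summary + text, keep the top k."""
    q = bag_of_words(query)
    scored = [
        (cosine(q, bag_of_words(c["summary"] + " " + c["text"])), c)
        for c in child_chunks
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:similarity_top_k]]


results = retrieve("credit card fee", similarity_top_k=2)
for r in results:
    print(r["summary"], "|", r["text"])
```

Both "fee" chunks mention the query term only once in their own text, so on text alone they would be hard to tell apart. The injected summary is what ranks the credit card chunk first in this toy setup.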
File details
Details for the file llama_index_packs_subdoc_summary-0.3.0.tar.gz.
File metadata
- Download URL: llama_index_packs_subdoc_summary-0.3.0.tar.gz
- Upload date:
- Size: 3.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.8.3 CPython/3.11.10 Darwin/22.3.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | a8c12196f25856737eaffde80ad6202eacab123c26198394b3c83756b09febe5
MD5 | e668e5602e6b02704aecd870e009c11d
BLAKE2b-256 | eb96e294cb0f033ad1821304dfcd46b5a114e093cd938030c38d6cd21dbc7788
File details
Details for the file llama_index_packs_subdoc_summary-0.3.0-py3-none-any.whl.
File metadata
- Download URL: llama_index_packs_subdoc_summary-0.3.0-py3-none-any.whl
- Upload date:
- Size: 3.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.8.3 CPython/3.11.10 Darwin/22.3.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | 436ecad916fd8f522e5a3278022022a8872035b7f9800b46ffea769e49ae9312
MD5 | 8fbc53fa39aa697ffece58d37dac0cda
BLAKE2b-256 | 288ed1603d398379e386e53d1e2439eb882f47b36615df275b1e2bbc95328913