This library searches for the best parameters across the different steps of the RAG process.
RAG-X Library
Overview
RAG-X is a library for optimizing Retrieval-Augmented Generation (RAG) pipelines. It provides tools that automatically determine the best parameters for processing a given document: the chunking technique, the embedding model, the vector database, and the large language model (LLM) configuration.
Key Features:
- Adaptive Chunking: Incorporates four text chunking methodologies to enhance the handling of diverse document structures:
  - Specific Text Splitting
  - Recursive Text Splitting
  - Sentence Window Splitting
  - Semantic Window Splitting
- Expandability: Future versions will introduce additional chunking strategies and enhancements based on user feedback and ongoing research.
- Compatibility: Designed to seamlessly integrate with a wide range of embedding models and vector databases.
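To make the chunking strategies above more concrete, here is a minimal sketch of the general idea behind sentence window splitting: each sentence becomes a retrieval unit and carries its neighbouring sentences as context. This is not RAG-X's implementation. The function name `sentence_windows`, the `window` parameter, and the regex-based sentence splitter are all assumptions made for illustration.

```python
import re


def sentence_windows(text: str, window: int = 1) -> list[dict]:
    """Return one record per sentence, each with a window of surrounding sentences.

    Hypothetical helper for illustration only; not part of the RAG-X API.
    """
    # Naive split after ., ! or ? followed by whitespace (an assumption;
    # production splitters also handle abbreviations, decimals, and so on).
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    records = []
    for i, sentence in enumerate(sentences):
        start = max(0, i - window)                  # clamp at the first sentence
        end = min(len(sentences), i + window + 1)   # clamp at the last sentence
        records.append(
            {"sentence": sentence, "window": " ".join(sentences[start:end])}
        )
    return records


if __name__ == "__main__":
    demo = "RAG retrieves context. Chunking shapes retrieval. Embeddings index chunks."
    for record in sentence_windows(demo, window=1):
        print(record)
```

In a setup like this, the single sentence is typically what gets embedded and searched, while the wider window is what gets handed to the LLM. The window size is one of the parameters a tuning tool could vary.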
Getting Started
Prerequisites
Because of existing dependency conflicts, install the required dependencies yourself before installing the RAG-X library. We are actively working on a resolution and appreciate your understanding.
pip install tiktoken chromadb trulens-eval 'unstructured[pdf]' openai -q
Installation
After installing the dependencies, install the RAG-X library from TestPyPI using the following command:
pip install -i https://test.pypi.org/simple/ RAG-X -q
To verify the installation and view library details, execute:
pip show RAG-X
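The same check can be made from Python. This is a small sketch that uses the standard library's `importlib.metadata` to report which distributions are installed. The helper name `installed_version` is hypothetical, and the distribution names are taken from the `pip` commands above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution names as they appear in the pip commands in this README.
    for name in ("RAG-X", "tiktoken", "chromadb", "trulens-eval", "unstructured", "openai"):
        print(f"{name}: {installed_version(name) or 'not installed'}")
```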
Setting Up Your Environment
Before diving into the functionality of RAG-X, ensure that your environment variables are properly configured with your OpenAI API key and your Hugging Face token:
import os
os.environ['OPENAI_API_KEY'] = "YOUR_OPENAI_API_KEY"
os.environ['HF_TOKEN'] = "YOUR_HUGGINGFACE_TOKEN"
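If you would rather export the keys in your shell than hardcode them in a script, a short check like the one below can confirm that both variables are present before any RAG-X code runs. This is a sketch; the helper name `missing_env_vars` is hypothetical and not part of the library.

```python
import os

# Variable names taken from the setup snippet above.
REQUIRED_VARS = ("OPENAI_API_KEY", "HF_TOKEN")


def missing_env_vars(names=REQUIRED_VARS, environ=None):
    """Return the names that are unset or empty in the given environment mapping."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```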
Usage
The following steps show how to use the RAG-X library to find the best RAG parameters for a document:
from RAG_X.prag import parent_class
# Specify the path to your PDF document
file_path = "PATH_TO_YOUR_PDF_FILE"
# Initialize the RAG-X instance
my_instance = parent_class(file_path)
# Generate the optimal RAG parameters for your document
score_card = my_instance.get_best_param()
# Output the results
print(score_card)
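Since the search runs against a PDF on disk, it can help to validate the path before handing it to `parent_class`. The helper below is a hypothetical convenience, not part of RAG-X. It assumes, as the example above suggests, that the constructor takes the path as a plain string.

```python
from pathlib import Path


def checked_pdf_path(raw_path: str) -> str:
    """Return the path as a string after confirming it points to an existing PDF."""
    path = Path(raw_path).expanduser()
    if path.suffix.lower() != ".pdf":
        raise ValueError(f"Expected a .pdf file, got: {path.name}")
    if not path.is_file():
        raise FileNotFoundError(f"No such file: {path}")
    return str(path)
```

With this in place, `parent_class(checked_pdf_path(file_path))` fails early with a clear message if the path is wrong.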