
This library searches for the best parameters across the different steps of the RAG process.



RAG-X Library

Overview

RAG-X is a library for optimizing Retrieval-Augmented Generation (RAG) pipelines. It provides a suite of tools that automatically determine the best parameters for processing a specific document, including the chunking technique, embedding model, vector database, and large language model (LLM) configuration.
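To make the idea of parameter search concrete, here is a minimal sketch of an exhaustive grid search over RAG settings. It is not RAG-X's implementation: the `grid_search` helper, the parameter names, and the `toy_score` function are all hypothetical, and a real scoring function would run retrieval and evaluate the answers.

```python
from itertools import product


def grid_search(search_space, score_fn):
    """Score every parameter combination and return (best_params, best_score)."""
    names = list(search_space)
    best_params, best_score = None, float("-inf")
    for values in product(*(search_space[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        # Strict comparison keeps the first combination on ties.
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical search space; the names and values are illustrative only.
search_space = {
    "chunk_size": [256, 512, 1024],
    "chunk_overlap": [0, 64],
    "splitter": ["recursive", "sentence_window"],
}


def toy_score(params):
    # Placeholder for a real evaluation (for example, answer relevance).
    return -abs(params["chunk_size"] - 512) - params["chunk_overlap"]


best_params, best_score = grid_search(search_space, toy_score)
print(best_params, best_score)
```

The cost of a grid search grows with the product of the option counts, so in practice the search space is usually kept small or sampled.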

Key Features:

  • Adaptive Chunking: Supports four text chunking methods to handle diverse document structures:
    • Specific Text Splitting
    • Recursive Text Splitting
    • Sentence Window Splitting
    • Semantic Window Splitting
  • Expandability: Future versions will introduce additional chunking strategies and enhancements based on user feedback and ongoing research.
  • Compatibility: Designed to seamlessly integrate with a wide range of embedding models and vector databases.
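As an illustration of one of the chunking strategies listed above, the following sketch shows a common reading of sentence window splitting: each sentence becomes a chunk and carries its neighbouring sentences as context. This is an assumption about the technique in general, not RAG-X's code; the `sentence_window_chunks` function and the naive regex sentence splitter are hypothetical.

```python
import re


def sentence_window_chunks(text, window=1):
    """Return one chunk per sentence, each with `window` sentences of context on both sides."""
    # Naive sentence splitter: break after ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]
    chunks = []
    for i, sentence in enumerate(sentences):
        start = max(0, i - window)
        end = min(len(sentences), i + window + 1)
        chunks.append({"sentence": sentence, "window": " ".join(sentences[start:end])})
    return chunks


sample = "RAG retrieves context. The model reads it. Then it answers."
chunks = sentence_window_chunks(sample, window=1)
for chunk in chunks:
    print(chunk["sentence"], "|", chunk["window"])
```

Typically the single sentence is what gets embedded and matched, while the wider window is what is passed to the LLM once the sentence is retrieved.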

Getting Started

Prerequisites

Because of existing dependency conflicts, you must install the required dependencies manually before installing the RAG-X library. We are actively working on a resolution and appreciate your understanding.

pip install tiktoken chromadb trulens-eval 'unstructured[pdf]' openai -q

Installation

After installing the dependencies, install the RAG-X library from TestPyPI using the following command:

pip install -i https://test.pypi.org/simple/ RAG-X -q

To verify the installation and view library details, execute:

pip show RAG-X
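The same check can be done from Python with the standard library. This is a small sketch: the `installed_version` helper is hypothetical, and it assumes the distribution is registered under the name `RAG-X`, as in the `pip` commands above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if it is not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints the version if RAG-X is installed, otherwise None.
print(installed_version("RAG-X"))
```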

Setting Up Your Environment

Before using RAG-X, set the environment variables for your OpenAI API key and your Hugging Face token:

import os

os.environ['OPENAI_API_KEY'] = "YOUR_OPENAI_API_KEY"
os.environ['HF_TOKEN'] = "YOUR_HUGGINGFACE_TOKEN"
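A quick check before running anything expensive can catch a missing key early. The sketch below is not part of RAG-X; the `missing_env_vars` helper is hypothetical and only tests that the two variables named above are set to non-empty values.

```python
import os

REQUIRED_VARS = ("OPENAI_API_KEY", "HF_TOKEN")


def missing_env_vars(names=REQUIRED_VARS, environ=None):
    """Return the names of required variables that are unset or blank."""
    environ = os.environ if environ is None else environ
    return [name for name in names if not environ.get(name, "").strip()]


missing = missing_env_vars()
if missing:
    print("Missing environment variables:", ", ".join(missing))
```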

Usage

The following example shows how to use the RAG-X library to find the best RAG parameters for a PDF document:

from RAG_X.prag import parent_class

# Specify the path to your PDF document
file_path = "PATH_TO_YOUR_PDF_FILE"

# Initialize the RAG-X instance
my_instance = parent_class(file_path)

# Generate the optimal RAG parameters for your document
score_card = my_instance.get_best_param()

# Output the results
print(score_card)
