
MODE organizes documents into semantically coherent clusters and uses centroid-based retrieval to deliver scalable, efficient, and interpretable Retrieval-Augmented Generation without relying on large vector databases.

Project description

MODE: Mixture of Document Experts for RAG

Project Overview

MODE (Mixture of Document Experts) is an advanced framework that improves Retrieval-Augmented Generation (RAG) by integrating external knowledge retrieval with a mixture of specialized expert models.

Key features of MODE include:

  • Hierarchical Clustering: Organizes documents into semantically meaningful clusters.
  • Expert Models: Assigns specialized models to different document clusters for targeted expertise.
  • Centroid-Based Retrieval: Selects representative documents efficiently to enhance retrieval relevance.
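
To make the centroid-based routing idea concrete, here is a minimal, illustrative sketch — not MODE's actual implementation — in which each cluster is summarized by the mean of its embeddings and a query is routed to the cluster whose centroid it is most similar to. All names and the toy 2-d "embeddings" are invented for illustration:

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def centroid(vectors):
    # Component-wise mean of a list of vectors.
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def route_query(query_emb, clusters):
    """Return the index of the cluster whose centroid best matches the query."""
    scores = [cosine(query_emb, centroid(c)) for c in clusters]
    return max(range(len(scores)), key=scores.__getitem__)

# Two toy clusters of 2-d "embeddings".
clusters = [
    [[1.0, 0.1], [0.9, 0.0]],   # cluster 0: vectors along the x-axis
    [[0.1, 1.0], [0.0, 0.9]],   # cluster 1: vectors along the y-axis
]
print(route_query([0.0, 1.0], clusters))  # → 1
```

Because only one centroid comparison per cluster is needed to route a query, this scales with the number of clusters rather than the number of documents — the efficiency argument behind centroid-based retrieval.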

By combining these techniques, MODE delivers more accurate document retrieval and synthesis for query-based applications, improving answer quality while reducing retrieval noise. MODE is particularly well-suited for small to medium-sized document collections.

📄 Docs: https://mode-rag.readthedocs.io/en/latest/

Quick start

Installation

pip install mode_rag

Set your API key as an environment variable before running the examples:

import os

os.environ["OPENAI_API_KEY"] = "your-api-key"

1. Ingestion Code

This sample uses RecursiveCharacterTextSplitter and EmbeddingGenerator, but you can substitute your own chunking/embedding logic. The main inputs to ModeIngestion are chunks and embeddings:

# ========================================
# 📄 Sample Code: Ingestion
# ========================================
#
# 1. Load the PDF with `PyPDFLoader`.
# 2. Chunk the text with `RecursiveCharacterTextSplitter`.
# 3. Generate embeddings with `langchain_huggingface`.


## requirements
# pip install langchain_huggingface==0.1.2
# pip install langchain_community==0.3.4
# pip install pypdf==5.1.0


import os

os.environ["TOKENIZERS_PARALLELISM"] = "false"

from mode_rag import ModeIngestion, EmbeddingGenerator

## PDF reader
from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader("https://arxiv.org/pdf/1706.03762")
docs = loader.load()

print("downloaded the files")

from langchain.text_splitter import RecursiveCharacterTextSplitter

print("Chunking the PDF")
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
documents = text_splitter.split_documents(docs)
chunks = [doc.page_content for doc in documents]

print("Generating embeddings")
embed_gen = EmbeddingGenerator()
embeddings = embed_gen.generate_embeddings(chunks)
print("Embeddings generated")
main_processor = ModeIngestion(
    chunks=chunks,
    embedding=embeddings,
    persist_directory="attention",
)
main_processor.process_data(parallel=False)
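
Since the sample above suggests that ModeIngestion only needs chunks (a list of strings) and embeddings (a parallel list of float vectors), swapping in your own logic is straightforward. The sketch below is purely illustrative: simple_chunk is a bare-bones alternative to RecursiveCharacterTextSplitter, and toy_embedding is a deterministic hash-based stand-in for a real embedding model, not something you would use in practice:

```python
import hashlib

def simple_chunk(text, size=1000, overlap=200):
    """Fixed-size character chunks with overlap."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def toy_embedding(text, dim=8):
    """Deterministic pseudo-embedding derived from a hash (illustration only)."""
    digest = hashlib.sha256(text.encode()).digest()
    return [b / 255.0 for b in digest[:dim]]

text = "Attention is all you need. " * 100
chunks = simple_chunk(text, size=120, overlap=20)
embeddings = [toy_embedding(c) for c in chunks]
assert len(chunks) == len(embeddings)
```

The resulting chunks and embeddings lists can then be passed to ModeIngestion exactly as in the sample above.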

2. Inference Code

This sample uses ModeInference and EmbeddingGenerator, but you can substitute your own embedding method. The main inputs to ModeInference.invoke are query, query_embedding, and prompts:

# ========================================
# 📄 Sample Code: Inference
# ========================================
#
# 1. Load clustered data (`ModeInference`).
# 2. Generate query embedding (replaceable with your `embedding.py`).
# 3. Retrieve context and synthesize response with `ModelPrompt`.

import os

os.environ["TOKENIZERS_PARALLELISM"] = "false"


from mode_rag import (
    EmbeddingGenerator,
    ModeInference,
    ModelPrompt,
)


main_processor = ModeInference(
    persist_directory="attention",
)

print("====start======")
# Build the query, its embedding, and the prompt set

query = "What are the key mathematical operations involved in computing self-attention?"

embed_gen = EmbeddingGenerator()
embedding = embed_gen.generate_embedding(query)

prompts = ModelPrompt(
    ref_sys_prompt="Use the following pieces of context to answer the user's question.\nIf you don't know the answer, just say that you don't know.",
    ref_usr_prompt="context: ",
    syn_sys_prompt="You have been provided with a set of responses from various models to the latest user query. Your task is to synthesize these responses into a single, high-quality response. It is crucial to critically evaluate the information provided in these responses, recognizing that some of it may be biased or incorrect. Your response should not simply replicate the given answers but should offer a refined, accurate, and comprehensive reply to the instruction. Ensure your response is well-structured, coherent, and adheres to the highest standards of accuracy and reliability.\nResponses from models:",
    syn_usr_prompt="responses:",
)

response = main_processor.invoke(
    query,
    embedding,
    prompts,
    model_input={"temperature": 0.3, "model": "openai/gpt-4o-mini"},
    top_n_model=2,
)
print(response)
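
The internals of ModeInference.invoke are not shown here, but the syn_* prompts and top_n_model=2 suggest a pipeline in which the top-matching experts each produce a draft and a synthesizer model merges them. The sketch below is a plausible illustration of that final step, not MODE's actual code; build_synthesis_messages and the sample drafts are invented names:

```python
def build_synthesis_messages(syn_sys_prompt, syn_usr_prompt, drafts):
    """Assemble chat messages asking one model to merge several draft answers."""
    numbered = "\n".join(f"{i + 1}. {d}" for i, d in enumerate(drafts))
    return [
        {"role": "system", "content": syn_sys_prompt},
        {"role": "user", "content": f"{syn_usr_prompt}\n{numbered}"},
    ]

drafts = [
    "Self-attention uses scaled dot products of queries and keys.",
    "It relies on matrix multiplication followed by a softmax.",
]
messages = build_synthesis_messages("Synthesize the responses.", "responses:", drafts)
print(messages[1]["content"])
```

A message list in this shape could then be sent to the configured model (e.g. openai/gpt-4o-mini above) to produce the single synthesized answer.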

Contributing

We welcome contributions! Here’s how you can help:

  • Report Bugs: Submit issues on GitHub.
  • Suggest Features: Open an issue with your ideas.
  • Code Contributions: Fork, make changes, and submit a pull request.
  • Documentation: Update and enhance our docs.

License

This project is licensed under the MIT License.



Download files

Download the file for your platform.

Source Distribution

mode_rag-1.0.3.tar.gz (11.5 kB)

Uploaded Source

Built Distribution


mode_rag-1.0.3-py3-none-any.whl (14.1 kB)

Uploaded Python 3

File details

Details for the file mode_rag-1.0.3.tar.gz.

File metadata

  • Download URL: mode_rag-1.0.3.tar.gz
  • Upload date:
  • Size: 11.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.0.0 CPython/3.12.4 Darwin/24.4.0

File hashes

Hashes for mode_rag-1.0.3.tar.gz
  • SHA256: bb50c0da756e2f3949742112b5fabeade4a7748dc3aab5290209ea7daf79c6c3
  • MD5: 418812be58e392f096c7c21da50e65c9
  • BLAKE2b-256: 8467997177a490c44e55f335510a7490519b4e9fb5f7298bbabc329bbec34183


File details

Details for the file mode_rag-1.0.3-py3-none-any.whl.

File metadata

  • Download URL: mode_rag-1.0.3-py3-none-any.whl
  • Upload date:
  • Size: 14.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.0.0 CPython/3.12.4 Darwin/24.4.0

File hashes

Hashes for mode_rag-1.0.3-py3-none-any.whl
  • SHA256: 3a4b117036ad5ac70ed1423f6996ecfd9188f041340d0ce9695393b33685ea25
  • MD5: 629a291f3bf2291e765c1091a9248d2d
  • BLAKE2b-256: f6341be43017392436c5134d3a57d8815ab60b7f2d5d3bbf1e9bb00964e353ea

