A Multiagent Framework for Generating Multimodal Multihop QA Datasets for RAG Evaluation
MiRAGE: Multimodal Multihop RAG Evaluation Dataset Generator
MiRAGE is a multi-agent framework for generating high-quality, multimodal, multihop question-answer datasets for evaluating Retrieval-Augmented Generation (RAG) systems.
Key Features
- Multi-hop Context Completion: Iteratively expands incomplete chunks with relevant context
- Domain and Expert Role Detection: Automatic domain identification using BERTopic + LLM
- Multi-stage QA Pipeline: Generate, Select, Verify, Correct for quality assurance
- Multimodal Support: Handles text, tables, figures, and images
- Multiple Backend Support: Gemini, OpenAI, and local Ollama models
- Fully Parallelized: Thread and process pools for maximum throughput
Table of Contents
- Installation
- Quick Start
- Usage
- API Keys Setup
- Configuration
- Command Line Options
- Output Format
- Project Structure
- Contributing
- License
Installation
From PyPI
pip install mirage-benchmark
From Source
git clone https://github.com/ChandanKSahu/MiRAGE.git
cd MiRAGE
pip install -e .
With Optional Dependencies
pip install mirage-benchmark[gpu] # GPU support
pip install mirage-benchmark[pdf] # PDF processing
pip install mirage-benchmark[all] # All dependencies
Quick Start
Step 1: Set Up API Key
Choose one of the following backends:
Option A: Google Gemini (Recommended)
export GEMINI_API_KEY="your-gemini-api-key"
Option B: OpenAI
export OPENAI_API_KEY="your-openai-api-key"
Option C: Local Ollama (No API key needed)
# Install and start Ollama
ollama serve
ollama pull llama3
Step 2: Prepare Your Data
Place your documents in a folder:
mkdir -p data/my_documents
cp /path/to/your/*.pdf data/my_documents/
Step 3: Run MiRAGE
# Basic usage
python run_mirage.py --input data/my_documents --output output/my_dataset
# With API key as argument
python run_mirage.py -i data/my_documents -o output/my_dataset --api-key YOUR_API_KEY
# Using OpenAI
python run_mirage.py -i data/my_documents -o output/my_dataset --backend openai
# Using local Ollama
python run_mirage.py -i data/my_documents -o output/my_dataset --backend ollama
Step 4: Check Results
ls output/my_dataset/
# qa_deduplicated.json - Final QA dataset
# chunks.json - Semantic chunks
# evaluation_report.json - Quality metrics
Usage
Basic Usage
python run_mirage.py --input <INPUT_DIR> --output <OUTPUT_DIR>
With All Options
python run_mirage.py \
  --input data/documents \
  --output output/results \
  --backend gemini \
  --api-key YOUR_API_KEY \
  --num-qa-pairs 100 \
  --max-workers 4 \
  --verbose
Run Preflight Checks
Before running the full pipeline, verify your setup:
python run_mirage.py --preflight
Using Sample Dataset
A sample dataset is included for testing:
# Unzip sample data
unzip data/FinanceAnnualReports.zip -d data/sample/
# Run on sample
python run_mirage.py -i data/sample -o output/sample_results
API Keys Setup
Google Gemini
- Get API key from: https://makersuite.google.com/app/apikey
- Set environment variable:
export GEMINI_API_KEY="your-key-here"
Or create a file:
mkdir -p ~/.config/gemini
echo "your-key-here" > ~/.config/gemini/api_key.txt
OpenAI
- Get API key from: https://platform.openai.com/api-keys
- Set environment variable:
export OPENAI_API_KEY="your-key-here"
Ollama (Local - Free)
No API key needed! Just install Ollama:
# Install
curl -fsSL https://ollama.com/install.sh | sh
# Start server
ollama serve
# Pull models
ollama pull llama3 # For text
ollama pull llava # For vision
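Before a run, it helps to confirm the Ollama server is reachable and both models are pulled. Ollama exposes a `GET /api/tags` endpoint that lists local models; the helper below parses its JSON payload (the live request is sketched in the comment so the function itself needs no network):

```python
import json
from urllib.request import urlopen

def local_model_names(tags_payload: dict) -> list[str]:
    """Extract model names from an Ollama /api/tags response."""
    return [m["name"] for m in tags_payload.get("models", [])]

# Against a running server:
# with urlopen("http://localhost:11434/api/tags") as resp:
#     print(local_model_names(json.load(resp)))
```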
Configuration
Using config.yaml
Copy the example config and customize:
cp config.yaml.example config.yaml
Edit config.yaml:
backend:
  active: GEMINI  # GEMINI, OPENAI, or OLLAMA
  gemini:
    api_key_path: ~/.config/gemini/api_key.txt
    llm_model: gemini-2.0-flash
    vlm_model: gemini-2.0-flash
  openai:
    api_key_path: ~/.config/openai/api_key.txt
    llm_model: gpt-4o
    vlm_model: gpt-4o
  ollama:
    base_url: http://localhost:11434
    llm_model: llama3
    vlm_model: llava
paths:
  input_pdf_dir: data/documents
  output_dir: output/results
qa_generation:
  target_qa_pairs: 100
  max_workers: 4
Then run:
python run_mirage.py --config config.yaml
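If you load config.yaml yourself (e.g. with PyYAML's `yaml.safe_load`), a quick sanity check of the sections shown in the example above can catch typos before a long run. This validator is a sketch against that example config, not part of the MiRAGE API:

```python
def check_config(cfg: dict) -> None:
    """Raise if required sections/keys from the example config are missing."""
    for section in ("backend", "paths", "qa_generation"):
        if section not in cfg:
            raise KeyError(f"missing section: {section}")
    active = cfg["backend"].get("active", "").lower()
    if active not in ("gemini", "openai", "ollama"):
        raise ValueError(f"unknown backend: {active!r}")
    if active not in cfg["backend"]:
        raise KeyError(f"no settings block for backend: {active}")

cfg = {
    "backend": {"active": "GEMINI",
                "gemini": {"llm_model": "gemini-2.0-flash"}},
    "paths": {"input_pdf_dir": "data/documents", "output_dir": "output/results"},
    "qa_generation": {"target_qa_pairs": 100, "max_workers": 4},
}
check_config(cfg)  # passes silently
```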
Command Line Options
| Option | Short | Description | Default |
|---|---|---|---|
| `--input` | `-i` | Input directory with documents | Required |
| `--output` | `-o` | Output directory for results | Required |
| `--api-key` | `-k` | API key for LLM backend | From env |
| `--backend` | `-b` | Backend: gemini, openai, ollama | gemini |
| `--model` | | Model name | Auto |
| `--config` | `-c` | Config file path | config.yaml |
| `--num-qa-pairs` | | Target QA pairs to generate | 100 |
| `--max-workers` | | Parallel workers | 4 |
| `--preflight` | | Run preflight checks only | - |
| `--skip-preflight` | | Skip preflight checks | - |
| `--skip-pdf-processing` | | Skip PDF conversion | - |
| `--skip-chunking` | | Skip chunking step | - |
| `--verbose` | `-v` | Verbose output | - |
| `--version` | | Show version | - |
| `--help` | `-h` | Show help | - |
Output Format
Generated Files
output/my_dataset/
├── markdown/ # Converted markdown files
├── chunks.json # Semantic chunks
├── qa_dataset.json # Raw QA pairs
├── qa_deduplicated.json # Final deduplicated QA pairs
├── evaluation_report.json # Quality metrics
└── run_config.json # Run configuration
QA Dataset Structure
{
  "chunk_id": 1,
  "question": "What is the company's revenue growth?",
  "answer": "The company achieved 15% revenue growth...",
  "context_chunks": [...],
  "hop_count": 2,
  "relevance_score": "9",
  "difficulty_score": "7",
  "expert_persona": "Financial Analyst",
  "domain": "Finance"
}
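Each record follows the schema above; note that `relevance_score` and `difficulty_score` are stored as strings, so cast them before numeric filtering. A sketch that keeps only multi-hop, high-relevance pairs (field names are taken from the example record; the thresholds and the assumption that the file holds a list of such records are the author's of this sketch):

```python
import json

def filter_qa(records: list[dict], min_hops: int = 2,
              min_relevance: int = 8) -> list[dict]:
    """Keep multi-hop QA pairs whose relevance meets the threshold."""
    return [r for r in records
            if r["hop_count"] >= min_hops
            and int(r["relevance_score"]) >= min_relevance]

# Assuming qa_deduplicated.json is a list of records like the one above:
# with open("output/my_dataset/qa_deduplicated.json") as f:
#     print(len(filter_qa(json.load(f))))
```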
Project Structure
MiRAGE/
├── src/mirage/ # Main package
│ ├── core/ # LLM interfaces, prompts, config
│ ├── embeddings/ # Embedding models, rerankers
│ ├── pipeline/ # PDF processing, QA generation
│ ├── evaluation/ # Metrics
│ └── utils/ # Utilities
├── data/ # Your documents
│ └── documents/ # Input folder
├── output/ # Generated results
├── config.yaml.example # Example configuration
├── run_mirage.py # Main entry point
└── README.md
Examples
Generate QA from PDFs
# Using Gemini
export GEMINI_API_KEY="your-key"
python run_mirage.py -i data/pdfs -o output/qa_dataset
# Using OpenAI
export OPENAI_API_KEY="your-key"
python run_mirage.py -i data/pdfs -o output/qa_dataset --backend openai
# Using Ollama (local, free)
python run_mirage.py -i data/pdfs -o output/qa_dataset --backend ollama
Generate More QA Pairs
python run_mirage.py -i data/documents -o output/large_dataset --num-qa-pairs 500
Use More Workers
python run_mirage.py -i data/documents -o output/fast_run --max-workers 8
Skip Already Processed Steps
# If you already have markdown files
python run_mirage.py -i data/documents -o output/results --skip-pdf-processing
# If you already have chunks
python run_mirage.py -i data/documents -o output/results --skip-chunking
Troubleshooting
API Key Issues
# Check if API key is set
echo $GEMINI_API_KEY
# Set it if missing
export GEMINI_API_KEY="your-key"
Import Errors
# Reinstall package
pip install -e .
Preflight Check Failures
# Run verbose preflight
python run_mirage.py --preflight --verbose
Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Submit a pull request
See CONTRIBUTING.md for details.
Citation
@software{mirage2024,
  title  = {MiRAGE: A Multiagent Framework for Generating Multimodal Multihop QA Datasets},
  author = {MiRAGE Authors},
  year   = {2024},
  url    = {https://github.com/ChandanKSahu/MiRAGE}
}
License
Apache License 2.0 - see LICENSE
Acknowledgments