A modular framework for evaluating and optimizing RAG pipelines.
Project description
Ragmint
Ragmint (Retrieval-Augmented Generation Model Inspection & Tuning) is a modular, developer-friendly Python library for evaluating, optimizing, and tuning RAG (Retrieval-Augmented Generation) pipelines.
It provides a complete toolkit for retriever selection, embedding model tuning, automated RAG evaluation, and config-driven prebuilding of pipelines with support for Optuna-based Bayesian optimization, Auto-RAG tuning, chunking, and explainability through Gemini or Claude.
✨ Features
- ✅ Automated hyperparameter optimization (Grid, Random, Bayesian via Optuna).
- 🤖 Auto-RAG Tuner — dynamically recommends retriever–embedding pairs based on corpus size and document statistics, suggests multiple chunk sizes with overlaps, and can test configurations to identify the best-performing RAG setup.
- 🧠 Explainability Layer — interprets RAG performance via Gemini or Claude APIs.
- 🏆 Leaderboard Tracking — stores and ranks experiment runs via JSON or external DB.
- 🔍 Built-in RAG evaluation metrics — faithfulness, recall, BLEU, ROUGE, latency.
- 📦 Chunking system — automatic or configurable
chunk_sizeandoverlapfor documents with multiple suggested pairs. - ⚙️ Retrievers — FAISS, Chroma, scikit-learn.
- 🧩 Embeddings — Hugging Face.
- 💾 Caching, experiment tracking, and reproducibility out of the box.
- 🧰 Clean modular structure for easy integration in research and production setups.
- 🏗️ Langchain Prebuilder — prepares pipelines, applies chunking, embeddings, and vector store creation automatically.
- ⚙️ Config Adapter (LangchainConfigAdapter) — normalizes configuration, fills defaults, validates retrievers.
🚀 Quick Start
1️⃣ Installation
git clone https://github.com/andyolivers/ragmint.git
cd ragmint
pip install -e .
The
-eflag installs Ragmint in editable (development) mode.
Requires Python ≥ 3.9.
2️⃣ Run a RAG Optimization Experiment
python ragmint/main.py --config configs/default.yaml --search bayesian
Example configs/default.yaml:
retriever: faiss
embedding_model: text-embedding-3-small
chunk_size: 500
overlap: 100
reranker:
mode: mmr
lambda_param: 0.5
optimization:
search_method: bayesian
n_trials: 20
3️⃣ Manual Pipeline Usage
from ragmint.prebuilder import PreBuilder
from ragmint.tuner import RAGMint
# Prebuild pipeline (chunking, embeddings, vector store)
prebuilder = PreBuilder(
docs_path="data/docs/",
config_path="configs/default.yaml"
)
pipeline = prebuilder.build_pipeline()
# Initialize RAGMint with prebuilt components
rag = RAGMint(pipeline=pipeline)
# Run optimization
best, results = rag.optimize(validation_set=None, metric="faithfulness", trials=3)
print("Best configuration:", best)
🧩 Embeddings and Retrievers
Ragmint supports a flexible set of embeddings and retrievers, allowing you to adapt easily to various RAG architectures.
🧩 Chunking System
- Automatically splits documents into chunks with
chunk_sizeandoverlapparameters. - Supports default values if not provided in configuration.
- Optimized for downstream retrieval and embeddings.
- Enables adaptive chunking strategies in future releases.
🧩 Langchain Config Adapter
- Ensures consistent configuration across pipeline components.
- Normalizes retriever and embedding names (e.g.,
faiss,sentence-transformers/...). - Adds default chunk parameters when missing.
- Validates retriever backends and raises clear errors for unsupported options.
🧩 Langchain Prebuilder
Automates pipeline preparation:
- Reads documents
- Applies chunking
- Creates embeddings
- Initializes retriever / vector store
- Returns ready-to-use pipeline** for RAGMint or custom usage.
🔤 Available Embeddings (Hugging Face)
You can select from the following models:
sentence-transformers/all-MiniLM-L6-v2— lightweight, general-purposesentence-transformers/all-mpnet-base-v2— higher accuracy, slowerBAAI/bge-base-en-v1.5— multilingual, dense embeddingsintfloat/multilingual-e5-base— ideal for multilingual corpora
Configuration Example
Use the following format in your config file to specify the embedding model:
embedding_model: sentence-transformers/all-MiniLM-L6-v2
🔍 Available Retrievers
Ragmint integrates multiple retrieval backends to suit different needs:
| Retriever | Description |
|---|---|
| FAISS | Fast vector similarity search; efficient for dense embeddings |
| Chroma | Persistent vector DB; works well for incremental updates |
| scikit-learn (NearestNeighbors) | Lightweight, zero-dependency local retriever |
Configuration Example
To specify the retriever in your configuration file, use the following format:
retriever: faiss
🧪 Dataset Options
Ragmint can automatically load evaluation datasets for your RAG pipeline:
| Mode | Example | Description |
|---|---|---|
| 🧱 Default | validation_set=None |
Uses built-in experiments/validation_qa.json |
| 📁 Custom File | validation_set="data/my_eval.json" |
Load your own QA dataset (JSON or CSV) |
| 🌐 Hugging Face Dataset | validation_set="squad" |
Automatically downloads benchmark datasets (requires pip install datasets) |
Example
from ragmint.tuner import RAGMint
ragmint = RAGMint(
docs_path="data/docs/",
retrievers=["faiss", "chroma"],
embeddings=["text-embedding-3-small"],
rerankers=["mmr"],
)
# Use built-in default
ragmint.optimize(validation_set=None)
# Use Hugging Face benchmark
ragmint.optimize(validation_set="squad")
# Use your own dataset
ragmint.optimize(validation_set="data/custom_qa.json")
🧠 Auto-RAG Tuner
The AutoRAGTuner automatically analyzes your corpus and recommends retriever–embedding combinations based on corpus statistics (size and average document length). It also suggests multiple chunk sizes with overlaps to improve retrieval performance.
Beyond recommendations, it can run full end-to-end testing of the suggested configurations and identify the best-performing RAG setup for your dataset.
from ragmint.autotuner import AutoRAGTuner
# Initialize with your documents
tuner = AutoRAGTuner(docs_path="data/docs/")
# Recommend configurations and suggest chunk sizes
recommendation = tuner.recommend(num_chunk_pairs=5)
print("Initial recommendation:", recommendation)
# Run full auto-tuning on validation set
best_config, results = tuner.auto_tune(validation_set="data/validation.json", trials=5)
print("Best configuration after testing:", best_config)
print("All trial results:", results)
🏆 Leaderboard Tracking
Track and visualize your best experiments across runs.
from ragmint.leaderboard import Leaderboard
lb = Leaderboard("experiments/leaderboard.json")
lb.add_entry({"trial": 1, "faithfulness": 0.87, "latency": 0.12})
lb.show_top(3)
🧠 Explainability with Gemini / Claude
Compare two RAG configurations and receive natural language insights on why one performs better.
from ragmint.explainer import explain_results
config_a = {"retriever": "FAISS", "embedding_model": "OpenAI"}
config_b = {"retriever": "Chroma", "embedding_model": "SentenceTransformers"}
explanation = explain_results(config_a, config_b, model="gemini")
print(explanation)
Set your API keys in a
.envfile or via environment variables:export GEMINI_API_KEY="your_gemini_key" export ANTHROPIC_API_KEY="your_claude_key"
🧩 Folder Structure
ragmint/
├── core/
│ ├── pipeline.py
│ ├── retriever.py
│ ├── reranker.py
│ ├── embeddings.py
│ ├── chunking.py
│ └── evaluation.py
├── integration/
│ ├── config_adapter.py
│ └── langchain_prebuilder.py
├── autotuner.py
├── explainer.py
├── leaderboard.py
├── tuner.py
├── utils/
├── configs/
├── experiments/
├── tests/
└── main.py
🧪 Running Tests
pytest -v
To include integration tests with Gemini or Claude APIs:
pytest -m integration
⚙️ Configuration via pyproject.toml
Your pyproject.toml includes all required dependencies:
[project]
name = "ragmint"
version = "0.1.0"
dependencies = [
# Core ML + Embeddings
"numpy<2.0.0",
"pandas>=2.0",
"scikit-learn>=1.3",
"sentence-transformers>=2.2.2",
# Retrieval backends
"chromadb>=0.4",
"faiss-cpu; sys_platform != 'darwin'", # For Linux/Windows
"faiss-cpu==1.7.4; sys_platform == 'darwin'", # Optional fix for macOS MPS
"rank-bm25>=0.2.2", # For BM25 retriever
# Optimization & evaluation
"optuna>=3.0",
"tqdm",
"colorama",
# RAG evaluation and data utils
"pyyaml",
"python-dotenv",
# Explainability and LLM APIs
"openai>=1.0.0",
"google-generativeai>=0.8.0",
"anthropic>=0.25.0",
# Integration / storage
"supabase>=2.4.0",
# Testing
"pytest",
# LangChain integration layer
"langchain>=0.2.5",
"langchain-community>=0.2.5",
"langchain-text-splitters>=0.2.1"
]
📊 Example Experiment Workflow
- Define your retriever, embedding, and reranker setup
- Launch optimization (Grid, Random, Bayesian) or AutoTune
- Compare performance with explainability
- Persist results to leaderboard for later inspection
🧬 Architecture Overview
flowchart TD
A[Query] --> B[Chunking / Preprocessing]
B --> C[Embedder]
C --> D[Retriever]
D --> E[Reranker]
E --> F[Generator]
F --> G[Evaluation]
G --> H[AutoRAGTuner / Optuna]
H --> I[Suggested Configs & Chunk Sizes]
I --> J[Best Configuration]
J -->|Update Params| C
📘 Example Output
[INFO] Starting Auto-RAG Tuning
[INFO] Suggested retriever=Chroma, embedding_model=sentence-transformers/all-MiniLM-L6-v2
[INFO] Suggested chunk-size candidates: [(380, 80), (420, 100), (350, 70), (400, 90), (360, 75)]
[INFO] Running full evaluation on validation set with 5 trials
[INFO] Trial 1 finished: faithfulness=0.82, latency=0.40s
[INFO] Trial 2 finished: faithfulness=0.85, latency=0.44s
...
[INFO] Best configuration after testing: {'retriever': 'Chroma', 'embedding_model': 'sentence-transformers/all-MiniLM-L6-v2', 'chunk_size': 400, 'overlap': 90, 'strategy': 'sentence'}
🧠 Why Ragmint?
- Built for RAG researchers, AI engineers, and LLM ops
- Works with LangChain, LlamaIndex, or standalone setups
- Designed for extensibility — plug in your own retrievers, models, or metrics
- Integrated explainability and leaderboard modules for research and production
⚖️ License
Licensed under the Apache License 2.0 — free for personal, research, and commercial use.
👤 Author
André Oliveira
andyolivers.com
Data Scientist | AI Engineer
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file ragmint-0.4.0.tar.gz.
File metadata
- Download URL: ragmint-0.4.0.tar.gz
- Upload date:
- Size: 31.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.18
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
2349f456118a560cf4cef8c0b7be0906391ee5856833c73dbbbad5d65dddb50b
|
|
| MD5 |
83b46e5249fcef2e10aa706eed47be1e
|
|
| BLAKE2b-256 |
ee5801717962bf82789c9b395712b3ad6f0875ef97725b3bf7e096e4a9b768b5
|
File details
Details for the file ragmint-0.4.0-py3-none-any.whl.
File metadata
- Download URL: ragmint-0.4.0-py3-none-any.whl
- Upload date:
- Size: 42.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.18
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
0eda364ce22e6a4320212adafd74c046f3982760cb593d6427c1483a605993ec
|
|
| MD5 |
0fa1a1c10ca119978e9741a682e158a0
|
|
| BLAKE2b-256 |
45ba1f72b06649bf5ec9eb0c9b22c3f830f498dfce64eb8a6261df80fe256c1d
|