A modular framework for evaluating and optimizing RAG pipelines.

These details have not been verified by PyPI

Project links

Project description

Ragmint

Ragmint Banner

Python License Tests Optuna Status PyPI Docs Gradio

Ragmint (Retrieval-Augmented Generation Model Inspection & Tuning) is a modular, developer-friendly Python library for evaluating, optimizing, and tuning RAG (Retrieval-Augmented Generation) pipelines.

It provides a complete toolkit for retriever selection, embedding model tuning, automated RAG evaluation, and config-driven prebuilding of pipelines with support for Optuna-based Bayesian optimization, Auto-RAG tuning, chunking, and explainability through Gemini or Claude.

✨ Features

✅ Automated hyperparameter optimization (Grid, Random, Bayesian via Optuna).
🤖 Auto-RAG Tuner — dynamically recommends retriever–embedding pairs based on corpus size and document statistics, suggests multiple chunk sizes with overlaps, and can test configurations to identify the best-performing RAG setup.
🧮 Validation QA Generator — automatically creates QA datasets from a corpus for evaluating and tuning RAG pipelines when no labeled data is available.
🧠 Explainability Layer — interprets RAG performance via Gemini or Claude APIs.
🏆 Leaderboard Tracking — stores and ranks experiment runs via JSON or external DB.
🔍 Built-in RAG evaluation metrics — faithfulness, recall, BLEU, ROUGE, latency.
📦 Chunking system — automatic or configurable chunk_size and overlap for documents with multiple suggested pairs.
⚙️ Retrievers — FAISS, Chroma, scikit-learn.
🧩 Embeddings — Hugging Face.
💾 Caching, experiment tracking, and reproducibility out of the box.
🧰 Clean modular structure for easy integration in research and production setups.
🏗️ Langchain Prebuilder — prepares pipelines, applies chunking, embeddings, and vector store creation automatically.
⚙️ Config Adapter (LangchainConfigAdapter) — normalizes configuration, fills defaults, validates retrievers.

🚀 Quick Start

Installation

git clone https://github.com/andyolivers/ragmint.git
cd ragmint
pip install -e .
python -m ragmint.app

The -e flag installs Ragmint in editable (development) mode.
Requires Python ≥ 3.9.

Installation via PyPI

pip install ragmint

2️⃣ Run a RAG Optimization Experiment

python ragmint/main.py --config configs/default.yaml --search bayesian

Example configs/default.yaml:

retriever: faiss
embedding_model: text-embedding-3-small
chunk_size: 500
overlap: 100
reranker:
  mode: mmr
  lambda_param: 0.5
optimization:
  search_method: bayesian
  n_trials: 20

3️⃣ Manual Pipeline Usage

from ragmint.prebuilder import PreBuilder
from ragmint.tuner import RAGMint

# Prebuild pipeline (chunking, embeddings, vector store)
prebuilder = PreBuilder(
    docs_path="data/docs/",
    config_path="configs/default.yaml"
)
pipeline = prebuilder.build_pipeline()

# Initialize RAGMint with prebuilt components
rag = RAGMint(pipeline=pipeline)

# Run optimization
best, results = rag.optimize(validation_set=None, metric="faithfulness", trials=3)
print("Best configuration:", best)

🧩 Embeddings and Retrievers

Ragmint supports a flexible set of embeddings and retrievers, allowing you to adapt easily to various RAG architectures.

🧩 Chunking System

Automatically splits documents into chunks with chunk_size and overlap parameters.
Supports default values if not provided in configuration.
Optimized for downstream retrieval and embeddings.
Enables adaptive chunking strategies in future releases.

🧮 Validation QA Generator

The QA Generator module automatically creates question–answer (QA) validation datasets from any corpus of .txt documents.
This dataset can be used to evaluate and tune RAG pipelines inside Ragmint when no labeled data is available.

✨ Key Capabilities

🔁 Batch processing — splits large corpora into batches to prevent token overflows and API timeouts.
🧠 Topic-aware question estimation — dynamically determines how many questions to generate per document based on:
- Document length (logarithmic scaling)
- Topic diversity (via SentenceTransformer + KMeans clustering)
🤖 LLM-powered QA synthesis — generates factual QA pairs using Gemini or Claude models.
💾 Automatic JSON export — saves the generated dataset to experiments/validation_qa.json (configurable).

⚙️ Usage

You can run the generator directly from the command line:

python -m ragmint.qa_generator --density 0.005

💡 Example: Using in Python

from ragmint.qa_generator import generate_validation_qa

generate_validation_qa(
    docs_path="data/docs",                          # Folder with .txt documents
    output_path="experiments/validation_qa.json",   # Output JSON file
    llm_model="gemini-2.5-flash-lite",              # or "claude-3-opus-20240229"
    batch_size=5,                                   # Number of docs per LLM call
    sleep_between_batches=2,                        # Wait time between calls (seconds)
    min_q=3,                                        # Minimum questions per doc
    max_q=25                                        # Maximum questions per doc
)

✅ The generator supports both Gemini and Claude models.
Set your API key in a .env file or via environment variables:

export GOOGLE_API_KEY="your_gemini_key"
export ANTHROPIC_API_KEY="your_claude_key"

🧩 Langchain Config Adapter

Ensures consistent configuration across pipeline components.
Normalizes retriever and embedding names (e.g., faiss, sentence-transformers/...).
Adds default chunk parameters when missing.
Validates retriever backends and raises clear errors for unsupported options.

🧩 Langchain Prebuilder

Automates pipeline preparation:

Reads documents
Applies chunking
Creates embeddings
Initializes retriever / vector store
Returns ready-to-use pipeline** for RAGMint or custom usage.

🔤 Available Embeddings (Hugging Face)

You can select from the following models:

sentence-transformers/all-MiniLM-L6-v2 — lightweight, general-purpose
sentence-transformers/all-mpnet-base-v2 — higher accuracy, slower
BAAI/bge-base-en-v1.5 — multilingual, dense embeddings
intfloat/multilingual-e5-base — ideal for multilingual corpora

Configuration Example

Use the following format in your config file to specify the embedding model:

embedding_model: sentence-transformers/all-MiniLM-L6-v2

🔍 Available Retrievers

Ragmint integrates multiple retrieval backends to suit different needs:

Retriever	Description
FAISS	Fast vector similarity search; efficient for dense embeddings
Chroma	Persistent vector DB; works well for incremental updates
scikit-learn (NearestNeighbors)	Lightweight, zero-dependency local retriever

Configuration Example

To specify the retriever in your configuration file, use the following format:

retriever: faiss

🧪 Dataset Options

Ragmint can automatically load evaluation datasets for your RAG pipeline:

Mode	Example	Description
🧱 Default	`validation_set=None`	Uses built-in `experiments/validation_qa.json`
📁 Custom File	`validation_set="data/my_eval.json"`	Load your own QA dataset (JSON or CSV)
🌐 Hugging Face Dataset	`validation_set="squad"`	Automatically downloads benchmark datasets (requires `pip install datasets`)

Example

from ragmint.tuner import RAGMint

ragmint = RAGMint(
    docs_path="data/docs/",
    retrievers=["faiss", "chroma"],
    embeddings=["text-embedding-3-small"],
    rerankers=["mmr"],
)

# Use built-in default
ragmint.optimize(validation_set=None)

# Use Hugging Face benchmark
ragmint.optimize(validation_set="squad")

# Use your own dataset
ragmint.optimize(validation_set="data/custom_qa.json")

🧠 Auto-RAG Tuner

The AutoRAGTuner automatically analyzes your corpus and recommends retriever–embedding combinations based on corpus statistics (size and average document length). It also suggests multiple chunk sizes with overlaps to improve retrieval performance.

Beyond recommendations, it can run full end-to-end testing of the suggested configurations and identify the best-performing RAG setup for your dataset.

from ragmint.autotuner import AutoRAGTuner

# Initialize with your documents
tuner = AutoRAGTuner(docs_path="data/docs/")

# Recommend configurations and suggest chunk sizes
recommendation = tuner.recommend(num_chunk_pairs=5)
print("Initial recommendation:", recommendation)

# Run full auto-tuning on validation set
best_config, results = tuner.auto_tune(validation_set="data/validation.json", trials=5)
print("Best configuration after testing:", best_config)
print("All trial results:", results)

🧠 Gradio Dashboard

Ragmint includes a visual dashboard to AutoTune and analyze RAG pipelines.

Ragmint Gradio App Preview

🏆 Leaderboard Tracking

Track and visualize your best experiments across runs.

from ragmint.leaderboard import Leaderboard

# Initialize local leaderboard
leaderboard = Leaderboard(storage_path="leaderboard.jsonl")

# Retrieve top 5 runs
print("\n🏅 Top 5 Experiments:")
for result in leaderboard.top_results(limit=5):
    print(f"{result['run_id']} | Score: {result['best_score']:.2f} | Model: {result['model']}")

🧠 Explainability with Gemini / Claude

Compare RAG configurations and receive natural language insights on why one performs better.

from ragmint.autotuner import AutoRAGTuner
from ragmint.explainer import explain_results

tuner = AutoRAGTuner(docs_path="data/docs/")
best, results = tuner.auto_tune(
    validation_set='data/docs/validation_qa.json',
    metric="faithfulness",
    trials=5,
    search_type='bayesian'
)

analysis = explain_results(best, results, corpus_stats=tuner.corpus_stats)
print(analysis)

Set your API keys in a .env file or via environment variables:
export GEMINI_API_KEY="your_gemini_key"
export ANTHROPIC_API_KEY="your_claude_key"

🧩 Folder Structure

ragmint/
├── core/
│   ├── pipeline.py
│   ├── retriever.py
│   ├── reranker.py
│   ├── embeddings.py
│   ├── chunking.py
│   └── evaluation.py
├── integration/
│   ├── config_adapter.py
│   └── langchain_prebuilder.py
├── autotuner.py
├── explainer.py
├── leaderboard.py
├── tuner.py
├── utils/
├── configs/
├── experiments/
├── tests/
└── main.py

🧪 Running Tests

pytest -v

To include integration tests with Gemini or Claude APIs:

pytest -m integration

⚙️ Configuration via `pyproject.toml`

Your pyproject.toml includes all required dependencies:

[project]
name = "ragmint"
version = "0.1.0"
dependencies = [
  # Core ML + Embeddings
  "numpy<2.0.0",
  "pandas>=2.0",
  "scikit-learn>=1.3",
  "sentence-transformers>=2.2.2",

  # Retrieval backends
  "chromadb>=0.4",
  "faiss-cpu; sys_platform != 'darwin'",       # For Linux/Windows
  "faiss-cpu==1.7.4; sys_platform == 'darwin'", # Optional fix for macOS MPS
  "rank-bm25>=0.2.2",                          # For BM25 retriever

  # Optimization & evaluation
  "optuna>=3.0",
  "tqdm",
  "colorama",

  # RAG evaluation and data utils
  "pyyaml",
  "python-dotenv",

  # Explainability and LLM APIs
  "openai>=1.0.0",
  "google-generativeai>=0.8.0",
  "anthropic>=0.25.0",

  # Integration / storage
  "supabase>=2.4.0",

  # Testing
  "pytest",

  # LangChain integration layer
  "langchain>=0.2.5",
  "langchain-community>=0.2.5",
  "langchain-text-splitters>=0.2.1"
]

📊 Example Experiment Workflow

Define your retriever, embedding, and reranker setup
Launch optimization (Grid, Random, Bayesian) or AutoTune
Compare performance with explainability
Persist results to leaderboard for later inspection

🧬 Architecture Overview

flowchart TD
    A[Query] --> B[Chunking / Preprocessing]
    B --> C[Embedder]
    C --> D[Retriever]
    D --> E[Reranker]
    E --> F[Generator]
    F --> G[Evaluation]
    G --> H[AutoRAGTuner / Optuna]
    H --> I[Suggested Configs & Chunk Sizes]
    I --> J[Best Configuration]
    J -->|Update Params| C

📘 Example Output

[INFO] Starting Auto-RAG Tuning
[INFO] Suggested retriever=Chroma, embedding_model=sentence-transformers/all-MiniLM-L6-v2
[INFO] Suggested chunk-size candidates: [(380, 80), (420, 100), (350, 70), (400, 90), (360, 75)]
[INFO] Running full evaluation on validation set with 5 trials
[INFO] Trial 1 finished: faithfulness=0.82, latency=0.40s
[INFO] Trial 2 finished: faithfulness=0.85, latency=0.44s
...
[INFO] Best configuration after testing: {'retriever': 'Chroma', 'embedding_model': 'sentence-transformers/all-MiniLM-L6-v2', 'chunk_size': 400, 'overlap': 90, 'strategy': 'sentence'}

🧾 Citation

If you use Ragmint in your research, please cite:

@software{oliveira2025ragmint,
  author = {André Oliveira},
  title = {Ragmint: Retrieval-Augmented Generation Model Inspection & Tuning},
  year = {2025},
  url = {https://github.com/andyolivers/ragmint},
  license = {Apache-2.0}
}

_{Built with ❤️ by André Oliveira | Apache 2.0 License}

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

0.4.12

Nov 21, 2025

0.4.11

Nov 21, 2025

This version

0.4.10

Nov 20, 2025

0.4.9

Nov 15, 2025

0.4.8

Nov 15, 2025

0.4.7

Nov 10, 2025

0.4.6

Nov 10, 2025

0.4.5

Nov 10, 2025

0.4.4

Nov 9, 2025

0.4.3

Nov 7, 2025

0.4.2

Nov 7, 2025

0.4.1

Nov 7, 2025

0.4.0

Nov 7, 2025

0.3.1

Nov 7, 2025

0.3.0

Nov 6, 2025

0.2.3

Nov 5, 2025

0.2.2

Nov 5, 2025

0.2.1

Nov 5, 2025

0.2.0

Nov 5, 2025

0.1.1

Nov 4, 2025

0.1.0

Nov 3, 2025

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ragmint-0.4.10.tar.gz (44.4 kB view details)

Uploaded Nov 20, 2025 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

ragmint-0.4.10-py3-none-any.whl (56.4 kB view details)

Uploaded Nov 20, 2025 Python 3

File details

Details for the file ragmint-0.4.10.tar.gz.

File metadata

Download URL: ragmint-0.4.10.tar.gz
Upload date: Nov 20, 2025
Size: 44.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.18

File hashes

Hashes for ragmint-0.4.10.tar.gz
Algorithm	Hash digest
SHA256	`df2a33289617ea8e5bcb43301933a4d909b943e3f60442b704dde11ed8a3e3a5`
MD5	`84f3ab40d79991b7adb9a8ea22b9b6e5`
BLAKE2b-256	`8e9fcfe11bd7db2b345b45e5faee4f48fb1091c850387db17265e32f02abae70`

See more details on using hashes here.

File details

Details for the file ragmint-0.4.10-py3-none-any.whl.

File metadata

Download URL: ragmint-0.4.10-py3-none-any.whl
Upload date: Nov 20, 2025
Size: 56.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.18

File hashes

Hashes for ragmint-0.4.10-py3-none-any.whl
Algorithm	Hash digest
SHA256	`c3f8d24cd54fa659dfc04d3561d58825077716e3c9df4d99cb567a6f39dfbdbc`
MD5	`14cb96dd7f75caed0492009b884221d3`
BLAKE2b-256	`c80856589674ee4a3dc518d4b87135c1a961a53634535d5e2f63e4073965eb0b`

See more details on using hashes here.

ragmint 0.4.10

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Project description

Ragmint

✨ Features

🚀 Quick Start

Installation

Installation via PyPI

2️⃣ Run a RAG Optimization Experiment

3️⃣ Manual Pipeline Usage

🧩 Embeddings and Retrievers

🧩 Chunking System

🧮 Validation QA Generator

✨ Key Capabilities

⚙️ Usage

💡 Example: Using in Python

🧩 Langchain Config Adapter

🧩 Langchain Prebuilder

🔤 Available Embeddings (Hugging Face)

Configuration Example

🔍 Available Retrievers

Configuration Example

🧪 Dataset Options

Example

🧠 Auto-RAG Tuner

🧠 Gradio Dashboard

🏆 Leaderboard Tracking

🧠 Explainability with Gemini / Claude

🧩 Folder Structure

🧪 Running Tests

⚙️ Configuration via pyproject.toml

📊 Example Experiment Workflow

🧬 Architecture Overview

📘 Example Output

🧾 Citation

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes

⚙️ Configuration via `pyproject.toml`