A CLI tool that automates RAG hyperparameter optimization using Bayesian search and synthetic data generation

AutoRAG-Optim

Automatically find the optimal RAG configuration for your database.

Python 3.10+ | License: MIT

AutoRAG-Optim is a CLI tool that automates RAG (Retrieval-Augmented Generation) hyperparameter optimization. Connect your database, run optimization, and get the best RAG configuration in minutes to hours.

Features

  • 🔍 Automated Optimization - Bayesian or Grid Search to find optimal RAG parameters
  • 📊 5 Configurable Parameters - chunk_size, chunk_overlap, embedding_model, top_k, temperature
  • 🗄️ Local Vector Store - ChromaDB (no API key needed, runs locally)
  • 📝 Synthetic Q&A Generation - Auto-generate test questions from your documents
  • 📈 RAGAS-like Metrics - Evaluate answer relevancy, faithfulness, answer similarity, and context recall
  • 🗄️ Multi-Database Support - Supabase, MongoDB, PostgreSQL
  • 🤖 Multi-LLM Support - Groq, OpenAI, OpenRouter
  • 📋 Rich CLI Output - Beautiful terminal output with progress bars and tables

Installation

# Clone the repository
git clone https://github.com/yourusername/autorag-optim.git
cd autorag-optim

# Install with uv
uv sync

Quick Start

1. Create Configuration

Create a config.yaml file:

database:
  type: supabase
  url: https://your-project.supabase.co
  key: your-supabase-anon-key
  bucket: pdf
  folder: pdf

llm:
  provider: groq
  model: null  # Uses default: llama-3.3-70b-versatile

api_keys:
  groq: your-groq-api-key

# RAG Parameter Search Space (all parameters are lists to search over)
rag:
  chunk_size: [256, 500, 1024]
  chunk_overlap: [25, 50, 100]
  embedding_model:
    - all-MiniLM-L6-v2
  top_k: [3, 5, 10]
  temperature: [0.3, 0.7, 1.0]

optimization:
  strategy: bayesian
  num_experiments: 20
  test_questions: 50

evaluation:
  method: custom

2. Run Optimization

autorag optimize --config config.yaml

3. View Results

autorag results --show-report

CLI Commands

Command            Description
autorag optimize   Run RAG optimization on your database
autorag results    Display optimization results
autorag status     Check optimization progress

Options

autorag optimize --help

Options:
  -c, --config PATH   Path to config file (default: config.yaml)
  --async             Run optimization in background

Configuration

RAG Parameters

AutoRAG-Optim optimizes five RAG parameters using a two-phase (nested-loop) architecture:

Indexing Parameters (require re-indexing, tested in outer loop):

rag:
  chunk_size: [256, 500, 1024]     # Characters per chunk
  chunk_overlap: [25, 50, 100]     # Overlap between chunks
  embedding_model:                  # HuggingFace model names
    - all-MiniLM-L6-v2
    - all-mpnet-base-v2
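
To make the interaction between chunk_size and chunk_overlap concrete, here is a minimal sketch of character-based chunking with a sliding window. The function name chunk_text is hypothetical and the tool's actual splitter may differ (for example, it may respect sentence boundaries); the sketch only assumes that sizes are counted in characters, as the comment above states.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into windows of chunk_size characters.

    Consecutive windows share chunk_overlap characters.
    Hypothetical helper, not the tool's actual splitter.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


chunks = chunk_text("a" * 1000, chunk_size=500, chunk_overlap=50)
print([len(c) for c in chunks])  # [500, 500, 100]
```

A larger overlap shrinks the step, so the same document yields more chunks and a bigger index.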

Query Parameters (fast, tested in inner loop):

rag:
  top_k: [3, 5, 10]                # Documents to retrieve
  temperature: [0.3, 0.7, 1.0]     # LLM creativity (0-2)
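
The split into indexing and query parameters matters because only the former force a re-index. The sketch below counts the combinations for the example values in this section. The names INDEXING_KEYS, QUERY_KEYS, and combos are illustrative assumptions, not part of the tool's API, and the count assumes an exhaustive grid.

```python
from itertools import product

# Example search space taken from the YAML snippets in this section.
search_space = {
    "chunk_size": [256, 500, 1024],
    "chunk_overlap": [25, 50, 100],
    "embedding_model": ["all-MiniLM-L6-v2", "all-mpnet-base-v2"],
    "top_k": [3, 5, 10],
    "temperature": [0.3, 0.7, 1.0],
}

# Hypothetical grouping: the first set requires re-indexing, the second does not.
INDEXING_KEYS = ("chunk_size", "chunk_overlap", "embedding_model")
QUERY_KEYS = ("top_k", "temperature")


def combos(space, keys):
    """Return every combination of the given keys as a list of dicts."""
    return [dict(zip(keys, values)) for values in product(*(space[k] for k in keys))]


indexing_configs = combos(search_space, INDEXING_KEYS)
query_configs = combos(search_space, QUERY_KEYS)

# 18 index builds cover all 162 full configurations.
print(len(indexing_configs), len(query_configs),
      len(indexing_configs) * len(query_configs))  # 18 9 162
```

Under these assumptions, a full grid needs only 18 index builds, each reused for 9 query-parameter settings, rather than 162 separate builds.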

Database Options

Supabase (Storage Bucket)

database:
  type: supabase
  url: https://xxx.supabase.co
  key: your-key
  bucket: pdf
  folder: pdf

MongoDB

database:
  type: mongodb
  connection_string: mongodb://localhost:27017
  database: your_db
  collection: documents

PostgreSQL

database:
  type: postgresql
  host: localhost
  port: 5432
  database: your_db
  table: documents
  user: username
  password: password

LLM Providers

llm:
  provider: groq      # groq | openai | openrouter
  model: null         # null = use provider default

api_keys:
  groq: sk-xxx        # Required if provider=groq
  openai: sk-xxx      # Required if provider=openai
  openrouter: sk-xxx  # Required if provider=openrouter
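
The rule in the comments above (a key is required only for the selected provider) can be sketched as a small validation step. The function resolve_api_key is a hypothetical helper written for illustration; it assumes the config has already been parsed into a Python dict and does not reflect how the tool itself validates configuration.

```python
SUPPORTED_PROVIDERS = ("groq", "openai", "openrouter")


def resolve_api_key(config: dict) -> str:
    """Return the API key matching llm.provider, or raise ValueError.

    Hypothetical helper: `config` is the parsed contents of config.yaml.
    """
    provider = config.get("llm", {}).get("provider")
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"Unsupported provider: {provider!r}")
    key = config.get("api_keys", {}).get(provider)
    if not key:
        raise ValueError(f"api_keys.{provider} is required when provider={provider}")
    return key


config = {
    "llm": {"provider": "groq", "model": None},
    "api_keys": {"groq": "your-groq-api-key"},
}
print(resolve_api_key(config))
```

Keys for providers that are not selected can be omitted without triggering an error in this sketch.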

Evaluation Methods

evaluation:
  method: custom   # custom | ragas

  • custom (default): Built-in token-optimized evaluator
  • ragas: Official RAGAS library (requires pip install ragas)

How It Works

  1. Connect - Fetches documents from your database
  2. Display Search Space - Shows all RAG parameter combinations to test
  3. Generate - Creates synthetic Q&A pairs using LLM
  4. Optimize - Two-phase optimization:
    • Outer loop: Tests indexing parameters (chunk_size, overlap, embedding_model)
    • Inner loop: Tests query parameters (top_k, temperature) on each index
  5. Evaluate - Measures accuracy using RAGAS-like metrics
  6. Report - Shows best configuration with all parameters
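
The outer/inner loop in step 4 can be sketched as follows. This is an exhaustive grid version for clarity, not the Bayesian strategy. build_index and evaluate are placeholder stubs with a made-up toy score, so every function name and number here is an assumption for illustration only.

```python
from itertools import product


def build_index(chunk_size, chunk_overlap, embedding_model):
    """Placeholder for the expensive step: chunk, embed, and store documents."""
    return {"chunk_size": chunk_size, "chunk_overlap": chunk_overlap,
            "embedding_model": embedding_model}


def evaluate(index, top_k, temperature):
    """Toy score, purely illustrative; a real run would score generated answers."""
    return (1.0
            - abs(index["chunk_size"] - 500) / 1000
            - abs(top_k - 5) / 100
            - abs(temperature - 0.3) / 10)


def optimize(indexing_space, query_space):
    best_score, best_config, index_builds = float("-inf"), None, 0
    # Outer loop: one index build per indexing configuration.
    for chunk_size, chunk_overlap, model in product(*indexing_space.values()):
        index = build_index(chunk_size, chunk_overlap, model)
        index_builds += 1
        # Inner loop: reuse the same index for every query configuration.
        for top_k, temperature in product(*query_space.values()):
            score = evaluate(index, top_k, temperature)
            if score > best_score:
                best_score = score
                best_config = {**index, "top_k": top_k, "temperature": temperature}
    return best_config, best_score, index_builds


indexing_space = {"chunk_size": [256, 500, 1024], "chunk_overlap": [25, 50, 100],
                  "embedding_model": ["all-MiniLM-L6-v2"]}
query_space = {"top_k": [3, 5, 10], "temperature": [0.3, 0.7, 1.0]}
best_config, best_score, index_builds = optimize(indexing_space, query_space)
print(best_config, index_builds)
```

Keeping the index build in the outer loop is the point of the design: the costly embedding step runs once per indexing configuration, while the cheap query parameters are swept against the cached index.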

Vector Store

AutoRAG-Optim uses ChromaDB for local vector storage:

  • No API key required - Runs entirely locally
  • Automatic dimension detection - Works with any embedding model
  • Persistent storage - Vectors saved in .autorag_cache/
  • Dynamic collections - Separate index for each config (e.g., autorag_c500_o50_minilm)
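
As an illustration of the per-config collection naming, the sketch below derives a name like the example above from the three indexing parameters. The short-name mapping and the fallback rule are assumptions made for this example; the tool's real naming scheme is known only from the single example autorag_c500_o50_minilm.

```python
import re

# Hypothetical mapping from embedding model to a short tag.
SHORT_MODEL_NAMES = {
    "all-MiniLM-L6-v2": "minilm",
    "all-mpnet-base-v2": "mpnet",
}


def collection_name(chunk_size: int, chunk_overlap: int, embedding_model: str) -> str:
    """Build a collection name that is unique per indexing configuration."""
    short = SHORT_MODEL_NAMES.get(embedding_model)
    if short is None:
        # Assumed fallback: lowercase and replace non-alphanumeric runs with "_".
        short = re.sub(r"[^a-z0-9]+", "_", embedding_model.lower()).strip("_")
    return f"autorag_c{chunk_size}_o{chunk_overlap}_{short}"


print(collection_name(500, 50, "all-MiniLM-L6-v2"))  # autorag_c500_o50_minilm
```

Because the name encodes every indexing parameter, an index built for one configuration can be found again and reused instead of being rebuilt.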

Metrics

Metric              Description
Answer Relevancy    How relevant is the answer to the question?
Faithfulness        Is the answer grounded in retrieved context?
Answer Similarity   How similar is the answer to ground truth?
Context Recall      Does the context contain the required info?
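
To give a feel for what a similarity-style metric measures, here is a token-overlap F1 score between a generated answer and the ground truth. This is a stand-in chosen for illustration, not the tool's Answer Similarity implementation, which may instead use embeddings or an LLM judge. The function name token_f1 is hypothetical.

```python
from collections import Counter


def token_f1(answer: str, ground_truth: str) -> float:
    """Token-level F1 between two strings (case-insensitive, whitespace tokens)."""
    answer_tokens = answer.lower().split()
    truth_tokens = ground_truth.lower().split()
    if not answer_tokens or not truth_tokens:
        return 0.0
    # Multiset intersection counts each shared token at most as often as it appears in both.
    overlap = sum((Counter(answer_tokens) & Counter(truth_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(answer_tokens)
    recall = overlap / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)


print(token_f1("the cat sat", "the cat"))
```

A score of 1.0 means both strings contain exactly the same tokens, regardless of order, and 0.0 means they share none.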

Development

# Clone repository
git clone https://github.com/yourusername/autorag-optim.git
cd autorag-optim

# Install with uv
uv sync --extra dev

# Run CLI
uv run autorag --help

# Run tests
uv run pytest tests/ -v

Requirements

  • Python 3.10+
  • LLM API key (Groq, OpenAI, or OpenRouter)
  • Database (Supabase, MongoDB, or PostgreSQL)
  • No Pinecone required - Uses local ChromaDB

License

MIT License - see LICENSE for details.
