A CLI tool that automates RAG hyperparameter optimization using Bayesian search and synthetic data generation

Project description

AutoRAG-Optim

Stop guessing your RAG configuration. Let AutoRAG find the optimal one for your data.

AutoRAG-Optim is a CLI tool that automatically discovers the best RAG (Retrieval-Augmented Generation) hyperparameters for your specific database. Instead of manually testing hundreds of parameter combinations, run one command and get a production-ready configuration optimized for your data.

Why AutoRAG?

Most teams waste weeks manually tuning RAG settings—chunk sizes, embedding models, retrieval counts—without knowing what actually works best for their data. AutoRAG solves this by:

- Generating synthetic test data from your documents (no manual labeling needed)
- Intelligently searching the configuration space (20-30 experiments instead of 1000+)
- Evaluating with real metrics (answer relevancy, faithfulness, answer similarity, context recall)
- Storing vectors locally (ChromaDB, so no Pinecone API key is required)

Typical results: 30-40% cost reduction and 20-35% accuracy improvement over default settings.

⚠️ API Cost Warning
AutoRAG makes many LLM API calls during optimization. Ensure you have sufficient API credits.
Default settings (5 questions, 5 experiments) ≈ 100 API calls. Larger runs can use thousands of calls.

Features

| Feature | Description |
| --- | --- |
| 🔍 Smart Optimization | Bayesian or grid search to find optimal parameters in 20-30 experiments |
| ⚡ Two-Phase Architecture | Expensive indexing parameters are tested separately from fast query parameters |
| 📊 5 Tunable Parameters | `chunk_size`, `chunk_overlap`, `embedding_model`, `top_k`, `temperature` |
| 🤖 Synthetic Q&A Generation | Auto-generate test questions from your documents using an LLM |
| 📈 RAGAS-like Evaluation | Measure answer relevancy, faithfulness, answer similarity, and context recall |
| 🗄️ Local Vector Store | ChromaDB runs locally, so no external vector-database API key is needed |
| 🔌 Multi-Database Support | Supabase Storage, MongoDB, PostgreSQL |
| 🧠 Multi-LLM Support | Groq, OpenAI, OpenRouter |
| 📋 Rich CLI Output | Terminal output with progress bars and tables, plus HTML reports |

Installation

pip install autorag-optim

For RAGAS evaluation (optional):

pip install "autorag-optim[ragas]"

Quick Start

1. Create Configuration

Create a config.yaml file:

database:
  type: supabase
  url: https://your-project.supabase.co
  key: your-supabase-anon-key
  bucket: pdf
  folder: pdf

llm:
  provider: groq
  model: null  # Uses default: llama-3.3-70b-versatile

api_keys:
  groq: your-groq-api-key

rag:
  chunk_size: [256, 512, 1024]
  chunk_overlap: [50, 100]
  embedding_model:
    - all-MiniLM-L6-v2
  top_k: [3, 5, 10]
  temperature: [0.3, 0.7]

optimization:
  strategy: bayesian    # or: grid
  num_experiments: 20
  test_questions: 50

evaluation:
  method: custom        # or: ragas
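
To get a feel for how large the search space in the `rag` section is, the sketch below counts and enumerates every combination of the listed values. It is only an illustration: the dictionary mirrors the example config above, while `count_combinations` and `iter_combinations` are hypothetical helper names, not part of the AutoRAG-Optim API.

```python
from itertools import product
from math import prod

# The `rag` section of the example config.yaml, written as a plain dict.
RAG_SEARCH_SPACE = {
    "chunk_size": [256, 512, 1024],
    "chunk_overlap": [50, 100],
    "embedding_model": ["all-MiniLM-L6-v2"],
    "top_k": [3, 5, 10],
    "temperature": [0.3, 0.7],
}


def count_combinations(space: dict[str, list]) -> int:
    """Number of distinct configurations in a full grid over `space`."""
    return prod(len(values) for values in space.values())


def iter_combinations(space: dict[str, list]):
    """Yield every configuration as a dict, in a deterministic order."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


if __name__ == "__main__":
    print(count_combinations(RAG_SEARCH_SPACE))  # 3 * 2 * 1 * 3 * 2 = 36
    print(next(iter_combinations(RAG_SEARCH_SPACE)))
```

For this example config the full grid has 36 combinations, so `num_experiments: 20` would cover a little over half of it if each experiment evaluates one combination (an assumption about how experiments are counted).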

2. Run Optimization

autorag optimize --config config.yaml

3. View Results

autorag results --show-report

Configuration Options

Optimization Strategy

| Strategy | Description | Best For |
| --- | --- | --- |
| `bayesian` | Intelligent search using the Optuna TPE sampler | Default choice; finds good configs with fewer experiments |
| `grid` | Systematic search with stratified sampling | Guaranteed coverage of the search space |

Evaluation Method

| Method | Description | Notes |
| --- | --- | --- |
| `custom` | Built-in token-optimized evaluator | Works with any LLM, fast, no extra dependencies |
| `ragas` | Official RAGAS library metrics | Requires the `ragas` package (see Installation); uses an OpenAI-compatible API |

LLM Providers

| Provider | Default Model | Notes |
| --- | --- | --- |
| `groq` | `llama-3.3-70b-versatile` | Fast inference, generous free tier |
| `openai` | `gpt-4o-mini` | High quality, production-ready |
| `openrouter` | `meta-llama/llama-3.3-70b-instruct` | Access to 100+ models |

Database Connectors

| Type | Description | Config Fields |
| --- | --- | --- |
| `supabase` | Supabase Storage bucket | `url`, `key`, `bucket`, `folder` |
| `mongodb` | MongoDB collection | `connection_string`, `database`, `collection` |
| `postgresql` | PostgreSQL table | `host`, `port`, `database`, `table`, `user`, `password` |
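
Before starting a long run it can help to check that the `database` section carries the fields listed for its connector type. The sketch below does this against the table above. It assumes every listed field is required, which the table does not state explicitly, and `missing_fields` is a hypothetical helper rather than a function shipped with the tool.

```python
# Config fields per connector type, copied from the Database Connectors table.
# Assumption: every listed field is required.
REQUIRED_FIELDS = {
    "supabase": ["url", "key", "bucket", "folder"],
    "mongodb": ["connection_string", "database", "collection"],
    "postgresql": ["host", "port", "database", "table", "user", "password"],
}


def missing_fields(database: dict) -> list[str]:
    """Return the listed fields that are absent or empty in `database`."""
    db_type = database.get("type")
    if db_type not in REQUIRED_FIELDS:
        raise ValueError(f"Unsupported database type: {db_type!r}")
    return [
        field
        for field in REQUIRED_FIELDS[db_type]
        if database.get(field) in (None, "")
    ]
```

Calling `missing_fields` on the `database` section of the Quick Start config returns an empty list, since all four Supabase fields are present.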

Estimated API Calls & Runtime

Estimate the cost before you run an optimization:

Formula

LLM Calls ≈ Q&A Generation + (Experiments × Questions × Calls per Question)

Where:
- Q&A Generation = ceil(test_questions / 2)  [~1 call per 2 questions]
- Calls per Question = 1 (RAG query) + 3 (evaluation) = 4 calls
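
The formula above can be written as a small function for budgeting a run in advance. This is a sketch of the stated arithmetic only; `estimate_llm_calls` is a hypothetical name, and the result is an approximation in the same sense as the formula.

```python
from math import ceil

# 1 RAG query + 3 evaluation calls per question, per the formula above.
CALLS_PER_QUESTION = 1 + 3


def estimate_llm_calls(test_questions: int, num_experiments: int) -> int:
    """Approximate LLM calls for one optimization run."""
    qa_generation = ceil(test_questions / 2)  # ~1 call per 2 questions
    return qa_generation + num_experiments * test_questions * CALLS_PER_QUESTION


if __name__ == "__main__":
    print(estimate_llm_calls(test_questions=50, num_experiments=20))  # 25 + 4000 = 4025
```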

Estimates by Configuration

| Questions | Experiments | LLM Calls | Est. Time* |
| --- | --- | --- | --- |
| 20 | 10 | ~810 | 15-30 min |
| 50 | 20 | ~4,025 | 45-60 min |
| 50 | 30 | ~6,025 | 60-90 min |
| 100 | 20 | ~8,050 | 100-150 min |

*Time varies based on LLM provider rate limits and response times. Groq is typically fastest.

Cost Saving Tips

- Start with fewer experiments (10-15) to validate your setup.
- Use the `bayesian` strategy; it finds good configs with 30-40% fewer experiments than grid search.
- Reduce `test_questions` for initial exploration (20-30 is enough to rank configs).

How It Works

┌─────────────────────────────────────────────────────────────────┐
│  1. CONNECT                                                     │
│     Fetch documents from your database (Supabase/Mongo/PG)      │
├─────────────────────────────────────────────────────────────────┤
│  2. GENERATE                                                    │
│     Create synthetic Q&A pairs from your documents using an LLM │
├─────────────────────────────────────────────────────────────────┤
│  3. OPTIMIZE (Two-Phase)                                        │
│     ┌─────────────────────────────────────────────────────┐     │
│     │ OUTER LOOP: Indexing params (expensive)             │     │
│     │   → chunk_size, chunk_overlap, embedding_model      │     │
│     │   → Requires re-indexing documents                  │     │
│     └─────────────────────────────────────────────────────┘     │
│     ┌─────────────────────────────────────────────────────┐     │
│     │ INNER LOOP: Query params (fast)                     │     │
│     │   → top_k, temperature                              │     │
│     │   → Same index, just different retrieval settings   │     │
│     └─────────────────────────────────────────────────────┘     │
├─────────────────────────────────────────────────────────────────┤
│  4. EVALUATE                                                    │
│     Score each config: relevancy, faithfulness, similarity,     │
│     context recall → weighted aggregate score                   │
├─────────────────────────────────────────────────────────────────┤
│  5. REPORT                                                      │
│     Terminal table + JSON + HTML report with best config        │
└─────────────────────────────────────────────────────────────────┘
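
The two-phase idea in step 3 can be sketched as a nested loop: each combination of indexing parameters builds an index once, and every query-parameter combination then reuses it. The sketch is exhaustive rather than Bayesian, and `grid`, `two_phase_search`, `build_index`, and `score` are hypothetical names used for illustration, not the tool's internals.

```python
from itertools import product

# Parameter values taken from the Quick Start config.
INDEX_PARAMS = {
    "chunk_size": [256, 512, 1024],
    "chunk_overlap": [50, 100],
    "embedding_model": ["all-MiniLM-L6-v2"],
}
QUERY_PARAMS = {"top_k": [3, 5, 10], "temperature": [0.3, 0.7]}


def grid(space: dict[str, list]) -> list[dict]:
    """All combinations of the values in `space`, as dicts."""
    names = list(space)
    return [dict(zip(names, values)) for values in product(*space.values())]


def two_phase_search(build_index, score):
    """Return (best_score, best_config) over both parameter groups.

    build_index(index_cfg) -> index        (expensive: re-chunk and re-embed)
    score(index, query_cfg) -> float       (cheap: reuses the same index)
    """
    best_score, best_config = float("-inf"), None
    for index_cfg in grid(INDEX_PARAMS):        # outer loop: indexing params
        index = build_index(index_cfg)          # built once per index_cfg
        for query_cfg in grid(QUERY_PARAMS):    # inner loop: query params
            current = score(index, query_cfg)
            if current > best_score:
                best_score = current
                best_config = {**index_cfg, **query_cfg}
    return best_score, best_config
```

With the values above, the outer loop builds 6 indexes while the inner loop scores 36 configurations in total, which is where the saving from separating the two phases comes from.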

CLI Commands

| Command | Description |
| --- | --- |
| `autorag optimize` | Run RAG optimization on your database |
| `autorag results` | Display optimization results |
| `autorag status` | Check optimization progress (async mode) |

autorag optimize --help

Options:
  -c, --config PATH   Path to config file (default: config.yaml)
  --async             Run optimization in background

Evaluation Metrics

| Metric | What It Measures |
| --- | --- |
| Answer Relevancy | Is the answer relevant to the question asked? |
| Faithfulness | Is the answer grounded in the retrieved context? |
| Answer Similarity | How similar is the generated answer to ground truth? |
| Context Recall | Does the retrieved context contain the required information? |
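
The four metrics are combined into a weighted aggregate score when configs are ranked. The weights are not documented here, so the sketch below assumes equal weights and scores in the range 0 to 1. `aggregate_score` and `EQUAL_WEIGHTS` are hypothetical names for illustration.

```python
METRICS = (
    "answer_relevancy",
    "faithfulness",
    "answer_similarity",
    "context_recall",
)

# Assumption: equal weights. The tool's actual weighting may differ.
EQUAL_WEIGHTS = {name: 0.25 for name in METRICS}


def aggregate_score(
    scores: dict[str, float],
    weights: dict[str, float] = EQUAL_WEIGHTS,
) -> float:
    """Weighted mean of the per-metric scores (each assumed to be in [0, 1])."""
    total_weight = sum(weights[name] for name in METRICS)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(weights[name] * scores[name] for name in METRICS)
    return weighted_sum / total_weight
```

Dividing by the total weight keeps the result on the same 0 to 1 scale even when the supplied weights do not sum to 1.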

Development

# Clone repository
git clone https://github.com/vatsalpjain/autorag-optim.git
cd autorag-optim

# Install with dev dependencies
uv sync --extra dev

# Run CLI
uv run autorag --help

# Run tests
uv run pytest tests/ -v

Requirements

- Python 3.10+
- LLM API key (Groq, OpenAI, or OpenRouter)
- Database (Supabase, MongoDB, or PostgreSQL)
- No Pinecone required; uses local ChromaDB

Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

License

MIT License - see LICENSE for details.

Project details

Release history

- 0.1.3 (this version): Jan 10, 2026
- 0.1.1: Jan 10, 2026
- 0.1.0: Jan 10, 2026

Download files

Source Distribution

autorag_optim-0.1.3.tar.gz (450.3 kB)

Uploaded Jan 10, 2026 Source

Built Distribution

autorag_optim-0.1.3-py3-none-any.whl (60.6 kB)

Uploaded Jan 10, 2026 Python 3

File details

Details for the file autorag_optim-0.1.3.tar.gz.

File metadata

Download URL: autorag_optim-0.1.3.tar.gz
Upload date: Jan 10, 2026
Size: 450.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.2

File hashes

Hashes for autorag_optim-0.1.3.tar.gz
Algorithm	Hash digest
SHA256	`3be28c0727bb8c5344b68a954c2d94725398962cff38d26543a9199b3da7fe3b`
MD5	`741fb7f071ab420c0300d456dc8bfda8`
BLAKE2b-256	`785ed6cd513d8fc9b79d956ce43bb487cfac905013844d15d9b0d4afd8f9a489`

File details

Details for the file autorag_optim-0.1.3-py3-none-any.whl.

File metadata

Download URL: autorag_optim-0.1.3-py3-none-any.whl
Upload date: Jan 10, 2026
Size: 60.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.2

File hashes

Hashes for autorag_optim-0.1.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`4dcbf919d24a93bd5bb75c3d0032629cce128cc0bc8f4f8154dce5ef5ddf2a3d`
MD5	`a5944da8b29074bde311fec998fd96f2`
BLAKE2b-256	`6b475769393187129b9eb0b0e26bf0c75b14bd4511fc1a6f47b86fd4f85bbd2f`
