A CLI tool that automates RAG hyperparameter optimization using Bayesian search and synthetic data generation
AutoRAG-Optim
Stop guessing your RAG configuration. Let AutoRAG find the optimal one for your data.
AutoRAG-Optim is a CLI tool that automatically discovers the best RAG (Retrieval-Augmented Generation) hyperparameters for your specific database. Instead of manually testing hundreds of parameter combinations, run one command and get a production-ready configuration optimized for your data.
Why AutoRAG?
Most teams waste weeks manually tuning RAG settings (chunk sizes, embedding models, retrieval counts) without knowing what actually works best for their data. AutoRAG solves this by:
- Generating synthetic test data from your documents (no manual labeling needed)
- Intelligently searching the configuration space (20-30 experiments instead of 1000+)
- Evaluating with real metrics (accuracy, faithfulness, relevancy, recall)
- Running entirely locally (ChromaDB for vectors; no Pinecone API key required)
Typical results: 30-40% cost reduction and 20-35% accuracy improvement over default settings.
Features
| Feature | Description |
|---|---|
| Smart Optimization | Bayesian or Grid Search to find optimal parameters in 20-30 experiments |
| Two-Phase Architecture | Expensive indexing params tested separately from fast query params |
| 5 Tunable Parameters | chunk_size, chunk_overlap, embedding_model, top_k, temperature |
| Synthetic Q&A Generation | Auto-generate test questions from your documents using an LLM |
| RAGAS-like Evaluation | Measure accuracy, faithfulness, relevancy, and context recall |
| Local Vector Store | ChromaDB runs locally; no external API keys needed |
| Multi-Database Support | Supabase Storage, MongoDB, PostgreSQL |
| Multi-LLM Support | Groq, OpenAI, OpenRouter |
| Rich CLI Output | Beautiful terminal output with progress bars, tables, and HTML reports |
Installation
pip install autorag-optim
For RAGAS evaluation (optional):
pip install autorag-optim[ragas]
Quick Start
1. Create Configuration
Create a config.yaml file:
database:
  type: supabase
  url: https://your-project.supabase.co
  key: your-supabase-anon-key
  bucket: pdf
  folder: pdf
llm:
  provider: groq
  model: null  # Uses default: llama-3.3-70b-versatile
api_keys:
  groq: your-groq-api-key
rag:
  chunk_size: [256, 512, 1024]
  chunk_overlap: [50, 100]
  embedding_model:
    - all-MiniLM-L6-v2
  top_k: [3, 5, 10]
  temperature: [0.3, 0.7]
optimization:
  strategy: bayesian  # or: grid
  num_experiments: 20
  test_questions: 50
evaluation:
  method: custom  # or: ragas
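To get a feel for how large the search space defined by the rag section is, the value lists can be expanded into a full grid. The sketch below mirrors the example values as a plain Python dict; `rag_space` and `expand_grid` are illustrative names, not part of AutoRAG-Optim's API.

```python
from itertools import product

# Mirrors the `rag` section of the example config.yaml.
rag_space = {
    "chunk_size": [256, 512, 1024],
    "chunk_overlap": [50, 100],
    "embedding_model": ["all-MiniLM-L6-v2"],
    "top_k": [3, 5, 10],
    "temperature": [0.3, 0.7],
}


def expand_grid(space):
    """Return every combination of parameter values as a list of dicts."""
    names = list(space)
    return [dict(zip(names, values)) for values in product(*space.values())]


grid = expand_grid(rag_space)
print(len(grid))  # 3 * 2 * 1 * 3 * 2 = 36 combinations
print(grid[0])
```

With num_experiments set to 20, a run would evaluate 20 of these 36 combinations rather than all of them.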
2. Run Optimization
autorag optimize --config config.yaml
3. View Results
autorag results --show-report
Configuration Options
Optimization Strategy
| Strategy | Description | Best For |
|---|---|---|
| bayesian | Intelligent search using Optuna TPE sampler | Default choice; finds good configs with fewer experiments |
| grid | Systematic search with stratified sampling | Guaranteed coverage of search space |
Evaluation Method
| Method | Description | Notes |
|---|---|---|
| custom | Built-in token-optimized evaluator | Works with any LLM, fast, no extra dependencies |
| ragas | Official RAGAS library metrics | Requires pip install ragas, uses OpenAI-compatible API |
LLM Providers
| Provider | Default Model | Notes |
|---|---|---|
| groq | llama-3.3-70b-versatile | Fast inference, generous free tier |
| openai | gpt-4o-mini | High quality, production-ready |
| openrouter | meta-llama/llama-3.3-70b-instruct | Access to 100+ models |
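The fallback implied by `model: null` can be pictured as a simple lookup. This is a minimal sketch assuming the defaults listed in the provider table; `DEFAULT_MODELS` and `resolve_model` are hypothetical names, and the tool's real resolution logic may differ.

```python
# Default models as listed in the provider table (names here are illustrative).
DEFAULT_MODELS = {
    "groq": "llama-3.3-70b-versatile",
    "openai": "gpt-4o-mini",
    "openrouter": "meta-llama/llama-3.3-70b-instruct",
}


def resolve_model(llm_config):
    """Return the configured model, falling back to the provider default."""
    provider = llm_config["provider"]
    if provider not in DEFAULT_MODELS:
        raise ValueError(f"Unsupported provider: {provider!r}")
    # `model: null` in YAML loads as None in Python, so `or` picks the default.
    return llm_config.get("model") or DEFAULT_MODELS[provider]


print(resolve_model({"provider": "groq", "model": None}))
print(resolve_model({"provider": "openai", "model": "gpt-4o"}))
```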
Database Connectors
| Type | Description | Config Fields |
|---|---|---|
| supabase | Supabase Storage bucket | url, key, bucket, folder |
| mongodb | MongoDB collection | connection_string, database, collection |
| postgresql | PostgreSQL table | host, port, database, table, user, password |
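Before starting a long run, it can help to check that the database section carries the fields listed for its connector type. The sketch below assumes every listed field is required, which the table does not state explicitly; `REQUIRED_FIELDS` and `missing_fields` are hypothetical helpers, not part of the CLI.

```python
# Config fields per connector type, taken from the table above.
# Treating all of them as mandatory is an assumption.
REQUIRED_FIELDS = {
    "supabase": ["url", "key", "bucket", "folder"],
    "mongodb": ["connection_string", "database", "collection"],
    "postgresql": ["host", "port", "database", "table", "user", "password"],
}


def missing_fields(database_config):
    """List the fields absent from a `database` config section."""
    db_type = database_config.get("type")
    if db_type not in REQUIRED_FIELDS:
        raise ValueError(f"Unknown database type: {db_type!r}")
    return [name for name in REQUIRED_FIELDS[db_type] if name not in database_config]


config = {
    "type": "mongodb",
    "connection_string": "mongodb://localhost:27017",
    "database": "docs",
}
print(missing_fields(config))  # the collection field is not set
```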
Estimated API Calls & Runtime
Understanding the cost before running optimization:
Formula
LLM Calls ≈ Q&A Generation + (Experiments × Questions × Calls per Question)
Where:
- Q&A Generation = ceil(test_questions / 2) [~1 call per 2 questions]
- Calls per Question = 1 (RAG query) + 3 (evaluation) = 4 calls
Estimates by Configuration
| Questions | Experiments | LLM Calls | Est. Time* |
|---|---|---|---|
| 20 | 10 | ~810 | 15-30 min |
| 50 | 20 | ~4,025 | 45-60 min |
| 50 | 30 | ~6,025 | 60-90 min |
| 100 | 20 | ~8,050 | 100-150 min |
*Time varies based on LLM provider rate limits and response times. Groq is typically fastest.
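The formula is easy to check in a few lines of Python. The function below is a small sketch, not part of the CLI (`estimate_llm_calls` is an illustrative name); it reproduces the LLM Calls column of the estimates table.

```python
import math


def estimate_llm_calls(test_questions, num_experiments, calls_per_question=4):
    """Estimate total LLM calls for one optimization run.

    calls_per_question = 1 RAG query + 3 evaluation calls.
    """
    qa_generation = math.ceil(test_questions / 2)  # ~1 call per 2 questions
    return qa_generation + num_experiments * test_questions * calls_per_question


for questions, experiments in [(20, 10), (50, 20), (50, 30), (100, 20)]:
    print(questions, experiments, estimate_llm_calls(questions, experiments))
```

Because the experiment term dominates, halving test_questions roughly halves the total number of calls.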
Cost Saving Tips
- Start with fewer experiments (10-15) to validate your setup
- Use bayesian strategy; it finds good configs with 30-40% fewer experiments than grid search
- Reduce test_questions for initial exploration (20-30 is enough to rank configs)
How It Works
┌─────────────────────────────────────────────────────────────────┐
│  1. CONNECT                                                     │
│     Fetch documents from your database (Supabase/Mongo/PG)      │
├─────────────────────────────────────────────────────────────────┤
│  2. GENERATE                                                    │
│     Create synthetic Q&A pairs from your documents using LLM    │
├─────────────────────────────────────────────────────────────────┤
│  3. OPTIMIZE (Two-Phase)                                        │
│     ┌─────────────────────────────────────────────────────┐     │
│     │ OUTER LOOP: Indexing params (expensive)             │     │
│     │   - chunk_size, chunk_overlap, embedding_model      │     │
│     │   - Requires re-indexing documents                  │     │
│     └─────────────────────────────────────────────────────┘     │
│     ┌─────────────────────────────────────────────────────┐     │
│     │ INNER LOOP: Query params (fast)                     │     │
│     │   - top_k, temperature                              │     │
│     │   - Same index, just different retrieval settings   │     │
│     └─────────────────────────────────────────────────────┘     │
├─────────────────────────────────────────────────────────────────┤
│  4. EVALUATE                                                    │
│     Score each config: relevancy, faithfulness, similarity,     │
│     context recall → weighted aggregate score                   │
├─────────────────────────────────────────────────────────────────┤
│  5. REPORT                                                      │
│     Terminal table + JSON + HTML report with best config        │
└─────────────────────────────────────────────────────────────────┘
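The two-phase idea in step 3 can be sketched as a pair of nested loops: each indexing combination builds one index, which is then reused for every query-time setting. This sketch uses exhaustive loops and a toy scoring function purely for illustration; it is not the tool's Bayesian sampler, and `build_index`, `evaluate`, and `two_phase_search` are hypothetical stand-ins.

```python
from itertools import product


def build_index(chunk_size, chunk_overlap, embedding_model):
    """Stand-in for the expensive step (chunk, embed, store in the vector DB)."""
    return {"chunk_size": chunk_size, "chunk_overlap": chunk_overlap,
            "embedding_model": embedding_model}


def evaluate(index, top_k, temperature):
    """Toy score that peaks at chunk_size=512, top_k=5, temperature=0.3.

    A real run would answer the test questions and score the answers.
    """
    return -(abs(index["chunk_size"] - 512) / 1024
             + abs(top_k - 5) / 10
             + abs(temperature - 0.3))


def two_phase_search(indexing_space, query_space):
    index_builds = 0
    results = []
    # Outer loop: every indexing combination needs a fresh index.
    for values in product(*indexing_space.values()):
        indexing_params = dict(zip(indexing_space, values))
        index = build_index(**indexing_params)
        index_builds += 1
        # Inner loop: reuse the same index for all query-time settings.
        for top_k, temperature in product(query_space["top_k"],
                                          query_space["temperature"]):
            score = evaluate(index, top_k, temperature)
            config = {**indexing_params, "top_k": top_k, "temperature": temperature}
            results.append((config, score))
    best_config, _ = max(results, key=lambda item: item[1])
    return best_config, index_builds, len(results)


indexing_space = {
    "chunk_size": [256, 512, 1024],
    "chunk_overlap": [50, 100],
    "embedding_model": ["all-MiniLM-L6-v2"],
}
query_space = {"top_k": [3, 5, 10], "temperature": [0.3, 0.7]}

best_config, index_builds, evaluations = two_phase_search(indexing_space, query_space)
print(index_builds, evaluations)  # 6 index builds serve 36 evaluations
print(best_config)
```

The point of the split is visible in the counts: the costly re-indexing step runs once per indexing combination instead of once per experiment.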
CLI Commands
| Command | Description |
|---|---|
| autorag optimize | Run RAG optimization on your database |
| autorag results | Display optimization results |
| autorag status | Check optimization progress (async mode) |
autorag optimize --help
Options:
-c, --config PATH Path to config file (default: config.yaml)
--async Run optimization in background
Evaluation Metrics
| Metric | What It Measures |
|---|---|
| Answer Relevancy | Is the answer relevant to the question asked? |
| Faithfulness | Is the answer grounded in the retrieved context? |
| Answer Similarity | How similar is the generated answer to ground truth? |
| Context Recall | Does the retrieved context contain the required information? |
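The four metrics are combined into a single weighted aggregate score per configuration. The weights used by AutoRAG-Optim are not stated here, so the sketch below assumes equal weights and scores in the range 0 to 1; `WEIGHTS` and `aggregate_score` are illustrative names only.

```python
# Hypothetical equal weights; the tool's actual weighting is not documented here.
WEIGHTS = {
    "answer_relevancy": 0.25,
    "faithfulness": 0.25,
    "answer_similarity": 0.25,
    "context_recall": 0.25,
}


def aggregate_score(metrics, weights=WEIGHTS):
    """Weighted mean of per-metric scores, each expected in [0, 1]."""
    total_weight = sum(weights.values())
    return sum(metrics[name] * weight for name, weight in weights.items()) / total_weight


scores = {
    "answer_relevancy": 0.9,
    "faithfulness": 0.8,
    "answer_similarity": 0.7,
    "context_recall": 0.6,
}
print(round(aggregate_score(scores), 3))
```

Dividing by the total weight keeps the result on the same 0 to 1 scale even if the weights are later changed so that they no longer sum to 1.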
Development
# Clone repository
git clone https://github.com/vatsalpjain/autorag-optim.git
cd autorag-optim
# Install with dev dependencies
uv sync --extra dev
# Run CLI
uv run autorag --help
# Run tests
uv run pytest tests/ -v
Requirements
- Python 3.10+
- LLM API key (Groq, OpenAI, or OpenRouter)
- Database (Supabase, MongoDB, or PostgreSQL)
- No Pinecone required; uses local ChromaDB
Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
License
MIT License - see LICENSE for details.