RAG Evaluation Framework using Ragas metrics and MLflow tracking
RAGSentinel
Installation
1. Create Virtual Environment
# Create project directory
mkdir my-rag-eval
cd my-rag-eval
# Create and activate virtual environment
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
2. Install Package
pip install rag-sentinel
Quick Start
1. Initialize Project
rag-sentinel init
This creates:
- .env - LLM/Embeddings API keys
- config.ini - App settings and authentication
- rag_eval_config.yaml - Master configuration
- test_dataset.csv - Sample test dataset
2. Configure
Edit .env file:
- Add your LLM API keys (Azure OpenAI, OpenAI, or Ollama credentials)
- Set API endpoints and deployment names
Edit config.ini file:
- Set your RAG backend URL in app_url
- Set API paths for context and answer endpoints (context_api_path, answer_api_path)
- Configure authentication (cookie, bearer token, or API key)
- Set MLflow tracking URI (default: http://127.0.0.1:5000)
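As a rough illustration of the config.ini settings listed above, the following sketch uses Python's standard configparser to check that the three named keys are present somewhere in the file. The section name [app] and the sample values are assumptions made for this example only; the generated config.ini may be organized differently, and rag-sentinel validate remains the supported check.

```python
import configparser

# Keys named in this README; the section they belong to is not documented here.
REQUIRED_KEYS = ("app_url", "context_api_path", "answer_api_path")


def missing_keys(config_text: str) -> list[str]:
    """Return the required keys that appear in no section of the config text."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    found = set()
    for section in parser.sections():
        found.update(parser[section].keys())
    return [key for key in REQUIRED_KEYS if key not in found]


# Hypothetical sample: the section name and all values are placeholders.
SAMPLE = """
[app]
app_url = http://localhost:8000
context_api_path = /api/context
answer_api_path = /api/answer
"""

if __name__ == "__main__":
    print(missing_keys(SAMPLE))
```

To try it against a real file, pass the file's contents, for example missing_keys(open("config.ini").read()).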
Edit test_dataset.csv file:
- Add your test queries in format: query,ground_truth,chat_id
- Example: What is RAG?,RAG stands for Retrieval-Augmented Generation,1
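Ground-truth answers often contain commas, which would break a hand-written CSV row. The sketch below builds rows in the query,ground_truth,chat_id format with Python's csv module so such fields are quoted automatically. It assumes the file begins with a header row; check the generated sample test_dataset.csv to confirm. The second test case is made up for illustration.

```python
import csv
import io

FIELDS = ["query", "ground_truth", "chat_id"]


def rows_to_csv(rows: list[dict]) -> str:
    """Serialize test cases to CSV text, quoting fields that contain commas."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()  # Assumption: the dataset starts with a header row.
    writer.writerows(rows)
    return buffer.getvalue()


rows = [
    {
        "query": "What is RAG?",
        "ground_truth": "RAG stands for Retrieval-Augmented Generation",
        "chat_id": 1,
    },
    {
        # Illustrative row whose answer contains a comma.
        "query": "Name two RAG stages",
        "ground_truth": "Retrieval, then generation",
        "chat_id": 2,
    },
]

csv_text = rows_to_csv(rows)
print(csv_text)
```

Writing csv_text to test_dataset.csv gives a file in which the second ground truth is wrapped in double quotes, so it still parses as a single column.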
For detailed configuration help, see the comments in each config file.
3. Validate & Run
# Validate configuration
rag-sentinel validate
# Run evaluation
rag-sentinel run
Results will be available in the MLflow UI at the configured tracking URI.
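If the MLflow UI does not load, a quick way to rule out a stopped server is to test whether anything is listening at the tracking URI. The sketch below is a generic TCP check using only the standard library; it is not part of RAGSentinel, the function name is hypothetical, and a successful connection shows only that some service accepts connections on that port, not that it is MLflow.

```python
import socket
from urllib.parse import urlparse


def tracking_server_reachable(tracking_uri: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the URI's host and port succeeds."""
    parsed = urlparse(tracking_uri)
    host = parsed.hostname or "127.0.0.1"
    # Fall back to the scheme's usual port when the URI does not name one.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Default tracking URI mentioned in this README.
    print(tracking_server_reachable("http://127.0.0.1:5000"))
```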
CLI Commands
# Initialize new project
rag-sentinel init
# Validate configuration
rag-sentinel validate
# Run evaluation (auto-starts MLflow)
rag-sentinel run
# Run without starting MLflow server
rag-sentinel run --no-server
# Overwrite existing config files
rag-sentinel init --force
# Check package version
pip show rag-sentinel
# Upgrade to latest version
pip install --upgrade rag-sentinel
Evaluation Categories
Set category in config.ini to choose the evaluation type:
Simple (RAGAS Quality Metrics)
category = simple
- Faithfulness - Factual consistency of answer with context
- Answer Relevancy - How relevant the answer is to the question
- Context Precision - Quality of retrieved context
- Answer Correctness - Comparison against ground truth
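To make one of these scores concrete, the sketch below follows the formula the Ragas documentation gives for context precision: the mean of precision@k over the ranks k that hold a relevant chunk. In Ragas the per-chunk relevance verdicts come from an LLM judge; here they are hard-coded 0/1 flags, and the function name is hypothetical. This illustrates the definition and is not RAGSentinel's or Ragas' actual code.

```python
def context_precision_at_k(relevance: list[int]) -> float:
    """Mean of precision@k over the ranks k that hold a relevant chunk.

    relevance[i] is 1 if the chunk retrieved at rank i + 1 is relevant, else 0.
    """
    hits = 0
    weighted_sum = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            weighted_sum += hits / rank  # precision@rank at a relevant position
    return weighted_sum / hits if hits else 0.0


# Relevant chunks at ranks 1 and 3: (1/1 + 2/3) / 2
print(round(context_precision_at_k([1, 0, 1]), 4))
# The only relevant chunk is ranked last: (1/3) / 1
print(round(context_precision_at_k([0, 0, 1]), 4))
```

The score rewards retrievers that rank relevant chunks first: the same single relevant chunk scores 1.0 at rank 1 but only about 0.33 at rank 3.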
Guardrail (Security Metrics)
category = guardrail
- Toxicity Score - Detects toxic content in responses
- Bias Score - Detects biased content in responses
Performance Metrics
Logged for all evaluation categories:
- Avg Response Time - Average API response time (ms)
- P90 Latency - 90th percentile latency (ms)
- Queries Per Second - Throughput (QPS)
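For readers who want to sanity-check these numbers, the sketch below shows one common way to derive the three metrics from per-query latencies. The percentile method (nearest rank), the function name, and the dictionary keys are assumptions for illustration; this README does not state how RAGSentinel computes or names them.

```python
def summarize_latencies(latencies_ms: list[float], wall_time_s: float) -> dict:
    """Compute average latency, nearest-rank P90 latency, and throughput."""
    if not latencies_ms or wall_time_s <= 0:
        raise ValueError("need at least one latency and a positive wall time")
    ordered = sorted(latencies_ms)
    # Nearest-rank P90: 1-based rank ceil(0.9 * n), done in integer arithmetic.
    rank = (9 * len(ordered) + 9) // 10
    return {
        "avg_response_time_ms": sum(ordered) / len(ordered),
        "p90_latency_ms": ordered[rank - 1],
        "queries_per_second": len(ordered) / wall_time_s,
    }


# Ten made-up latencies (ms) for queries that took 5 seconds in total.
stats = summarize_latencies(
    [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000], wall_time_s=5.0
)
print(stats)
```

With these sample values the average is 550 ms, the P90 latency is 900 ms, and throughput is 2 queries per second.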
License
MIT