RAGSentinel

RAG Evaluation Framework using Ragas metrics and MLflow tracking.

Installation

1. Create Virtual Environment

# Create project directory
mkdir my-rag-eval
cd my-rag-eval

# Create and activate virtual environment
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

2. Install Package

pip install rag-sentinel

Quick Start

1. Initialize Project

rag-sentinel init

This creates:

  • .env - LLM/Embeddings API keys
  • config.ini - App settings and authentication
  • rag_eval_config.yaml - Master configuration
  • test_dataset.csv - Sample test dataset

2. Configure

Edit the .env file:

  • Add your LLM API keys (Azure OpenAI, OpenAI, or Ollama credentials)
  • Set API endpoints and deployment names

Edit the config.ini file:

  • Set your RAG backend URL in app_url
  • Set API paths for context and answer endpoints (context_api_path, answer_api_path)
  • Configure authentication (cookie, bearer token, or API key)
  • Set MLflow tracking URI (default: http://127.0.0.1:5000)
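
Before running the built-in validation, it can help to confirm that the keys listed above are actually filled in. The sketch below is not part of RAGSentinel; it only uses Python's standard configparser. The key names come from this README, but the section layout of config.ini is not documented here, so the check searches every section, and the sample section name and values are purely illustrative.

```python
import configparser

# Keys named in this README. The section they live in is not documented
# here, so this sketch looks through every section of the file.
REQUIRED_KEYS = ("app_url", "context_api_path", "answer_api_path")


def missing_keys(ini_text: str, required=REQUIRED_KEYS) -> list[str]:
    """Return the required keys that are absent or empty in the INI text."""
    # interpolation=None keeps literal '%' characters (common in cookies) intact.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    found = set()
    for section in parser.sections():
        for key, value in parser.items(section):
            if value.strip():
                found.add(key)
    return [key for key in required if key not in found]


if __name__ == "__main__":
    # Hypothetical content: section name and values are assumptions.
    sample = """
[app]
app_url = http://localhost:8000
context_api_path = /api/context
"""
    print(missing_keys(sample))
```

To check a real file, pass its contents, for example `missing_keys(open("config.ini", encoding="utf-8").read())`.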

Edit the test_dataset.csv file:

  • Add your test queries, one per row, in the format: query,ground_truth,chat_id
  • Example: What is RAG?,RAG stands for Retrieval-Augmented Generation,1
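
Hand-written rows break easily when a query or ground truth contains a comma. A minimal sketch of generating the rows with Python's csv module, which quotes such fields automatically, is shown below. The column order is taken from this README; whether the file expects a header row is an assumption, so it is exposed as a parameter, and the second row is made-up sample data.

```python
import csv
import io

# Column order taken from this README; the header row is an assumption.
FIELDS = ("query", "ground_truth", "chat_id")


def dataset_csv(rows, header=True):
    """Return CSV text for (query, ground_truth, chat_id) rows."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    if header:
        writer.writerow(FIELDS)
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    rows = [
        ("What is RAG?", "RAG stands for Retrieval-Augmented Generation", 1),
        # The comma inside this ground truth is quoted by csv.writer.
        ("Name two metrics", "Faithfulness, Answer Relevancy", 2),
    ]
    print(dataset_csv(rows), end="")
```

The returned text can then be written to test_dataset.csv once the layout matches the sample file that `rag-sentinel init` generated.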

For detailed configuration help, see the comments in each config file.

3. Validate & Run

# Validate configuration
rag-sentinel validate

# Run evaluation
rag-sentinel run

Results will be available in the MLflow UI at the configured tracking URI.
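
In a scripted or CI setting, the two commands above can be chained so that the evaluation only starts when validation passes. The helper below is a sketch, not something shipped with RAGSentinel. It assumes the CLI exits with a non-zero status when validation fails, which this README does not state; the `runner` parameter is a hypothetical hook that makes the function easy to test.

```python
import subprocess


def validate_then_run(runner=subprocess.run):
    """Run `rag-sentinel validate`, then `rag-sentinel run` if it succeeded.

    Assumption: the CLI returns a non-zero exit code on failed validation.
    Returns the exit code of the last command that was executed.
    """
    validation = runner(["rag-sentinel", "validate"])
    if validation.returncode != 0:
        # Skip the evaluation and surface the validation exit code.
        return validation.returncode
    return runner(["rag-sentinel", "run"]).returncode
```

A wrapper script could end with `sys.exit(validate_then_run())` so that the pipeline step fails whenever either command fails.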

CLI Commands

# Initialize new project
rag-sentinel init

# Validate configuration
rag-sentinel validate

# Run evaluation (auto-starts MLflow)
rag-sentinel run

# Run without starting MLflow server
rag-sentinel run --no-server

# Overwrite existing config files
rag-sentinel init --force

# Check package version
pip show rag-sentinel

# Upgrade to latest version
pip install --upgrade rag-sentinel

Evaluation Categories

Set category in config.ini to choose the evaluation type:

Simple (Ragas Quality Metrics)

category = simple

  • Faithfulness - Factual consistency of the answer with the retrieved context
  • Answer Relevancy - How relevant the answer is to the question
  • Context Precision - Whether the relevant retrieved chunks are ranked above irrelevant ones
  • Answer Correctness - Agreement of the answer with the ground truth

Guardrail (Security Metrics)

category = guardrail

  • Toxicity Score - Detects toxic content in responses
  • Bias Score - Detects biased content in responses

Performance Metrics

Logged for all evaluation categories:

  • Avg Response Time - Average API response time (ms)
  • P90 Latency - 90th percentile latency (ms)
  • Queries Per Second - Throughput (QPS)
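
To make these three numbers concrete, the sketch below derives them from a list of per-query latencies. This README does not say how RAGSentinel computes them, so the definitions here are assumptions: a plain arithmetic mean, a nearest-rank 90th percentile, and throughput as query count divided by total wall-clock time. The function and key names are hypothetical.

```python
import statistics


def summarize_latencies(latencies_ms, wall_time_s):
    """Summarize per-query latencies (ms) collected over wall_time_s seconds."""
    if not latencies_ms or wall_time_s <= 0:
        raise ValueError("need at least one latency and a positive wall time")
    ordered = sorted(latencies_ms)
    count = len(ordered)
    # Nearest-rank P90: ceil(0.9 * count), computed with integers to avoid
    # floating-point rounding surprises.
    rank = (9 * count + 9) // 10
    return {
        "avg_response_time_ms": statistics.fmean(ordered),
        "p90_latency_ms": ordered[rank - 1],
        "queries_per_second": count / wall_time_s,
    }


if __name__ == "__main__":
    sample = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]
    print(summarize_latencies(sample, wall_time_s=5.0))
```

With ten queries finishing in five seconds, as in the sample, the average is 550 ms, the nearest-rank P90 is 900 ms, and the throughput is 2 QPS. Other percentile conventions (for example linear interpolation) give slightly different P90 values.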

License

MIT
