trustrag-eval

RAGAS-based evaluation pipeline with trust-specific metrics for TrustRAG.

Getting Started

# Install in development mode
pip install -e packages/trustrag-eval

# Run tests
cd packages/trustrag-eval && pytest tests/ -v

Metrics Explained

RAGAS Metrics (industry-standard)

Metric | What it measures
Faithfulness | Does the answer stay within retrieved context?
Answer Relevancy | Does the answer address the question?
Context Precision | How relevant are the retrieved chunks?
Context Recall | Did we retrieve the chunks needed for ground truth?

Trust-Specific Metrics (TrustRAG)

Metric | What it measures
Trust Score Distribution (p25/p50/p75/mean) | Overall trust score health across queries
Flagged Rate | Percentage of queries with trust_score < 50
Hit@5 | Was the ground-truth chunk in the top-5 retrieved?
Hit@5 by Category | Hit rate broken down by query type (semantic/keyword/hybrid)
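
To make the Hit@5 definitions above concrete, here is a minimal sketch of how a per-category hit rate could be computed. It is not the package's implementation: the function names and the record fields (`category`, `retrieved_ids`, `ground_truth_id`) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def hit_at_k(retrieved_ids, ground_truth_id, k=5):
    """Return True if the ground-truth chunk is among the top-k retrieved chunks."""
    return ground_truth_id in retrieved_ids[:k]


def hit_rate_by_category(records, k=5):
    """Fraction of queries with a hit in the top-k, grouped by query category.

    Each record is assumed (hypothetically) to look like:
    {"category": "semantic", "retrieved_ids": [...], "ground_truth_id": "..."}
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for rec in records:
        cat = rec["category"]
        totals[cat] += 1
        # bool is added as 0 or 1
        hits[cat] += hit_at_k(rec["retrieved_ids"], rec["ground_truth_id"], k)
    return {cat: hits[cat] / totals[cat] for cat in totals}


# Illustrative data only, not taken from the benchmark dataset
records = [
    {"category": "semantic", "retrieved_ids": ["a", "b", "c"], "ground_truth_id": "b"},
    {"category": "semantic", "retrieved_ids": ["a", "b", "c", "d", "e", "f"], "ground_truth_id": "f"},
    {"category": "keyword", "retrieved_ids": ["x"], "ground_truth_id": "x"},
]
rates = hit_rate_by_category(records)
print(rates)
```

The second semantic query misses because its ground-truth chunk is ranked sixth, outside the top 5.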

Usage

from trustrag_eval import load_synthetic_queries, compute_trust_metrics

# Load benchmark dataset
queries = load_synthetic_queries("eval/synthetic_queries.json")

# Compute trust distribution from results
results = [{"trust_score": 85}, {"trust_score": 72}]  # ... one dict per evaluated query
dist = compute_trust_metrics(results)
print(f"Median trust: {dist['p50']}, Flagged: {dist['flagged_pct']:.0%}")
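
For readers who want to see what such a distribution involves, the following is a rough standard-library sketch, not the actual `compute_trust_metrics` implementation. The function name `sketch_trust_distribution` is hypothetical. The keys `p50` and `flagged_pct` come from the usage above; `p25`, `p75`, and `mean` are assumed from the metric table, and `flagged_pct` is assumed to be a fraction between 0 and 1 because the usage formats it with `:.0%`. The choice of the inclusive quantile method is also an assumption.

```python
from statistics import mean, quantiles

# Threshold taken from the Flagged Rate definition: trust_score < 50
FLAG_THRESHOLD = 50


def sketch_trust_distribution(results):
    """Summarize trust scores as quartiles, mean, and flagged fraction.

    Hypothetical stand-in for illustration; the real package may differ.
    """
    scores = [r["trust_score"] for r in results]
    if len(scores) < 2:
        raise ValueError("need at least two results to estimate quartiles")
    # n=4 yields the three quartile cut points: p25, p50, p75
    p25, p50, p75 = quantiles(scores, n=4, method="inclusive")
    flagged = sum(1 for s in scores if s < FLAG_THRESHOLD)
    return {
        "p25": p25,
        "p50": p50,
        "p75": p75,
        "mean": mean(scores),
        "flagged_pct": flagged / len(scores),  # fraction in [0, 1]
    }


# Illustrative scores only
sample = [{"trust_score": s} for s in (85, 72, 40, 90)]
summary = sketch_trust_distribution(sample)
print(f"Median trust: {summary['p50']}, Flagged: {summary['flagged_pct']:.0%}")
```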
