
Project description

RAG Evaluator

Overview

RAG Evaluator is a Python library for evaluating Retrieval-Augmented Generation (RAG) systems. It provides various metrics to evaluate the quality of generated text against reference text.

Installation

You can install the library using pip:

pip install rag-evaluate

Usage

Here's how to use the RAG Evaluator library:

from rag_evaluate import RAG_Evaluator

# Initialize the evaluator
evaluator = RAG_Evaluator()

# Input data
question = "What are the causes of climate change?"
generated_text = "Climate change is caused by human activities."
context = "Human activities such as burning fossil fuels cause climate change."

# Evaluate the response
bleu_score = evaluator.bleu_score(question, generated_text, context)
rouge_score = evaluator.rouge_score(question, generated_text, context)
bert_score = evaluator.bert_score(question, generated_text, context)

# Print the results
print(bleu_score)
print(rouge_score)
print(bert_score)

The RAG Evaluator provides the following metrics:

BLEU (0-100): Measures the overlap between the generated output and reference text based on n-grams.

0-20: Low similarity, 20-40: Medium-low, 40-60: Medium, 60-80: High, 80-100: Very high.

ROUGE-1 (0-1): Measures the overlap of unigrams between the generated output and reference text.

0.0-0.2: Poor overlap, 0.2-0.4: Fair, 0.4-0.6: Good, 0.6-0.8: Very good, 0.8-1.0: Excellent.

BERT Score (0-1): Evaluates the semantic similarity using BERT embeddings (Precision, Recall, F1).

0.0-0.5: Low similarity, 0.5-0.7: Moderate, 0.7-0.8: Good, 0.8-0.9: High, 0.9-1.0: Very high.
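To illustrate what ROUGE-1 measures, here is a minimal sketch of unigram-overlap F1. This is not the library's implementation — rag-evaluate's tokenization and scoring details may differ — it only shows the idea behind the 0-1 score above:

```python
import re
from collections import Counter

def rouge1_f1(generated: str, reference: str) -> float:
    """Illustrative ROUGE-1 F1: unigram overlap between two texts."""
    gen = Counter(re.findall(r"\w+", generated.lower()))
    ref = Counter(re.findall(r"\w+", reference.lower()))
    overlap = sum((gen & ref).values())  # number of matching unigrams
    if overlap == 0:
        return 0.0
    precision = overlap / sum(gen.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

generated = "Climate change is caused by human activities."
context = "Human activities such as burning fossil fuels cause climate change."
print(round(rouge1_f1(generated, context), 3))  # → 0.471
```

Here four unigrams (climate, change, human, activities) match out of 7 generated and 10 reference tokens, giving precision 4/7 and recall 4/10 — a "good" overlap on the scale above.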

Contributor

Biplab Sil.

Project details


Download files

Download the file for your platform.

Source Distribution

rag_evaluate-0.5.0.tar.gz (2.5 kB view details)

Uploaded Source

Built Distribution


rag_evaluate-0.5.0-py3-none-any.whl (2.7 kB view details)

Uploaded Python 3

File details

Details for the file rag_evaluate-0.5.0.tar.gz.

File metadata

  • Download URL: rag_evaluate-0.5.0.tar.gz
  • Upload date:
  • Size: 2.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.2

File hashes

Hashes for rag_evaluate-0.5.0.tar.gz
Algorithm Hash digest
SHA256 360c3683ef4b303aca26885538ff2b41759b574d40bed1bab7dfddbcd6842949
MD5 03c74c61558bede16becba37ac0e9e6c
BLAKE2b-256 dd7be0e26464162454442d06ddd7b97cacc0586e898045c05e1f3047be5ae6a0


File details

Details for the file rag_evaluate-0.5.0-py3-none-any.whl.

File metadata

  • Download URL: rag_evaluate-0.5.0-py3-none-any.whl
  • Upload date:
  • Size: 2.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.2

File hashes

Hashes for rag_evaluate-0.5.0-py3-none-any.whl
Algorithm Hash digest
SHA256 f3ccca3867ed727fc98440525783978ac2ce701deb65a5f761cbce618ea507e2
MD5 e9c09ba978bcb3bbc9733ca77867f067
BLAKE2b-256 5946416e08b3cd2cc605e49d253fca33f161d4e75d0dddf09424995306cafae3

