
Project description

RAG Evaluator

Overview

RAG Evaluator is a Python library for evaluating Retrieval-Augmented Generation (RAG) systems. It provides various metrics to evaluate the quality of generated text against reference text.

Installation

You can install the library using pip:

pip install rag-evaluate

Usage

Here's how to use the RAG Evaluator library:

from rag_evaluate import RAG_Evaluator

Initialize the evaluator

evaluator = RAG_Evaluator()

Input data

question = "What are the causes of climate change?"
generated_text = "Climate change is caused by human activities."
context = "Human activities such as burning fossil fuels cause climate change."

Evaluate the response

bleu_score = evaluator.bleu_score(question, generated_text, context)
rouge_score = evaluator.rouge_score(question, generated_text, context)
bert_score = evaluator.bert_score(question, generated_text, context)

Print the results

print(bleu_score)
print(rouge_score)
print(bert_score)

The RAG Evaluator provides the following metrics:

BLEU (0-100): Measures the overlap between the generated output and reference text based on n-grams.

0-20: Low similarity, 20-40: Medium-low, 40-60: Medium, 60-80: High, 80-100: Very high.
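To make the n-gram overlap idea concrete, here is a minimal, self-contained sketch of a BLEU-style score in plain Python. It is an illustration of the technique, not the library's implementation: it uses clipped n-gram precisions up to bigrams and a brevity penalty, scaled to the 0-100 range described above.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def simple_bleu(candidate, reference, max_n=2):
    """Toy BLEU: geometric mean of clipped n-gram precisions
    times a brevity penalty, scaled to 0-100."""
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        # Clip each n-gram's count to its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = max(sum(cand_counts.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    # Penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return 100 * bp * geo_mean

print(simple_bleu("the cat sat", "the cat sat"))  # identical texts score 100.0
```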

ROUGE-1 (0-1): Measures the overlap of unigrams between the generated output and reference text.

0.0-0.2: Poor overlap, 0.2-0.4: Fair, 0.4-0.6: Good, 0.6-0.8: Very good, 0.8-1.0: Excellent.
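The unigram overlap behind ROUGE-1 can be sketched in a few lines. This is a simplified stand-in for the library's metric: precision and recall over word counts, combined into an F1 in the 0-1 range.

```python
from collections import Counter

def rouge1_f1(candidate, reference):
    """Toy ROUGE-1: unigram precision/recall/F1 over word counts (0-1)."""
    cand = Counter(candidate.split())
    ref = Counter(reference.split())
    # Count each word at most as often as it appears in the reference.
    overlap = sum(min(c, ref[w]) for w, c in cand.items())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

print(rouge1_f1("the cat", "the cat sat"))  # high precision, partial recall
```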

BERT Score (0-1): Evaluates the semantic similarity using BERT embeddings (Precision, Recall, F1).

0.0-0.5: Low similarity, 0.5-0.7: Moderate, 0.7-0.8: Good, 0.8-0.9: High, 0.9-1.0: Very high.

Contributor

Biplab Sil.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

rag_evaluate-0.4.0.tar.gz (2.4 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

rag_evaluate-0.4.0-py3-none-any.whl (2.7 kB view details)

Uploaded Python 3

File details

Details for the file rag_evaluate-0.4.0.tar.gz.

File metadata

  • Download URL: rag_evaluate-0.4.0.tar.gz
  • Upload date:
  • Size: 2.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.2

File hashes

Hashes for rag_evaluate-0.4.0.tar.gz
Algorithm Hash digest
SHA256 2b14c7b3b9e67b0df55e75d404eb0ab0f8c91936aaee2b638c02d079301750ef
MD5 103a1413d3011c05e6f4f82206fe486f
BLAKE2b-256 c344f94090abf32cee7b8b0de6b3283059beb04042699b8656cf99b2788fb25b


File details

Details for the file rag_evaluate-0.4.0-py3-none-any.whl.

File metadata

  • Download URL: rag_evaluate-0.4.0-py3-none-any.whl
  • Upload date:
  • Size: 2.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.2

File hashes

Hashes for rag_evaluate-0.4.0-py3-none-any.whl
Algorithm Hash digest
SHA256 30d888a2663745bce55dbbdd469de0f747dbe33e8ea83d61bec66e67231be515
MD5 aab04137f4fa6f9007c0c96955edb1f6
BLAKE2b-256 95cc7f2fc76e8026bc162f9c0a58510a4956d8597d23fd96f967fd85c9b3bd85

