LangRAGEval is a library for evaluating responses based on faithfulness, context recall, answer relevancy, and context relevancy.

Project description

LangGPTEval

Python 3.11+ · PyPI · License: MIT

LangGPTEval is an evaluation library for Retrieval-Augmented Generation (RAG) responses. It measures the faithfulness, context recall, answer relevancy, and context relevancy of responses generated by various models, including OpenAI, Azure, and custom models. With a modular architecture and Pydantic-based input validation, LangGPTEval aims to deliver reliable and consistent evaluation metrics.

🌟 Introduction

LangGPTEval is designed to evaluate the quality of responses generated by RAG models. It supports multiple metrics for evaluation:

  • Faithfulness: How true the response is to the given context.
  • Context Recall: How well the response recalls the given context.
  • Answer Relevancy: How relevant the response is to the question.
  • Context Relevancy: How relevant the response is to the context.

LangGPTEval is highly customizable, allowing users to plug in their models and tailor the evaluation to their specific needs.
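
As the examples below show, plugging in a model only requires an object with an invoke method that takes a prompt and returns a score as a string. A minimal sketch of that contract (the Protocol name here is illustrative, not part of the library):

from typing import Any, Protocol

class EvaluationModel(Protocol):
    """Anything exposing invoke(prompt) -> str can be passed to the evaluation functions."""

    def invoke(self, prompt: Any) -> str:
        """Return a numeric score as a string (e.g. "0.9") for the given evaluation prompt."""
        ...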

🛠️ Installation

You can install LangGPTEval using pip:

pip install LangGPTEval

⚡ Quick Start

Here’s a quick start guide to get you up and running with LangGPTEval.

  1. Install the library.
  2. Prepare your data.
  3. Evaluate your model (a minimal end-to-end sketch follows below).
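
A minimal end-to-end sketch tying these steps together, using the API covered in the Usage section; the context and response strings are placeholders, and the model object is whatever wrapper you set up below:

from LangGPTEval.models import EvaluationInput, ContextData
from LangGPTEval.evaluation import evaluate_faithfulness

# 2. Prepare your data: the retrieved context plus the generated response.
input_data = EvaluationInput(
    context=[ContextData(page_content="Paris is the capital of France.")],
    response="The capital of France is Paris.",
)

# 3. Evaluate with any model object exposing invoke(prompt) -> str (see Usage below).
# result = evaluate_faithfulness(input_data, my_model)
# print(result.score)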

📚 Usage

Importing the Library

First, import the necessary components from the LangGPTEval library.

from typing import Any

from LangGPTEval.models import EvaluationInput, ContextData
from LangGPTEval.evaluation import evaluate_faithfulness, evaluate_context_recall, evaluate_answer_relevancy, evaluate_context_relevancy
from langchain.llms import OpenAI

Setting Up Your Model

Create an instance of your model. Here, we demonstrate using LangChain’s OpenAI model.

class LangChainOpenAIModel:
    """Wraps a LangChain OpenAI LLM behind the invoke(prompt) -> str interface."""

    def __init__(self, api_key: str):
        self.llm = OpenAI(api_key=api_key)

    def invoke(self, prompt: Any) -> str:
        # The evaluation prompts ask the LLM for a numeric score,
        # so the stripped completion text is returned as the score string.
        response = self.llm(prompt)
        return response.strip()
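
Because the library also targets Azure-hosted models, a similar wrapper can be written around LangChain's AzureOpenAI class. This is a sketch only; the constructor arguments (deployment name, API key, endpoint) depend on your Azure resource and LangChain version:

from langchain.llms import AzureOpenAI  # legacy LangChain LLM interface, matching the example above

class LangChainAzureModel:
    """Same invoke(prompt) -> str contract, backed by an Azure OpenAI deployment."""

    def __init__(self, deployment_name: str, api_key: str):
        # Assumed parameters -- adjust to match your Azure setup and LangChain version.
        self.llm = AzureOpenAI(deployment_name=deployment_name, openai_api_key=api_key)

    def invoke(self, prompt) -> str:
        return self.llm(prompt).strip()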

Example Data

Prepare the input data for evaluation.

context = [ContextData(page_content="Test context")]
response = "Test response"
input_data = EvaluationInput(context=context, response=response)
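
In practice the context usually comes from your retriever, and it can contain multiple chunks; each chunk becomes one ContextData entry. A small sketch with placeholder data:

# Suppose these chunks came back from a retriever (placeholder content):
retrieved_chunks = [
    ContextData(page_content="Test context, chunk 1"),
    ContextData(page_content="Test context, chunk 2"),
]
input_data = EvaluationInput(context=retrieved_chunks, response="Test response")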

Evaluating the Model

Use the evaluation functions to evaluate the model’s performance.

# Replace 'your-openai-api-key' with your actual OpenAI API key
api_key = 'your-openai-api-key'
openai_model = LangChainOpenAIModel(api_key)

try:
    # Evaluate with the LangChain OpenAI model
    faithfulness_result = evaluate_faithfulness(input_data, openai_model)
    context_recall_result = evaluate_context_recall(input_data, openai_model)
    answer_relevancy_result = evaluate_answer_relevancy(input_data, openai_model)
    context_relevancy_result = evaluate_context_relevancy(input_data, openai_model)

    print(faithfulness_result.score)
    print(context_recall_result.score)
    print(answer_relevancy_result.score)
    print(context_relevancy_result.score)
except ValueError as e:
    print(f"An error occurred during evaluation: {str(e)}")

🔍 Examples

Example with Custom Model

class CustomModel:
    def invoke(self, prompt):
        # Custom model implementation
        return "0.9"  # Example score

# Create a custom model instance
custom_model = CustomModel()

try:
    # Evaluate with the custom model
    faithfulness_result = evaluate_faithfulness(input_data, custom_model)
    context_recall_result = evaluate_context_recall(input_data, custom_model)
    answer_relevancy_result = evaluate_answer_relevancy(input_data, custom_model)
    context_relevancy_result = evaluate_context_relevancy(input_data, custom_model)

    print(faithfulness_result.score)
    print(context_recall_result.score)
    print(answer_relevancy_result.score)
    print(context_relevancy_result.score)
except ValueError as e:
    print(f"An error occurred during evaluation: {str(e)}")

🤝 Contributing

Contributions are welcome! Please read the contributing guidelines before making a pull request.

Steps to Contribute

  1. Fork the repository.
  2. Create a new branch (git checkout -b feature-branch).
  3. Make your changes.
  4. Commit your changes (git commit -m 'Add new feature').
  5. Push to the branch (git push origin feature-branch).
  6. Open a pull request.

📜 License

LangGPTEval is licensed under the MIT License. See the LICENSE file for more details.

Happy Evaluating! 🎉

LangGPTEval is here to make your RAG model evaluations precise and easy. If you have any questions or need further assistance, feel free to reach out to me.


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

langrageval-0.1.1.tar.gz (5.4 kB, Source)

Built Distribution

LangRAGEval-0.1.1-py3-none-any.whl (6.5 kB, Python 3)

File details

Details for the file langrageval-0.1.1.tar.gz.

File metadata

  • Download URL: langrageval-0.1.1.tar.gz
  • Upload date:
  • Size: 5.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.11.5

File hashes

Hashes for langrageval-0.1.1.tar.gz

  • SHA256: a358ec6dcf46097c1b309258922cbdef5eb92bba3419b387e27b5f74c9cdfc9c
  • MD5: 8c3500066d7b461b7a3cefa5663abd51
  • BLAKE2b-256: f58fa258a1de1702abc1871455cb8e3185419d1ae97e59035c1274ccc08eca0b


File details

Details for the file LangRAGEval-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: LangRAGEval-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 6.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.11.5

File hashes

Hashes for LangRAGEval-0.1.1-py3-none-any.whl

  • SHA256: 91f879feee995886d2f129f6326ff9bcf189a18ee1a7f48b0a0482c0b4dbfe73
  • MD5: 466072abb59c1e5bb2bfc1037c6acf3c
  • BLAKE2b-256: fd0303c1bdbb34cd45ca97b6210fb2044cdae294c08ffe36f7592b26ee8acf65

