LangRAGEval is a library for evaluating responses based on faithfulness, context recall, answer relevancy, and context relevancy.
Project description
LangGPTEval
LangGPTEval is an evaluation library designed for Retrieval-Augmented Generation (RAG) responses. It evaluates the faithfulness, context recall, answer relevancy, and context relevancy of responses generated by various models, including OpenAI, Azure, and custom models. Built with Pydantic validation throughout, LangGPTEval ensures reliable and accurate evaluation metrics.
🌟 Introduction
LangGPTEval is designed to evaluate the quality of responses generated by RAG models. It supports multiple metrics for evaluation:
- Faithfulness: How true the response is to the given context.
- Context Recall: How well the response recalls the given context.
- Answer Relevancy: How relevant the response is to the question.
- Context Relevancy: How relevant the response is to the context.
LangGPTEval is highly customizable, allowing users to plug in their models and tailor the evaluation to their specific needs.
🛠️ Installation
You can install LangGPTEval using pip:
```
pip install LangGPTEval
```
⚡ Quick Start
Here’s a quick start guide to get you up and running with LangGPTEval.
- Install the library.
- Prepare your data.
- Evaluate your model (a condensed sketch follows below).
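The three steps above condense into a few lines. The sketch below simply strings together the calls documented in the Usage section; `StubModel` is a hypothetical stand-in for any object exposing an `invoke(prompt) -> str` method, modeled on the custom-model example further down.

```python
from LangGPTEval.models import EvaluationInput, ContextData
from LangGPTEval.evaluation import evaluate_faithfulness

class StubModel:
    """Any object with an invoke(prompt) -> str method can act as the model."""
    def invoke(self, prompt):
        return "0.9"  # a real model would score the prompt; this stub returns a fixed value

# Prepare the retrieved context and the generated response.
input_data = EvaluationInput(
    context=[ContextData(page_content="Test context")],
    response="Test response",
)

# Evaluate a single metric and read its score.
result = evaluate_faithfulness(input_data, StubModel())
print(result.score)
```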
📚 Usage
Importing the Library
First, import the necessary components from the LangGPTEval library.
```python
from typing import Any  # used in the model wrapper's type hints below

from LangGPTEval.models import EvaluationInput, ContextData
from LangGPTEval.evaluation import (
    evaluate_faithfulness,
    evaluate_context_recall,
    evaluate_answer_relevancy,
    evaluate_context_relevancy,
)
from langchain.llms import OpenAI
```
Setting Up Your Model
Create an instance of your model. Here, we demonstrate using LangChain’s OpenAI model.
```python
class LangChainOpenAIModel:
    """Wraps LangChain's OpenAI LLM behind the invoke(prompt) -> str interface."""

    def __init__(self, api_key: str):
        self.llm = OpenAI(api_key=api_key)

    def invoke(self, prompt: Any) -> str:
        # The evaluation functions expect the model to return its score as a string.
        response = self.llm(prompt)
        return response.strip()
```
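The introduction also mentions Azure-hosted models. Any wrapper exposing the same `invoke(prompt) -> str` method will do; the sketch below is only an assumption-laden example using LangChain's `AzureOpenAI` class, and the constructor parameter names (`deployment_name`, `openai_api_key`) follow older LangChain releases, so adjust them to match your LangChain version and Azure configuration.

```python
from typing import Any
from langchain.llms import AzureOpenAI  # assumes the classic langchain LLM wrapper is available

class LangChainAzureModel:
    """Hypothetical Azure-backed wrapper exposing invoke(prompt) -> str."""

    def __init__(self, deployment_name: str, api_key: str):
        # Parameter names follow older langchain releases; newer releases may differ.
        self.llm = AzureOpenAI(deployment_name=deployment_name, openai_api_key=api_key)

    def invoke(self, prompt: Any) -> str:
        return self.llm(prompt).strip()
```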
Example Data
Prepare the input data for evaluation.
```python
# The context is a list of ContextData chunks (the retrieved passages),
# and the response is the model-generated answer being evaluated.
context = [ContextData(page_content="Test context")]
response = "Test response"
input_data = EvaluationInput(context=context, response=response)
```
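In a real RAG pipeline the context usually holds several retrieved chunks rather than one. Since `context` is already a list of `ContextData`, passing multiple chunks is a natural extension; the document texts below are made up purely for illustration.

```python
retrieved_chunks = [
    "The Eiffel Tower is located in Paris.",
    "It was completed in 1889 for the World's Fair.",
]
context = [ContextData(page_content=chunk) for chunk in retrieved_chunks]
input_data = EvaluationInput(
    context=context,
    response="The Eiffel Tower, finished in 1889, stands in Paris.",
)
```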
Evaluating the Model
Use the evaluation functions to evaluate the model’s performance.
```python
# Replace 'your-openai-api-key' with your actual OpenAI API key
api_key = "your-openai-api-key"
openai_model = LangChainOpenAIModel(api_key)

try:
    # Evaluate with the LangChain OpenAI model
    faithfulness_result = evaluate_faithfulness(input_data, openai_model)
    context_recall_result = evaluate_context_recall(input_data, openai_model)
    answer_relevancy_result = evaluate_answer_relevancy(input_data, openai_model)
    context_relevancy_result = evaluate_context_relevancy(input_data, openai_model)

    print(faithfulness_result.score)
    print(context_recall_result.score)
    print(answer_relevancy_result.score)
    print(context_relevancy_result.score)
except ValueError as e:
    print(f"An error occurred during evaluation: {str(e)}")
```
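Each result object exposes a `score` attribute; judging by the custom-model example below, it is a numeric value serialized as a string (e.g. `"0.9"`). If that assumption holds for your setup, a small helper like the hypothetical `summarize_scores` below can collect the metrics into floats for reporting or thresholding once the evaluations above have succeeded.

```python
def summarize_scores(**results) -> dict:
    """Convert each result.score string into a float, keyed by metric name (assumes numeric scores)."""
    return {name: float(result.score) for name, result in results.items()}

scores = summarize_scores(
    faithfulness=faithfulness_result,
    context_recall=context_recall_result,
    answer_relevancy=answer_relevancy_result,
    context_relevancy=context_relevancy_result,
)
print(scores)
print(f"average: {sum(scores.values()) / len(scores):.2f}")
```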
🔍 Examples
Example with Custom Model
```python
class CustomModel:
    def invoke(self, prompt):
        # Custom model implementation
        return "0.9"  # Example score

# Create a custom model instance
custom_model = CustomModel()

try:
    # Evaluate with the custom model
    faithfulness_result = evaluate_faithfulness(input_data, custom_model)
    context_recall_result = evaluate_context_recall(input_data, custom_model)
    answer_relevancy_result = evaluate_answer_relevancy(input_data, custom_model)
    context_relevancy_result = evaluate_context_relevancy(input_data, custom_model)

    print(faithfulness_result.score)
    print(context_recall_result.score)
    print(answer_relevancy_result.score)
    print(context_relevancy_result.score)
except ValueError as e:
    print(f"An error occurred during evaluation: {str(e)}")
```
🤝 Contributing
Contributions are welcome! Please read the contributing guidelines before making a pull request.
Steps to Contribute
- Fork the repository.
- Create a new branch (`git checkout -b feature-branch`).
- Make your changes.
- Commit your changes (`git commit -m 'Add new feature'`).
- Push to the branch (`git push origin feature-branch`).
- Open a pull request.
📜 License
LangGPTEval is licensed under the MIT License. See the LICENSE file for more details.
Happy Evaluating! 🎉
LangGPTEval is here to make your RAG model evaluations precise and easy. If you have any questions or need further assistance, feel free to reach out to me.