# EvalLite 🚀
An efficient, zero-cost LLM evaluation framework combining the simplicity of DeepEval with the power of free Hugging Face models through AILite.
## 🌟 Key Features
- Zero-Cost Evaluation: Leverage free Hugging Face models for LLM evaluation
- Simple Integration: Drop-in replacement for DeepEval's evaluation capabilities
- Extensive Model Support: Access to leading open-source models, including:
  - Meta Llama 3.1 70B Instruct
  - Qwen 2.5 72B Instruct
  - Mistral Nemo Instruct
  - Phi-3.5 Mini Instruct
  - and more
- Comprehensive Metrics: Full compatibility with DeepEval's evaluation metrics
- Async Support: Built-in asynchronous evaluation capabilities
## 📥 Installation

```bash
pip install evallite
```
## 🚀 Quick Start
Here's a simple example to get you started with EvalLite:
```python
from evallite import (
    assert_test,
    EvalLiteModel,
    LLMTestCase,
    evaluate,
    AnswerRelevancyMetric,
)

# Initialize metric with a specific model
answer_relevancy_metric = AnswerRelevancyMetric(
    threshold=0.7,
    model=EvalLiteModel(model="microsoft/Phi-3.5-mini-instruct")
)

# Create a test case
test_case = LLMTestCase(
    input="What if these shoes don't fit?",
    actual_output="We offer a 30-day full refund at no extra costs.",
    retrieval_context=["All customers are eligible for a 30 day full refund at no extra costs."]
)

# Run evaluation
evaluate([test_case], [answer_relevancy_metric])
```
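Under the hood, a metric such as `AnswerRelevancyMetric` produces a score and compares it against the `threshold` you configured; a case passes only when the score meets the threshold. A minimal stdlib-only sketch of that pass/fail logic (the names here are illustrative, not EvalLite's actual internals):

```python
from dataclasses import dataclass


@dataclass
class MetricResult:
    """Illustrative container for a single metric's outcome."""
    score: float
    threshold: float

    @property
    def passed(self) -> bool:
        # A test case passes when the metric score meets the threshold
        return self.score >= self.threshold


# e.g. a relevancy score of 0.82 against the 0.7 threshold used above
result = MetricResult(score=0.82, threshold=0.7)
```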
## 🔧 Available Models

EvalLite supports several powerful open-source models:
```python
from evallite import EvalLiteModel

# Available model options
models = [
    'meta-llama/Meta-Llama-3.1-70B-Instruct',
    'CohereForAI/c4ai-command-r-plus-08-2024',
    'Qwen/Qwen2.5-72B-Instruct',
    'nvidia/Llama-3.1-Nemotron-70B-Instruct-HF',
    'meta-llama/Llama-3.2-11B-Vision-Instruct',
    'NousResearch/Hermes-3-Llama-3.1-8B',
    'mistralai/Mistral-Nemo-Instruct-2407',
    'microsoft/Phi-3.5-mini-instruct',
]

# Initialize with specific model
evaluator = EvalLiteModel(model='microsoft/Phi-3.5-mini-instruct')
```
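Since `EvalLiteModel` takes the full Hugging Face model id as a string, it can be convenient to keep short aliases for the ids you use often. A small stdlib sketch (the alias names and `resolve_model` helper are hypothetical, not part of EvalLite):

```python
# Hypothetical aliases mapping short names to full Hugging Face model ids
MODEL_ALIASES = {
    "llama-70b": "meta-llama/Meta-Llama-3.1-70B-Instruct",
    "qwen-72b": "Qwen/Qwen2.5-72B-Instruct",
    "mistral-nemo": "mistralai/Mistral-Nemo-Instruct-2407",
    "phi-3.5-mini": "microsoft/Phi-3.5-mini-instruct",
}


def resolve_model(name: str) -> str:
    """Return the full model id for an alias; pass full ids through unchanged."""
    if name in MODEL_ALIASES:
        return MODEL_ALIASES[name]
    if name in MODEL_ALIASES.values():
        return name
    raise ValueError(f"Unknown model: {name!r}")
```

The resolved id can then be passed straight to `EvalLiteModel(model=resolve_model("phi-3.5-mini"))`.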
## 📊 Advanced Usage

### Custom Schema Support

EvalLite supports custom response schemas using Pydantic models:
```python
from pydantic import BaseModel
from typing import List


class Statements(BaseModel):
    statements: List[str]


# Use with schema
result = evaluator.generate(
    prompt="List three facts about climate change",
    schema=Statements
)
```
### Async Evaluation
```python
import asyncio


async def evaluate_async():
    response = await evaluator.a_generate(
        prompt="What is the capital of France?",
        schema=Statements
    )
    return response


# Run the coroutine from synchronous code
response = asyncio.run(evaluate_async())
```
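The benefit of async generation is concurrency: several prompts can be in flight at once via `asyncio.gather`. A runnable stdlib sketch of that pattern, with a stand-in coroutine in place of the real network call (`fake_a_generate` is hypothetical):

```python
import asyncio


async def fake_a_generate(prompt: str) -> str:
    # Stand-in for the model call; a real call would await network I/O here
    await asyncio.sleep(0)
    return f"response to: {prompt}"


async def main() -> list[str]:
    prompts = ["Q1", "Q2", "Q3"]
    # gather schedules all generations concurrently instead of one at a time
    return await asyncio.gather(*(fake_a_generate(p) for p in prompts))


responses = asyncio.run(main())
```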
### Batch Evaluation
```python
from evallite import EvaluationDataset

# Create multiple test cases
test_cases = [
    LLMTestCase(
        input="Question 1",
        actual_output="Answer 1",
        retrieval_context=["Context 1"]
    ),
    LLMTestCase(
        input="Question 2",
        actual_output="Answer 2",
        retrieval_context=["Context 2"]
    ),
]

# Create dataset
dataset = EvaluationDataset(test_cases=test_cases)

# Evaluate all at once
evaluate(dataset, [answer_relevancy_metric])
```
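Conceptually, an `EvaluationDataset` is a named collection of test cases that `evaluate` iterates over, applying every metric to every case. A stdlib-only sketch of that grouping (the `Case`/`Dataset` names are illustrative, not EvalLite's internals):

```python
from dataclasses import dataclass, field


@dataclass
class Case:
    input: str
    actual_output: str
    retrieval_context: list[str] = field(default_factory=list)


@dataclass
class Dataset:
    test_cases: list[Case] = field(default_factory=list)

    def __iter__(self):
        # Lets evaluation code treat a dataset like a plain list of cases
        return iter(self.test_cases)


dataset = Dataset(test_cases=[
    Case("Question 1", "Answer 1", ["Context 1"]),
    Case("Question 2", "Answer 2", ["Context 2"]),
])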
## 🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
1. Fork the repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request
## 📄 License

This project is licensed under the Apache 2.0 License - see the LICENSE file for details.
## 🙏 Acknowledgments

- DeepEval, whose evaluation framework and metrics EvalLite builds on
- AILite, which provides free access to Hugging Face models