Project description

EvalLite 🚀

An efficient, zero-cost LLM evaluation framework combining the simplicity of DeepEval with the power of free Hugging Face models through AILite.

🌟 Key Features

  • Zero-Cost Evaluation: Leverage free Hugging Face models for LLM evaluation
  • Simple Integration: Drop-in replacement for DeepEval's evaluation capabilities
  • Extensive Model Support: Access to leading open-source models including:
    • Meta Llama 3.1 70B Instruct
    • Qwen 2.5 72B Instruct
    • Mistral Nemo Instruct
    • Phi-3.5 Mini Instruct
    • And more!
  • Comprehensive Metrics: Full compatibility with DeepEval's evaluation metrics
  • Async Support: Built-in asynchronous evaluation capabilities

📥 Installation

pip install evallite

🚀 Quick Start

Here's a simple example to get you started with EvalLite:

from evallite import (
    assert_test,
    EvalLiteModel,
    LLMTestCase,
    evaluate,
    AnswerRelevancyMetric
)

# Initialize metric with a specific model
answer_relevancy_metric = AnswerRelevancyMetric(
    threshold=0.7,
    model=EvalLiteModel(model="microsoft/Phi-3.5-mini-instruct")
)

# Create a test case
test_case = LLMTestCase(
    input="What if these shoes don't fit?",
    actual_output="We offer a 30-day full refund at no extra costs.",
    retrieval_context=["All customers are eligible for a 30 day full refund at no extra costs."]
)

# Run evaluation
evaluate([test_case], [answer_relevancy_metric])
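
The assert_test helper imported above follows DeepEval's pytest integration, so the same check can also run as a unit test; a minimal sketch, assuming the metric and test case defined above are in scope:

# test_chatbot.py: run with pytest; assert_test fails the test
# whenever the metric score falls below its 0.7 threshold
def test_answer_relevancy():
    assert_test(test_case, [answer_relevancy_metric])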

🔧 Available Models

EvalLite supports several powerful open-source models:

from evallite import EvalLiteModel

# Available model options
models = [
    'meta-llama/Meta-Llama-3.1-70B-Instruct',
    'CohereForAI/c4ai-command-r-plus-08-2024',
    'Qwen/Qwen2.5-72B-Instruct',
    'nvidia/Llama-3.1-Nemotron-70B-Instruct-HF',
    'meta-llama/Llama-3.2-11B-Vision-Instruct',
    'NousResearch/Hermes-3-Llama-3.1-8B',
    'mistralai/Mistral-Nemo-Instruct-2407',
    'microsoft/Phi-3.5-mini-instruct'
]

# Initialize with specific model
evaluator = EvalLiteModel(model='microsoft/Phi-3.5-mini-instruct')
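
Because each metric receives its model at construction time, the same test case can be scored against several backends for a quick comparison; a sketch, assuming the Quick Start's test_case is in scope (measure() and .score come from DeepEval's metric interface):

# Score one test case with two different backends and compare
for model_id in ['microsoft/Phi-3.5-mini-instruct',
                 'Qwen/Qwen2.5-72B-Instruct']:
    metric = AnswerRelevancyMetric(
        threshold=0.7,
        model=EvalLiteModel(model=model_id)
    )
    metric.measure(test_case)  # synchronous scoring call
    print(f"{model_id}: {metric.score:.2f}")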

📊 Advanced Usage

Custom Schema Support

EvalLite supports custom response schemas using Pydantic models:

from pydantic import BaseModel
from typing import List

class Statements(BaseModel):
    statements: List[str]

# Use with schema
result = evaluator.generate(
    prompt="List three facts about climate change",
    schema=Statements
)
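
Assuming generate follows DeepEval's custom-model contract and returns a populated instance of the schema class, the parsed fields are ordinary typed attributes:

# result is a Statements instance rather than raw text
for statement in result.statements:
    print(statement)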

Async Evaluation

EvalLiteModel also exposes an asynchronous a_generate method, mirroring the synchronous generate call:

async def evaluate_async():
    response = await evaluator.a_generate(
        prompt="What is the capital of France?",
        schema=Statements
    )
    return response
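
Like any coroutine, evaluate_async has to be driven by an event loop, for example with asyncio.run:

import asyncio

# Run the coroutine from synchronous code
response = asyncio.run(evaluate_async())
print(response)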

Batch Evaluation

Multiple test cases can be grouped into an EvaluationDataset and evaluated in a single call:

from evallite import EvaluationDataset

# Create multiple test cases
test_cases = [
    LLMTestCase(
        input="Question 1",
        actual_output="Answer 1",
        retrieval_context=["Context 1"]
    ),
    LLMTestCase(
        input="Question 2",
        actual_output="Answer 2",
        retrieval_context=["Context 2"]
    )
]

# Create dataset
dataset = EvaluationDataset(test_cases=test_cases)

# Evaluate all at once
evaluate(dataset, [answer_relevancy_metric])
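
Since evaluate accepts a list of metrics, one pass over the dataset can score several criteria at once; a sketch, assuming DeepEval's FaithfulnessMetric is imported directly (it may not be re-exported by EvalLite):

from deepeval.metrics import FaithfulnessMetric

# A second metric backed by the same free model
faithfulness_metric = FaithfulnessMetric(
    threshold=0.7,
    model=EvalLiteModel(model='microsoft/Phi-3.5-mini-instruct')
)

# Score every test case against both metrics in one pass
evaluate(dataset, [answer_relevancy_metric, faithfulness_metric])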

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

📄 License

This project is licensed under the Apache License 2.0; see the LICENSE file for details.

🙏 Acknowledgments

  • DeepEval for the evaluation framework
  • AILite for providing free model access
  • The open-source community for making powerful language models accessible

Download files

Download the file for your platform.

Source Distribution

evallite-0.2.0.tar.gz (4.8 kB)

Built Distribution

evallite-0.2.0-py3-none-any.whl (4.9 kB)

File details

Details for the file evallite-0.2.0.tar.gz.

File metadata

  • Download URL: evallite-0.2.0.tar.gz
  • Size: 4.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 colorama/0.4.4 importlib-metadata/4.6.4 keyring/23.5.0 pkginfo/1.8.2 readme-renderer/34.0 requests-toolbelt/0.9.1 requests/2.25.1 rfc3986/1.5.0 tqdm/4.57.0 urllib3/1.26.5 CPython/3.10.12

File hashes

Hashes for evallite-0.2.0.tar.gz

  • SHA256: fd50bc45b05c24c9d313568e2782efdddde5d8858616b99e05f5c2a9196f3556
  • MD5: 14252a88030e466f530a208f3f28445e
  • BLAKE2b-256: 60aed8550a42798c2874440a12c66296881e4a54d83d9af47691d57f965fe526

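As a quick integrity check, a downloaded file's SHA256 digest can be compared against the published value; a minimal sketch using only the Python standard library (the expected digest is the SHA256 listed above):

import hashlib

# Compare the local sdist against the published SHA256 digest
expected = "fd50bc45b05c24c9d313568e2782efdddde5d8858616b99e05f5c2a9196f3556"
with open("evallite-0.2.0.tar.gz", "rb") as f:
    digest = hashlib.sha256(f.read()).hexdigest()
print("OK" if digest == expected else "hash mismatch")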

File details

Details for the file evallite-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: evallite-0.2.0-py3-none-any.whl
  • Size: 4.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 colorama/0.4.4 importlib-metadata/4.6.4 keyring/23.5.0 pkginfo/1.8.2 readme-renderer/34.0 requests-toolbelt/0.9.1 requests/2.25.1 rfc3986/1.5.0 tqdm/4.57.0 urllib3/1.26.5 CPython/3.10.12

File hashes

Hashes for evallite-0.2.0-py3-none-any.whl

  • SHA256: 802bbdefce814e7e71897644b2aaba50fff63f37ecd0a9e20dc8f774fd086c2d
  • MD5: 57c112eb22cda7f42f3a5507624ac698
  • BLAKE2b-256: f9ec3a6cd2d7c5eda07042eb6c5b176f34eddc5cd6748aaf5dcc69ddb9cccb5c

