Comprehensive AI Model Evaluation Framework with support for multiple LLM providers
Eval AI Library
Based on firstlinesoftware/eval-ai-library. This is an independently maintained version with additional features and PyPI distribution.
Comprehensive AI model evaluation framework for RAG systems and AI agents. Supports 35+ evaluation metrics, 12 LLM providers, built-in test data generation from documents, and an interactive web dashboard for visualization and analysis. Implements advanced techniques including G-Eval probability-weighted scoring and Temperature-Controlled Verdict Aggregation via Generalized Power Mean.
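The two techniques named above can be illustrated with a short sketch. This is not the library's implementation: the function names `probability_weighted_score` and `generalized_power_mean` are hypothetical, and because the mapping from a temperature setting to the power-mean exponent is not described here, the sketch takes the exponent `p` directly. G-Eval style scoring is shown as the expected rating under the judge model's probability distribution over ratings, and verdict aggregation as a plain generalized power mean.

```python
import math
from typing import Mapping, Sequence


def probability_weighted_score(score_probs: Mapping[int, float]) -> float:
    """Expected rating: sum of rating * probability over candidate ratings.

    Hypothetical helper for illustration, not the library's API.
    """
    total = sum(score_probs.values())
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    # Renormalize in case only a subset of rating tokens was observed.
    return sum(score * prob for score, prob in score_probs.items()) / total


def generalized_power_mean(values: Sequence[float], p: float) -> float:
    """Power mean M_p of positive values.

    p = 1 is the arithmetic mean, p = -1 the harmonic mean, and p -> 0
    the geometric mean. Lower p weights low values more heavily (stricter).
    Hypothetical helper for illustration, not the library's API.
    """
    if not values:
        raise ValueError("values must be non-empty")
    if any(v <= 0 for v in values):
        raise ValueError("values must be positive")
    if abs(p) < 1e-12:
        # Limit case p -> 0: geometric mean, computed in log space.
        return math.exp(sum(math.log(v) for v in values) / len(values))
    return (sum(v ** p for v in values) / len(values)) ** (1.0 / p)
```

For example, `generalized_power_mean([1.0, 0.5, 0.25], p)` moves from the harmonic mean at `p = -1` through the geometric mean at `p = 0` to the arithmetic mean at `p = 1`, so a single exponent controls how strongly one weak verdict pulls the aggregate down.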
Installation
pip install eval-ai-library
Full version with document parsing and OCR support:
pip install "eval-ai-library[full]"
Lite version (core evaluation only):
pip install "eval-ai-library[lite]"
Quick Start
from eval_lib import EvalAI
evaluator = EvalAI(model="gpt-4o")
result = evaluator.evaluate(
    input="What is Python?",
    actual_output="Python is a programming language.",
    expected_output="Python is a high-level programming language.",
    metrics=["answer_relevancy", "faithfulness"],
)
print(result.score)
Documentation
Full documentation is available at library.eval-ai.com.
License
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
Citation
If you use this library in your research, please cite:
@software{eval_ai_library,
  author = {Meshkov, Aleksandr},
  title = {Eval AI Library: Comprehensive AI Model Evaluation Framework},
  year = {2025},
  url = {https://github.com/meshkovQA/Eval-ai-library.git}
}
References
This library implements techniques from:
@inproceedings{liu2023geval,
  title = {G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment},
  author = {Liu, Yang and Iter, Dan and Xu, Yichong and Wang, Shuohang and Xu, Ruochen and Zhu, Chenguang},
  booktitle = {Proceedings of EMNLP},
  year = {2023}
}
Support
- Issues: GitHub Issues
- Documentation: library.eval-ai.com
File details
Details for the file eval_ai_library-0.7.11.tar.gz.
File metadata
- Download URL: eval_ai_library-0.7.11.tar.gz
- Upload date:
- Size: 263.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0e1bb3372f73cd1defe264256d8ca243667e00fbd6ce4a0855322afefb983f2e |
| MD5 | ef4f713f51dc189f7000bcd7fdcad71c |
| BLAKE2b-256 | 6fc771b31ec60a1aa2d1507a9adfad40ae16b656cda9cba6913a9aba26cf7a41 |
File details
Details for the file eval_ai_library-0.7.11-py3-none-any.whl.
File metadata
- Download URL: eval_ai_library-0.7.11-py3-none-any.whl
- Upload date:
- Size: 277.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 72e3e6819652331f658748ad9a49c2b9bc9236405ebf877f0ac984bf9f41e81e |
| MD5 | b3c4e94d5135ec5ccfac116739ebb6a6 |
| BLAKE2b-256 | fad35d910727a66679d55840007a340f121b1600c3ffff95cd93d508673bc846 |
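A downloaded file can be checked against the published SHA256 digests before installing. The following is a minimal sketch using only the Python standard library; the helper names `sha256_of_file` and `matches_published_digest` are illustrative and are not part of this package.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Read fixed-size chunks so large archives are not loaded at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Assuming the wheel was saved to the current directory, `matches_published_digest(Path("eval_ai_library-0.7.11-py3-none-any.whl"), "<SHA256 from the table above>")` should return `True` for an intact download.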