Comprehensive AI Model Evaluation Framework with support for multiple LLM providers

Eval AI Library

Based on firstlinesoftware/eval-ai-library. This is an independently maintained version with additional features and PyPI distribution.

Comprehensive AI model evaluation framework for RAG systems and AI agents. Supports 35+ evaluation metrics, 12 LLM providers, built-in test data generation from documents, and an interactive web dashboard for visualization and analysis. Implements advanced techniques including G-Eval probability-weighted scoring and Temperature-Controlled Verdict Aggregation via Generalized Power Mean.
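The two scoring techniques named above can be sketched in a few lines. This is an illustration of the underlying math only, not the library's implementation: the function names `probability_weighted_score` and `power_mean` are hypothetical, and how the library maps its temperature setting to the power-mean exponent `p` is not stated here, so `p` is passed in directly.

```python
import math
from typing import Mapping, Sequence


def probability_weighted_score(score_probs: Mapping[int, float]) -> float:
    """Expected rating in the G-Eval style: sum of score * P(score).

    `score_probs` maps each candidate rating to the probability the judge
    model assigned to it. Probabilities are renormalized in case they do
    not sum exactly to 1.
    """
    total = sum(score_probs.values())
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    return sum(score * prob for score, prob in score_probs.items()) / total


def power_mean(values: Sequence[float], p: float) -> float:
    """Generalized power mean M_p of non-negative values.

    p = 1 is the arithmetic mean; smaller p pulls the result toward the
    minimum (stricter), larger p toward the maximum (more lenient).
    """
    if not values:
        raise ValueError("values must be non-empty")
    if any(v < 0 for v in values):
        raise ValueError("values must be non-negative")
    has_zero = any(v == 0 for v in values)
    if p == 0:
        # The limit p -> 0 is the geometric mean.
        if has_zero:
            return 0.0
        return math.exp(sum(math.log(v) for v in values) / len(values))
    if p < 0 and has_zero:
        # A zero verdict dominates when the exponent is negative.
        return 0.0
    return (sum(v ** p for v in values) / len(values)) ** (1.0 / p)


if __name__ == "__main__":
    # Judge put 10% on rating 1, 20% on rating 2, 70% on rating 3.
    print(probability_weighted_score({1: 0.1, 2: 0.2, 3: 0.7}))

    # Three verdicts (fully, partly, not supported) under different exponents.
    verdicts = [1.0, 0.5, 0.0]
    for p in (-1, 1, 4):
        print(p, power_mean(verdicts, p))
```

Exposing the exponent as a single parameter makes the strictness of the aggregate adjustable without changing the individual verdicts, which is the general idea behind a temperature-style control.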

Installation

pip install eval-ai-library

Full version with document parsing and OCR support:

pip install "eval-ai-library[full]"

Lite version (core evaluation only):

pip install "eval-ai-library[lite]"

Quick Start

from eval_lib import EvalAI

# Create an evaluator that uses the given model
evaluator = EvalAI(model="gpt-4o")

# Evaluate one input/output pair against the selected metrics
result = evaluator.evaluate(
    input="What is Python?",
    actual_output="Python is a programming language.",
    expected_output="Python is a high-level programming language.",
    metrics=["answer_relevancy", "faithfulness"],
)

print(result.score)

Documentation

Full documentation is available at library.eval-ai.com.

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Citation

If you use this library in your research, please cite:

@software{eval_ai_library,
  author = {Meshkov, Aleksandr},
  title = {Eval AI Library: Comprehensive AI Model Evaluation Framework},
  year = {2025},
  url = {https://github.com/meshkovQA/Eval-ai-library.git}
}

References

This library implements techniques from:

@inproceedings{liu2023geval,
  title={G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment},
  author={Liu, Yang and Iter, Dan and Xu, Yichong and Wang, Shuohang and Xu, Ruochen and Zhu, Chenguang},
  booktitle={Proceedings of EMNLP},
  year={2023}
}
