An Open-source Factuality Evaluation Demo for LLMs
Project description
An Open-source Factuality Evaluation Demo for LLMs
Overview • Installation • Usage • HuggingFace Demo • Documentation
Overview
OpenFactCheck is an open-source repository designed to facilitate the evaluation and enhancement of factuality in responses generated by large language models (LLMs). This project aims to integrate various fact-checking tools into a unified framework and provide comprehensive evaluation pipelines.
Installation
You can install the package from PyPI using pip:
pip install openfactcheck
Usage
First, you need to initialize the OpenFactCheckConfig object and then the OpenFactCheck object.
from openfactcheck import OpenFactCheck, OpenFactCheckConfig
# Initialize the OpenFactCheck object
config = OpenFactCheckConfig()
ofc = OpenFactCheck(config)
Response Evaluation
You can evaluate a response using the ResponseEvaluator
class.
# Evaluate a response
result = ofc.ResponseEvaluator.evaluate(response: str)
LLM Evaluation
We provide FactQA, a dataset of 6480 questions for evaluating LLMs. Onc you have the responses from the LLM, you can evaluate them using the LLMEvaluator
class.
# Evaluate an LLM
result = ofc.LLMEvaluator.evaluate(model_name: str,
input_path: str)
Checker Evaluation
We provide FactBench, a dataset of 4507 claims for evaluating fact-checkers. Once you have the responses from the fact-checker, you can evaluate them using the CheckerEvaluator
class.
# Evaluate a fact-checker
result = ofc.CheckerEvaluator.evaluate(checker_name: str,
input_path: str)
Cite
If you use OpenFactCheck in your research, please cite the following:
@article{wang2024openfactcheck,
title = {OpenFactCheck: A Unified Framework for Factuality Evaluation of LLMs},
author = {Wang, Yuxia and Wang, Minghan and Iqbal, Hasan and Georgiev, Georgi and Geng, Jiahui and Nakov, Preslav},
journal = {arXiv preprint arXiv:2405.05583},
year = {2024}
}
@article{iqbal2024openfactcheck,
title = {OpenFactCheck: A Unified Framework for Factuality Evaluation of LLMs},
author = {Iqbal, Hasan and Wang, Yuxia and Wang, Minghan and Georgiev, Georgi and Geng, Jiahui and Gurevych, Iryna and Nakov, Preslav},
journal = {arXiv preprint arXiv:2408.11832},
year = {2024}
}
@software{hasan_iqbal_2024_13358665,
author = {Hasan Iqbal},
title = {hasaniqbal777/OpenFactCheck: v0.3.0},
month = {aug},
year = {2024},
publisher = {Zenodo},
version = {v0.3.0},
doi = {10.5281/zenodo.13358665},
url = {https://doi.org/10.5281/zenodo.13358665}
}
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Hashes for openfactcheck-0.3.8-py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | 027a5c271e34b1e7908ca624e876087719c6b1d8a286377d14e9fc0c2cd9093f |
|
MD5 | 5efee6ad3c967316751f8471e909cace |
|
BLAKE2b-256 | d8e3c0cff01e7664f91742acd4106e935b743e3ba504ea98b441c231fb0ba551 |