An Open-source Factuality Evaluation Demo for LLMs
Overview
OpenFactCheck is an open-source repository designed to facilitate the evaluation and enhancement of factuality in responses generated by large language models (LLMs). This project aims to integrate various fact-checking tools into a unified framework and provide comprehensive evaluation pipelines.
Installation
You can install the package from PyPI using pip:
pip install openfactcheck
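To confirm that the install succeeded, one option is to ask the standard library for the installed version. This is a minimal sketch; the helper name `installed_version` is an illustrative assumption and not part of OpenFactCheck.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints a version string if openfactcheck is installed, otherwise None.
    print(installed_version("openfactcheck"))
```

A `None` result usually means pip installed the package into a different Python environment than the one running the script.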
Usage
First, you need to initialize the OpenFactCheckConfig object and then the OpenFactCheck object.
from openfactcheck import OpenFactCheck, OpenFactCheckConfig
# Initialize the OpenFactCheck object
config = OpenFactCheckConfig()
ofc = OpenFactCheck(config)
Response Evaluation
You can evaluate a response using the ResponseEvaluator class.
# Evaluate a response (any string produced by an LLM)
response = "<text of the LLM response>"  # placeholder
result = ofc.ResponseEvaluator.evaluate(response)
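When several responses need checking, the single-response call can be wrapped in a small loop. The sketch below takes the evaluator as a plain callable, so it runs without OpenFactCheck installed; the helper `evaluate_many` and the stand-in evaluator are hypothetical, and the structure of the value returned by `ResponseEvaluator.evaluate` is deliberately not assumed.

```python
from typing import Any, Callable, Iterable, List, Tuple


def evaluate_many(
    evaluate: Callable[[str], Any],
    responses: Iterable[str],
) -> List[Tuple[str, Any]]:
    """Apply `evaluate` to each response and pair every input with its result."""
    results = []
    for response in responses:
        # The result is treated as opaque; its structure is not assumed here.
        results.append((response, evaluate(response)))
    return results


if __name__ == "__main__":
    # Stand-in evaluator so the sketch runs on its own.
    # With the library, pass `ofc.ResponseEvaluator.evaluate` instead.
    def fake_evaluate(response: str) -> bool:
        return "Paris" in response

    pairs = evaluate_many(fake_evaluate, [
        "The capital of France is Paris.",
        "The capital of France is Lyon.",
    ])
    for response, result in pairs:
        print(result, response)
```

Passing the evaluator in as an argument keeps the loop easy to test with a stub before spending time on real fact-checking calls.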
LLM Evaluation
We provide FactQA, a dataset of 6480 questions for evaluating LLMs. Once you have the responses from the LLM, you can evaluate them using the LLMEvaluator class.
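# Evaluate an LLM
model_name = "<model_name>"  # placeholder: name of the LLM being evaluated
input_path = "<path_to_responses>"  # placeholder: path to the file with the LLM's responses
result = ofc.LLMEvaluator.evaluate(model_name=model_name,
                                   input_path=input_path)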
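The `input_path` argument points at a file holding the model's responses. The file layout is not described here, so the sketch below assumes a CSV with `index` and `response` columns purely for illustration. The column names and the helper `write_responses` are assumptions to check against the project documentation before use.

```python
import csv
import tempfile
from pathlib import Path
from typing import Sequence


def write_responses(responses: Sequence[str], path: Path) -> Path:
    """Write one response per row to a CSV file (assumed layout: index, response)."""
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["index", "response"])  # assumed header, not confirmed
        for index, response in enumerate(responses):
            writer.writerow([index, response])
    return path


if __name__ == "__main__":
    # Write to a temporary directory so the sketch leaves no files behind.
    with tempfile.TemporaryDirectory() as tmp:
        out = write_responses(
            ["Answer to question 0", "Answer to question 1"],
            Path(tmp) / "responses.csv",
        )
        print(out.read_text(encoding="utf-8"))
```

The `csv` module handles quoting, so responses containing commas or line breaks still occupy a single row.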
Checker Evaluation
We provide FactBench, a dataset of 4507 claims for evaluating fact-checkers. Once you have the responses from the fact-checker, you can evaluate them using the CheckerEvaluator class.
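# Evaluate a fact-checker
checker_name = "<checker_name>"  # placeholder: name of the fact-checker being evaluated
input_path = "<path_to_responses>"  # placeholder: path to the file with the fact-checker's responses
result = ofc.CheckerEvaluator.evaluate(checker_name=checker_name,
                                       input_path=input_path)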
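For intuition about what scoring a fact-checker involves, the sketch below compares a checker's true/false verdicts against gold labels and reports standard classification metrics. It is a generic illustration under stated assumptions and is not how `CheckerEvaluator` computes its results; the function name `binary_metrics` and the boolean label encoding are hypothetical.

```python
from typing import Dict, Sequence


def binary_metrics(gold: Sequence[bool], predicted: Sequence[bool]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1, treating True as the positive class."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g and p)        # true positives
    fp = sum(1 for g, p in pairs if not g and p)    # false positives
    fn = sum(1 for g, p in pairs if g and not p)    # false negatives
    correct = sum(1 for g, p in pairs if g == p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = correct / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    gold = [True, True, False, False]       # gold verdicts for four claims
    predicted = [True, False, False, True]  # verdicts from a hypothetical checker
    print(binary_metrics(gold, predicted))
```

Reporting precision and recall alongside accuracy matters when true and false claims are unevenly represented, since accuracy alone can hide a checker that rarely flags false claims.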
Cite
If you use OpenFactCheck in your research, please cite the following:
@article{wang2024openfactcheck,
title = {OpenFactCheck: A Unified Framework for Factuality Evaluation of LLMs},
author = {Wang, Yuxia and Wang, Minghan and Iqbal, Hasan and Georgiev, Georgi and Geng, Jiahui and Nakov, Preslav},
journal = {arXiv preprint arXiv:2405.05583},
year = {2024}
}
@article{iqbal2024openfactcheck,
title = {OpenFactCheck: A Unified Framework for Factuality Evaluation of LLMs},
author = {Iqbal, Hasan and Wang, Yuxia and Wang, Minghan and Georgiev, Georgi and Geng, Jiahui and Gurevych, Iryna and Nakov, Preslav},
journal = {arXiv preprint arXiv:2408.11832},
year = {2024}
}
@software{hasan_iqbal_2024_13358665,
author = {Hasan Iqbal},
title = {hasaniqbal777/OpenFactCheck: v0.3.0},
month = {aug},
year = {2024},
publisher = {Zenodo},
version = {v0.3.0},
doi = {10.5281/zenodo.13358665},
url = {https://doi.org/10.5281/zenodo.13358665}
}