An Open-source Factuality Evaluation Demo for LLMs
Overview
OpenFactCheck is an open-source repository designed to facilitate the evaluation and enhancement of factuality in responses generated by large language models (LLMs). This project aims to integrate various fact-checking tools into a unified framework and provide comprehensive evaluation pipelines.
Installation
You can install the package from PyPI using pip:
pip install openfactcheck
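To confirm that the install succeeded, one option is to ask Python's packaging metadata for the installed version. The following is a minimal sketch using only the standard library; the helper name installed_version is hypothetical and not part of OpenFactCheck.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "openfactcheck" is the distribution name used in the pip command above.
    found = installed_version("openfactcheck")
    print(found if found else "openfactcheck is not installed")
```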
Usage
First, create an OpenFactCheckConfig object, then pass it to OpenFactCheck.
from openfactcheck import OpenFactCheck, OpenFactCheckConfig
# Initialize the OpenFactCheck object
config = OpenFactCheckConfig()
ofc = OpenFactCheck(config)
Response Evaluation
You can evaluate a response using the ResponseEvaluator class.
# Evaluate a response (the string below is a placeholder for your own text)
response = "<LLM response text>"
result = ofc.ResponseEvaluator.evaluate(response)
LLM Evaluation
We provide FactQA, a dataset of 6480 questions for evaluating LLMs. Once you have the responses from the LLM, you can evaluate them using the LLMEvaluator class.
# Evaluate an LLM (both values are placeholders: the model's name and the path to its responses)
result = ofc.LLMEvaluator.evaluate(model_name="<model_name>",
                                   input_path="<path/to/responses>")
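Because the evaluator reads responses from a path, it can help to check that the file exists before starting an evaluation. The sketch below wraps any evaluate callable that takes the model_name and input_path arguments shown above; the helper name evaluate_from_file is hypothetical, and the assumption that input_path points to a single file (rather than a directory) is ours, not something this README states.

```python
from pathlib import Path
from typing import Any, Callable


def evaluate_from_file(evaluate: Callable[..., Any], model_name: str, input_path: str) -> Any:
    """Check that the responses file exists, then pass it to the evaluator."""
    path = Path(input_path)
    if not path.is_file():
        # Assumption: input_path refers to one file holding the model's responses.
        raise FileNotFoundError(f"No responses file at {path}")
    # Keyword names mirror the evaluate(model_name, input_path) call shown above.
    return evaluate(model_name=model_name, input_path=str(path))
```

With the ofc object created earlier, a call would look like evaluate_from_file(ofc.LLMEvaluator.evaluate, "<model_name>", "<path/to/responses>"), where both strings are placeholders.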
Checker Evaluation
We provide FactBench, a dataset of 4507 claims for evaluating fact-checkers. Once you have the responses from the fact-checker, you can evaluate them using the CheckerEvaluator class.
# Evaluate a fact-checker (both values are placeholders: the checker's name and the path to its responses)
result = ofc.CheckerEvaluator.evaluate(checker_name="<checker_name>",
                                       input_path="<path/to/responses>")
Cite
If you use OpenFactCheck in your research, please cite the following:
@article{wang2024openfactcheck,
title = {OpenFactCheck: A Unified Framework for Factuality Evaluation of LLMs},
author = {Wang, Yuxia and Wang, Minghan and Iqbal, Hasan and Georgiev, Georgi and Geng, Jiahui and Nakov, Preslav},
journal = {arXiv preprint arXiv:2405.05583},
year = {2024}
}
@article{iqbal2024openfactcheck,
title = {OpenFactCheck: A Unified Framework for Factuality Evaluation of LLMs},
author = {Iqbal, Hasan and Wang, Yuxia and Wang, Minghan and Georgiev, Georgi and Geng, Jiahui and Gurevych, Iryna and Nakov, Preslav},
journal = {arXiv preprint arXiv:2408.11832},
year = {2024}
}
@software{hasan_iqbal_2024_13358665,
author = {Hasan Iqbal},
title = {hasaniqbal777/OpenFactCheck: v0.3.0},
month = {aug},
year = {2024},
publisher = {Zenodo},
version = {v0.3.0},
doi = {10.5281/zenodo.13358665},
url = {https://doi.org/10.5281/zenodo.13358665}
}