Evaluate the Quality of Critique
The Critique of Critique
This is the official repository for The Critique of Critique.
Introduction
We introduce MetaCritique, a new judge that evaluates human-written or LLM-generated critiques by generating a critique of the critique. It reports three scores:
Meta-P: the precision score of MetaCritique, which evaluates the factuality of the hypothesis critique.
Meta-R: the recall score of MetaCritique, which evaluates the comprehensiveness of the hypothesis critique.
Meta-F1: the overall rating, computed as the harmonic mean of the precision and recall scores.
Leaderboard
We release the benchmarking results of multiple critique models.
Critique Model | Meta-Precision | Meta-Recall | Meta-F1 score |
---|---|---|---|
AUTO-J | 76.43 | 70.65 | 71.14 |
GPT 3.5 | 80.79 | 64.27 | 68.72 |
UltraCM | 73.64 | 66.77 | 67.79 |
Human Critique from Shepherd | 83.19 | 60.65 | 64.02 |
SelFee | 69.56 | 51.05 | 54.22 |
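Meta-F1 is defined above as the harmonic mean of precision and recall. The sketch below shows that formula in plain Python. The `harmonic_mean` helper and the per-sample scores are hypothetical and are not part of the `meta_critique` package. The sketch also illustrates one possible reason the Meta-F1 column differs from the harmonic mean of the Meta-Precision and Meta-Recall columns (for AUTO-J, the column-wise harmonic mean is about 73.43, while the reported Meta-F1 is 71.14): if F1 is computed per sample and then averaged, the two quantities generally disagree. That the leaderboard was produced this way is an assumption, not something stated here.

```python
def harmonic_mean(precision: float, recall: float) -> float:
    """F1 as the harmonic mean of precision and recall (0.0 if both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical per-sample (precision, recall) pairs, for illustration only.
samples = [(1.0, 0.5), (0.5, 1.0), (0.9, 0.2)]

mean_p = sum(p for p, _ in samples) / len(samples)
mean_r = sum(r for _, r in samples) / len(samples)

# F1 computed once from the averaged precision and recall.
f1_of_means = harmonic_mean(mean_p, mean_r)

# F1 computed per sample, then averaged (assumed aggregation, see lead-in).
mean_of_f1 = sum(harmonic_mean(p, r) for p, r in samples) / len(samples)

# Column-wise harmonic mean for the AUTO-J leaderboard row.
auto_j_from_columns = harmonic_mean(76.43, 70.65)

print(f"F1 of means: {f1_of_means:.4f}")
print(f"Mean of per-sample F1: {mean_of_f1:.4f}")
print(f"AUTO-J harmonic mean of columns: {auto_j_from_columns:.2f}")
```

With these toy numbers the per-sample average is lower than the F1 of the averaged scores, which is the same direction as the gap seen in the leaderboard rows.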
Quick Start
Installation
pip install meta-critique
Usage
from meta_critique import MetaCritique
api_key = ...  # your OpenAI API key
inputs = [
    {"question": "<question>", "response": "<response1>", "hypothesis_critique": "<hypothesis_critique>"},
    {"question": "<question>", "response": "<response2>", "hypothesis_critique": "<hypothesis_critique>"},
    ...
]
meta_critique_instance = MetaCritique(
    model_type="gpt-4",
    batch_size=5,
    api_key=api_key,
    api_base=None,
    seed=None,
    cache_dir="tmp_cache",
)
precision_score, recall_score, f1_score = meta_critique_instance.score(inputs)
where each input item has the following fields:
- question: The user query for the model to generate the response.
- response: The response generated by the model.
- hypothesis_critique: The critique written by either a human or an LLM.
- reference_answer: (Optional) The reference answer.
- reference_critique: (Optional) The reference critique, given in one of two forms:
  - str: a critique text
  - dict: {"critique": <reference_critique>, "aius": <optional_aius_from_reference_critique>}
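Because each call to `score` goes through a paid API, it can help to check the input items before sending them. The sketch below is a hypothetical pre-flight check based only on the field descriptions above. `validate_inputs` and `REQUIRED_KEYS` are illustrative names, not part of the `meta_critique` package, and the package may apply its own, different validation.

```python
# Field names taken from the list above; the helper itself is hypothetical.
REQUIRED_KEYS = ("question", "response", "hypothesis_critique")


def validate_inputs(inputs):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for i, item in enumerate(inputs):
        # Required fields must be present and be non-empty strings.
        for key in REQUIRED_KEYS:
            value = item.get(key)
            if not isinstance(value, str) or not value:
                problems.append(f"item {i}: missing or empty '{key}'")
        # reference_critique is optional: either a str or a dict
        # that contains at least a "critique" key.
        ref = item.get("reference_critique")
        if ref is not None and not isinstance(ref, str):
            if not (isinstance(ref, dict) and "critique" in ref):
                problems.append(
                    f"item {i}: 'reference_critique' must be a str "
                    "or a dict with a 'critique' key"
                )
    return problems


good = {"question": "q", "response": "r", "hypothesis_critique": "c",
        "reference_critique": {"critique": "ref", "aius": ["a1"]}}
bad = {"question": "q", "response": "", "reference_critique": 42}

print(validate_inputs([good]))
print(validate_inputs([bad]))
```

Running the check first keeps malformed items from consuming API calls; only lists that return no problems would then be passed to `meta_critique_instance.score(inputs)`.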
You can find a test sample in eval_examples/test_samples.json.
Citation
If you find our work useful or use meta-critique, please cite our paper:
@article{sun2024metacritique,
  title={The Critique of Critique},
  author={Shichao Sun and Junlong Li and Weizhe Yuan and Ruifeng Yuan and Wenjie Li and Pengfei Liu},
  journal={arXiv preprint arXiv:2401.04518},
  year={2024}
}