EasyEval


🔧 Installation

Installation for local development:

git clone https://github.com/zjunlp/EasyEval
cd EasyEval
pip install -e .

Installation using PyPI:

pip install easyeval -i https://pypi.org/simple

📌 Use EasyEval

FairEval

FairEval is a class that implements two simple yet effective strategies, Multiple Evidence Calibration (MEC) and Balanced Position Calibration (BPC), for calibrating the positional bias of LLMs used as evaluators. For details, refer to the paper "Large Language Models are not Fair Evaluators".
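To make the two strategies concrete, here is a minimal sketch of the idea as described in the paper, not EasyEval's actual implementation. It assumes MEC means sampling the judge several times and averaging, and BPC means scoring each pair in both orders and averaging. The names `calibrated_scores` and `score_pair` are hypothetical; `score_pair` stands in for an LLM call that returns one score for the answer shown first and one for the answer shown second.

```python
from statistics import mean
from typing import Callable, Tuple

# Hypothetical judge signature: (question, first_answer, second_answer)
# -> (score_for_first, score_for_second). In practice this would wrap an LLM call.
Scorer = Callable[[str, str, str], Tuple[float, float]]


def calibrated_scores(question: str, answer_a: str, answer_b: str,
                      score_pair: Scorer, k: int = 3,
                      bpc: bool = True) -> Tuple[float, float]:
    """Average judge scores over k samples (MEC) and both orderings (BPC)."""
    scores_a, scores_b = [], []
    for _ in range(k):
        # Original order: answer_a is shown first.
        first, second = score_pair(question, answer_a, answer_b)
        scores_a.append(first)
        scores_b.append(second)
        if bpc:
            # Swapped order: answer_b is shown first.
            first, second = score_pair(question, answer_b, answer_a)
            scores_b.append(first)
            scores_a.append(second)
    return mean(scores_a), mean(scores_b)
```

With a judge that always favors whichever answer appears first, scoring in both orders cancels that advantage, which is the intuition behind BPC.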

Example

Step 1: Provide the question file for evaluation, a JSON file with one object per line. Here is an example of the data:

{"question_id": 1, "text": "How can I improve my time management skills?"}
{"question_id": 2, "text": "What are the most effective ways to deal with stress?"}

Step 2: Provide the answer files for evaluation, in the same one-object-per-line JSON format. Each question_id must match the corresponding entry in the question file. Here is an example of the data:

{"question_id": 1, "text": "Here are some tips to improve your time management skills:\n\n1. Create a schedule: Make a to-do list for the day ..."}
{"question_id": 2, "text": "Here are some effective ways to deal with stress:\n\n1. Exercise regularly: Physical activity can help reduce stress and improve mood ..."}

Step 3: Run the evaluation

from EasyEval.eval import FairEval

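# Create a FairEval instance (named `evaluator` to avoid shadowing the built-in eval)
evaluator = FairEval(
    answer_file_list=["YOUR-ANSWER-FILE1", "YOUR-ANSWER-FILE2"],
    question_file="YOUR-QUESTION-FILE",
    output="YOUR-OUTPUT-FILE",
    api_key="YOUR-KEY",
    eval_model="gpt-4",
    bpc=1,  # presumably toggles Balanced Position Calibration
    k=3,    # presumably the number of MEC evidence samples
)

# Get the result from the LLM API service
evaluator.fair_eval()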

🎉 Contributors
