FActScore is an automatic evaluation metric for factual precision in long-form text generation. It uses large language models and retrieval to break generations down into atomic facts and then measures their correctness against a knowledge source (such as Wikipedia).

FActScore

This is the official release accompanying our preprint, "FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation". FActScore is also available as a pip package.

Install

Make a new Python 3.7+ environment using virtualenv or conda.

pip install factscore
python -m spacy download en_core_web_sm

Download the data

python -m factscore.download_data

Or, download it manually from this Google Drive link. Make a cache directory .cache/factscore, and place unzipped demos and enwiki-20230401.db in that directory.
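If you take the manual route, a small check like the one below can confirm that the cache directory has the layout described above. This is only a sketch and not part of the factscore package: the helper name `missing_cache_entries` is hypothetical, and it assumes nothing beyond the two entry names mentioned above (`demos` and `enwiki-20230401.db`).

```python
from pathlib import Path

# Entry names taken from the instructions above; adjust if your download differs.
EXPECTED_ENTRIES = ("demos", "enwiki-20230401.db")


def missing_cache_entries(cache_dir=".cache/factscore"):
    """Create cache_dir if needed and return the expected entries not yet present.

    Hypothetical helper for illustration; it is not provided by factscore.
    """
    cache = Path(cache_dir)
    cache.mkdir(parents=True, exist_ok=True)
    return [name for name in EXPECTED_ENTRIES if not (cache / name).exists()]
```

Calling `missing_cache_entries()` with no arguments checks the default `.cache/factscore` directory and returns an empty list once both entries are in place.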

Running FActScore

python -m factscore.factscorer --data_path {data_path} --model_name {estimator_name} --cache_dir {cache_dir} --openai_key {openai_key}
  • data_path: can be something like data/src-light/bio_ChatGPT_v0.2.jsonl, which is in the format we have been using so far. TODO: simplify the format and allow it to take arbitrary topics and generations.
  • model_name: retrieval+ChatGPT or retrieval+ChatGPT+npm. Two more configs (retrieval+llama, retrieval+llama+npm) are coming soon!
  • cache_dir: .cache/factscore by default.
  • openai_key: path to a file containing the OpenAI API key; needed when ChatGPT is used.

For example,

python -m factscore.factscorer \
    --data_path original_generation/v0.2/answers_mpt-7b_bio_test_addtional.jsonl \
    --model_name "retrieval+ChatGPT" \
    --cache_dir ".cache/factscore" \
    --openai_key "api.key"
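When scoring several files, it can be convenient to assemble the same command from a script. The sketch below only builds the argument list from the four flags documented above. The helper `build_factscore_command` is a hypothetical name, not part of the package, and its default values simply mirror the example; actually launching the command is left as a comment.

```python
import sys


def build_factscore_command(data_path, model_name="retrieval+ChatGPT",
                            cache_dir=".cache/factscore", openai_key="api.key"):
    """Return the argv list for the command-line call shown above.

    Hypothetical helper for illustration; it uses only the documented flags.
    """
    return [
        sys.executable, "-m", "factscore.factscorer",
        "--data_path", data_path,
        "--model_name", model_name,
        "--cache_dir", cache_dir,
        "--openai_key", openai_key,
    ]


cmd = build_factscore_command(
    "original_generation/v0.2/answers_mpt-7b_bio_test_addtional.jsonl"
)
print(" ".join(cmd))
# To run it (requires factscore to be installed and a valid key file):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with values such as `retrieval+ChatGPT`.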

By default, it uses enwiki-20230401 and downloads the database from our Google Drive.

Instructions for using Instruct-LLAMA-7B or your own LM are coming soon!

Using a custom knowledge source

You need a .jsonl file in which each line is a JSON object (dictionary) containing title and text. text can be either a string or a list of strings (e.g., sections).
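As an illustration of that format, the sketch below writes and re-reads such a file, with one entry whose text is a single string and one whose text is a list of strings. The titles and passages are made up for the example, and `write_jsonl` and `read_jsonl` are hypothetical helpers, not part of factscore.

```python
import json

# Made-up entries: "text" may be a single string or a list of strings.
documents = [
    {"title": "Example topic A", "text": "A single passage about topic A."},
    {"title": "Example topic B", "text": ["First section about topic B.",
                                          "Second section about topic B."]},
]


def write_jsonl(records, path):
    """Write one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Read the file back and check that every line has the two required keys."""
    records = []
    with open(path, encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            record = json.loads(line)
            if not {"title", "text"} <= set(record):
                raise ValueError(f"line {line_number} lacks 'title' or 'text'")
            records.append(record)
    return records
```

Running `read_jsonl` over your own file before building the database is a cheap way to catch a malformed line early, since database creation can take hours for a large source.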

from factscore.factscorer import FactScorer

fs = FactScorer()

# This creates a database from your file.
# For English Wikipedia (18GB), it takes ~8 hours.
# Once the DB file is created, you can reuse it by specifying only `db_path`.
fs.register_knowledge_source(name_of_your_knowledge_source,
                             data_path=path_to_jsonl_file,
                             db_path=path_to_output_db_file)

# Now, when you compute a score, specify which knowledge source to use.
score = fs.get_score(topics, generations, knowledge_source=name_of_your_knowledge_source)

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

factscore-0.1.2.tar.gz (18.8 kB)

Built Distribution

factscore-0.1.2-py3-none-any.whl (21.1 kB)
