FactScore is an automatic evaluation metric for factual precision in long-form text generation. It uses large language models and retrieval to break down generations into atomic facts and then measure the correctness with respect to a knowledge source (like Wikipedia).
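As a rough illustration of what the metric reports: assuming each atomic fact has already been judged as supported or not supported by the knowledge source, the score for one generation can be read as the fraction of supported facts. The function name and the judgments below are hypothetical and are not part of the factscore package, which performs the fact extraction and verification itself.

```python
from typing import Sequence


def factual_precision(is_supported: Sequence[bool]) -> float:
    """Fraction of atomic facts judged as supported by the knowledge source."""
    if len(is_supported) == 0:
        raise ValueError("need at least one atomic fact")
    return sum(is_supported) / len(is_supported)


# Hypothetical judgments for the atomic facts of a single generation.
judgments = [True, True, False, True]
score = factual_precision(judgments)
print(score)  # 0.75
```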

Project description

FActScore

This is the official release accompanying our preprint, "FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation". FActScore is also available as a pip package.

Install

python3.7 -m virtualenv fs-venv
source fs-venv/bin/activate
pip install factscore
python -m spacy download en_core_web_lg

Download the data

python -m factscore.download_data

Or, download it manually from this Google Drive link. Make a cache directory .cache/factscore, and place unzipped demos and enwiki-20230401.db in that directory.
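For the manual route, a small check like the following may help confirm that the cache directory is laid out as described. It is only a sketch: the helper name is hypothetical and not part of the factscore package, and it assumes the two entries carry exactly the names given above.

```python
import os
import tempfile

# Entries the text above says to place in the cache directory.
EXPECTED_ENTRIES = ("demos", "enwiki-20230401.db")


def missing_cache_entries(cache_dir):
    """Create cache_dir if needed and return the expected entries not yet present."""
    os.makedirs(cache_dir, exist_ok=True)
    return [
        name
        for name in EXPECTED_ENTRIES
        if not os.path.exists(os.path.join(cache_dir, name))
    ]


# Demonstrated on a throwaway directory; in practice pass ".cache/factscore".
demo_root = tempfile.mkdtemp()
demo_cache = os.path.join(demo_root, ".cache", "factscore")
print(missing_cache_entries(demo_cache))  # a fresh directory is missing both entries
```

An empty list means both the demos and the database file are in place.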

Running the script with oracle atomic facts

python -m factscore.factscorer --data_path {data_path} --model_name {estimator_name} --cache_dir {cache_dir} --openai_key {openai_key}

data_path: can be something like data/src-light/bio_ChatGPT_v0.2.jsonl, which is in the format we have been using so far. TODO: simplify the format and allow it to take any topics/generations.
model_name: one of retrieval+llama, retrieval+llama+npm, retrieval+ChatGPT, retrieval+ChatGPT+npm.
cache_dir: .cache/factscore by default.
openai_key: file containing the API key; only needed when ChatGPT is being used.

For example,

python -m factscore.factscorer \
    --data_path original_generation/v0.2/answers_mpt-7b_bio_test_addtional.jsonl \
    --model_name "retrieval+ChatGPT" \
    --cache_dir ".cache/factscore" \
    --openai_key "api.key"
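When the command is launched from a Python script, the arguments can be assembled and checked before anything is run. The sketch below uses only the flags and estimator names listed above; the helper itself is hypothetical, and the rule that ChatGPT estimators require a key file is inferred from the openai_key description rather than taken from the package.

```python
from typing import List, Optional

# Estimator names listed in the argument description above.
KNOWN_MODEL_NAMES = {
    "retrieval+llama",
    "retrieval+llama+npm",
    "retrieval+ChatGPT",
    "retrieval+ChatGPT+npm",
}


def build_factscore_command(
    data_path: str,
    model_name: str,
    cache_dir: str = ".cache/factscore",
    openai_key: Optional[str] = None,
) -> List[str]:
    """Return the argument list for `python -m factscore.factscorer`."""
    if model_name not in KNOWN_MODEL_NAMES:
        raise ValueError(f"unknown model_name: {model_name!r}")
    # Assumption: the key file is mandatory whenever ChatGPT is the estimator.
    if "ChatGPT" in model_name and openai_key is None:
        raise ValueError("ChatGPT estimators need a file containing the API key")
    cmd = [
        "python", "-m", "factscore.factscorer",
        "--data_path", data_path,
        "--model_name", model_name,
        "--cache_dir", cache_dir,
    ]
    if openai_key is not None:
        cmd += ["--openai_key", openai_key]
    return cmd


cmd = build_factscore_command(
    "original_generation/v0.2/answers_mpt-7b_bio_test_addtional.jsonl",
    "retrieval+ChatGPT",
    openai_key="api.key",
)
print(" ".join(cmd))
# To execute it (requires the package and data to be installed):
#   import subprocess; subprocess.run(cmd, check=True)
```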

It uses enwiki-20230401 by default and will download the database from our Google Drive. It also uses Inst-LLAMA, which is downloaded from the Google Drive as well. TODO: release only the diff from LLAMA 7B, and allow users to specify their own LM path if they want to use a different LM.

To use a custom knowledge source

You need a .jsonl file where each line is a dictionary containing title and text. text can either be a string or a list of strings (e.g., sections).
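The file format described above could be produced and sanity-checked with a sketch like the following. The records, file name, and helper functions are made up for illustration, and the check assumes that title is always a string.

```python
import json
import os
import tempfile

# Hypothetical records: `text` may be one string or a list of strings (e.g., sections).
records = [
    {"title": "Example topic A", "text": "A single string holding the whole article."},
    {"title": "Example topic B", "text": ["First section.", "Second section."]},
]


def write_jsonl(path, rows):
    """Write one JSON dictionary per line."""
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row, ensure_ascii=False) + "\n")


def count_valid_records(path):
    """Check that every line has a title and a string or list-of-strings text."""
    count = 0
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            row = json.loads(line)
            text = row.get("text")
            text_ok = isinstance(text, str) or (
                isinstance(text, list) and all(isinstance(s, str) for s in text)
            )
            if not isinstance(row.get("title"), str) or not text_ok:
                raise ValueError(f"line {line_no}: unexpected title/text format")
            count += 1
    return count


# Written to a temporary directory here; use your own path in practice.
path = os.path.join(tempfile.mkdtemp(), "my_knowledge_source.jsonl")
write_jsonl(path, records)
n_records = count_valid_records(path)
print(n_records)  # 2
```

The resulting path is what would be passed as data_path when registering the knowledge source.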

from factscore.factscorer import FactScorer

fs = FactScorer()

# this will create a database using your file
# for English Wikipedia (18GB), it takes ~8 hours
# once DB file is created, you can reuse it by only specifying `db_path`
fs.register_knowledge_source(name_of_your_knowledge_source,
                             data_path=path_to_jsonl_file,
                             db_path=path_to_output_db_file)

# now, when you compute a score, specify knowledge source to use
score = fs.get_score(topics, generations, knowledge_source=name_of_your_knowledge_source)

Project details

Release history

0.2.0: Oct 14, 2023
0.1.7: Jun 27, 2023
0.1.6: Jun 27, 2023
0.1.5: Jun 6, 2023
0.1.4: May 23, 2023
0.1.3: May 23, 2023
0.1.2: May 23, 2023
0.1.1: May 23, 2023
0.1.0: May 23, 2023 (this version)

Download files

Source Distribution

factscore-0.1.0.tar.gz (18.9 kB), uploaded May 23, 2023

Built Distribution

factscore-0.1.0-py3-none-any.whl (21.2 kB), uploaded May 23, 2023, Python 3

Hashes for factscore-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`2a5e461a8d5fdd96ab867c212242fd0de99ac2d2d1f782b34a058a0cdb4f772f`
MD5	`3e0a1b2dde777ce684f74ae50fe2b823`
BLAKE2b-256	`9abd9e0450dd157441f26f3d82b3f7b0b7da7e740541eae17b96e6e62d9e899b`

Hashes for factscore-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`d6601f3bc98d5b0ad86f5551861e8f94589934c3ecdc0d0af4bfa0acbf658cc4`
MD5	`02af773a44850d17c55d7b69f638f7b0`
BLAKE2b-256	`6b0dab90bca940c8343bb4c857856b6aa4663c9b70203cf6702dac6ffe59bf9e`