A zero-shot relation extractor
Project description
Fact checking
This generative model, trained on FEVER, aims to predict whether a claim is consistent with the provided evidence.
Installation and simple usage
One quick way to install it is to type
pip install fact-checking
and then use the following code:
from transformers import (
    GPT2LMHeadModel,
    GPT2Tokenizer,
)
from fact_checking import FactChecker
_evidence = """
Justine Tanya Bateman (born February 19, 1966) is an American writer, producer, and actress . She is best known for her regular role as Mallory Keaton on the sitcom Family Ties (1982 -- 1989). Until recently, Bateman ran a production and consulting company, SECTION 5 . In the fall of 2012, she started studying computer science at UCLA.
"""
_claim = 'Justine Bateman is a poet.'
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
fact_checking_model = GPT2LMHeadModel.from_pretrained('fractalego/fact-checking')
fact_checker = FactChecker(fact_checking_model, tokenizer)
is_claim_true = fact_checker.validate(_evidence, _claim)
print(is_claim_true)
which gives the output
False
Probabilistic output with replicas
The output can include a probabilistic component, obtained by repeating the output generation a number of times. The system generates an ensemble of answers and groups them by Yes or No.
For example, one can ask
from transformers import (
    GPT2LMHeadModel,
    GPT2Tokenizer,
)
from fact_checking import FactChecker
_evidence = """
Jane writes code for Huggingface.
"""
_claim = 'Jane is an engineer.'
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
fact_checking_model = GPT2LMHeadModel.from_pretrained('fractalego/fact-checking')
fact_checker = FactChecker(fact_checking_model, tokenizer)
is_claim_true = fact_checker.validate_with_replicas(_evidence, _claim)
print(is_claim_true)
with output
{'Y': 0.95, 'N': 0.05}
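The grouping step can be illustrated with a minimal sketch. It is not the package's internal implementation: `aggregate_replicas` and `fake_answer` are hypothetical names, and the random generator merely stands in for one sampled model answer. The idea is to draw several answers and report the fraction of each label.

```python
import random
from collections import Counter
from typing import Callable, Dict


def aggregate_replicas(
    generate_answer: Callable[[], str], num_replicas: int = 20
) -> Dict[str, float]:
    """Call a stochastic 'Y'/'N' generator repeatedly and return label frequencies."""
    if num_replicas <= 0:
        raise ValueError("num_replicas must be positive")
    counts = Counter(generate_answer() for _ in range(num_replicas))
    return {label: counts[label] / num_replicas for label in ("Y", "N")}


# Hypothetical stand-in for a single sampled answer from the model.
rng = random.Random(0)


def fake_answer() -> str:
    return "Y" if rng.random() < 0.9 else "N"


scores = aggregate_replicas(fake_answer, num_replicas=20)
print(scores)
```

More replicas give a smoother estimate at the cost of more generation calls.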
Score on FEVER
The score on the FEVER dev dataset is as follows:
| Precision | Recall | F1 |
|---|---|---|
| 0.94 | 0.98 | 0.96 |
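As a quick consistency check, F1 is the harmonic mean of precision and recall, so the third column follows from the first two. The helper name `f1_score` below is chosen for illustration only and is not part of the package.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


f1 = f1_score(0.94, 0.98)
print(round(f1, 2))  # 0.96
```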
These results should be taken with many grains of salt. This is still a work in progress, and leakage from the underlying GPT2 model might be unnaturally raising the scores.