Haystack node for checking the entailment between a statement and a list of Documents
Haystack Entailment Checker
Custom node for the Haystack NLP framework. Using a Natural Language Inference model, it checks whether a list of Documents/passages entails, contradicts, or is neutral with respect to a given statement.
Live Demo: Fact Checking 🎸 Rocks!
How it works
- The node takes a list of Documents (commonly returned by a Retriever) and a statement as input.
- Using a Natural Language Inference model, the text entailment between each text passage/Document (premise) and the statement (hypothesis) is computed. For every text passage, we get 3 scores (summing to 1): entailment, contradiction and neutral.
- The text entailment scores are aggregated using a weighted average. The weight is the relevance score of each passage returned by the Retriever, if available; it expresses the similarity between the passage and the statement. The resulting summary score indicates whether the passages, taken together, confirm, disprove, or are neutral toward the user statement.
- Empirical consideration: if the first N passages (N<K, where K is the number of retrieved passages) already show strong evidence of entailment or contradiction (partial aggregate score > threshold), it is better not to consider the remaining (K-N) less relevant Documents.
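The aggregation steps above can be sketched in plain Python. This is an illustrative sketch only, not the node's actual implementation: the function name `aggregate_entailment`, its parameters, and the choice to fall back to equal weights when no relevance scores are available are all assumptions made for the example.

```python
from typing import Dict, List, Optional

LABELS = ("entailment", "contradiction", "neutral")


def aggregate_entailment(
    passage_scores: List[Dict[str, float]],
    relevance: Optional[List[float]] = None,
    threshold: float = 0.5,
) -> Dict[str, float]:
    """Relevance-weighted running average of per-passage NLI scores.

    Hypothetical helper: passages are expected in decreasing order of
    relevance, and processing stops early once the partial aggregate
    shows strong entailment or contradiction.
    """
    if relevance is None:
        # Assumption: without Retriever scores, weigh all passages equally.
        relevance = [1.0] * len(passage_scores)

    totals = {label: 0.0 for label in LABELS}
    aggregate = dict(totals)
    weight_sum = 0.0

    for scores, weight in zip(passage_scores, relevance):
        for label in LABELS:
            totals[label] += weight * scores[label]
        weight_sum += weight
        if weight_sum == 0:
            continue  # nothing to normalize yet
        # Partial aggregate over the passages seen so far.
        aggregate = {label: totals[label] / weight_sum for label in LABELS}
        # Early stop: skip the remaining, less relevant passages.
        if max(aggregate["entailment"], aggregate["contradiction"]) > threshold:
            break

    return aggregate


scores = [
    {"entailment": 0.9, "contradiction": 0.05, "neutral": 0.05},
    {"entailment": 0.1, "contradiction": 0.1, "neutral": 0.8},
]
early = aggregate_entailment(scores, relevance=[0.8, 0.2], threshold=0.5)
full = aggregate_entailment(scores, relevance=[0.8, 0.2], threshold=0.95)
```

With a threshold of 0.5 the first passage alone is decisive, so the second one is never considered; with a threshold of 0.95 both passages contribute to the weighted average.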
Installation
pip install haystack-entailment-checker
Usage
Basic example
from haystack import Document
from haystack_entailment_checker import EntailmentChecker
ec = EntailmentChecker(
    model_name_or_path="microsoft/deberta-v2-xlarge-mnli",
    use_gpu=False,
    entailment_contradiction_threshold=0.5,
)
doc = Document("My cat is lazy")
print(ec.run("My cat is very active", [doc]))
# ({'documents': [...],
# 'aggregate_entailment_info': {'contradiction': 1.0, 'neutral': 0.0, 'entailment': 0.0}}, ...)
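To turn the `aggregate_entailment_info` dictionary shown above into a single verdict, one option is to pick the highest-scoring label. The helper below is a hypothetical sketch and is not part of the package; the name `verdict`, the `min_score` cutoff, and the `"inconclusive"` fallback are assumptions made for illustration.

```python
from typing import Dict


def verdict(aggregate_entailment_info: Dict[str, float], min_score: float = 0.5) -> str:
    """Return the top-scoring label, or "inconclusive" if it is too weak.

    Hypothetical helper: `min_score` is an illustrative cutoff, not a
    parameter of EntailmentChecker.
    """
    label, score = max(aggregate_entailment_info.items(), key=lambda item: item[1])
    return label if score >= min_score else "inconclusive"


# Scores copied from the example output above.
info = {"contradiction": 1.0, "neutral": 0.0, "entailment": 0.0}
print(verdict(info))  # prints "contradiction"
```

In the basic example, `ec.run(...)` returns a tuple, so the dictionary would be read from its first element before being passed to such a helper.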
Fact-checking pipeline (Retriever + EntailmentChecker)
from haystack import Document, Pipeline
from haystack.nodes import BM25Retriever
from haystack.document_stores import InMemoryDocumentStore
from haystack_entailment_checker import EntailmentChecker
# INDEXING
# the knowledge base can consist of many documents
docs = [...]
ds = InMemoryDocumentStore(use_bm25=True)
ds.write_documents(docs)
# QUERYING
retriever = BM25Retriever(document_store=ds)
ec = EntailmentChecker()
pipe = Pipeline()
pipe.add_node(component=retriever, name="Retriever", inputs=["Query"])
pipe.add_node(component=ec, name="EntailmentChecker", inputs=["Retriever"])
pipe.run(query="YOUR STATEMENT TO CHECK")
Acknowledgements 🙏
Special thanks go to @davidberenstein1957, who contributed to the original implementation of this node in the Fact Checking 🎸 Rocks! project.