A Python NLP Commonsense Knowledge Inference Toolkit

kogito

System Description available here: https://arxiv.org/abs/2211.08451

Installation

Installation with pip

kogito can be installed using pip.

pip install kogito

kogito requires Python 3.8 or later.

Setup

Inference

kogito uses spacy under the hood for text processing, so a spacy language package has to be installed before running the inference module.

python -m spacy download en_core_web_sm

By default, the CommonsenseInference module uses en_core_web_sm to initialize the spacy pipeline, but a different language pipeline can be specified as well.
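Since spacy pipelines are installed as regular Python packages, one way to confirm that a pipeline is available before initializing the inference module is to look it up with the standard library. The sketch below is only an illustration: the helper names `spacy_pipeline_installed` and `pick_pipeline` are hypothetical and not part of kogito or spacy.

```python
import importlib.util


def spacy_pipeline_installed(name: str) -> bool:
    """Return True if a package called `name` can be found (hypothetical helper)."""
    # spacy pipelines such as en_core_web_sm are installed as importable packages,
    # so find_spec locates them without loading the model itself.
    return importlib.util.find_spec(name) is not None


def pick_pipeline(preferred: str, fallback: str = "en_core_web_sm") -> str:
    """Return `preferred` if it is installed, otherwise `fallback` (hypothetical helper)."""
    return preferred if spacy_pipeline_installed(preferred) else fallback
```

The returned name could then be passed as the `language` argument of CommonsenseInference, as in the Quickstart.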

Evaluation

If you would also like to evaluate knowledge models using the METEOR score, you need to download the following nltk resources:

import nltk

nltk.download("punkt")
nltk.download("wordnet")
nltk.download("omw-1.4")

Quickstart

kogito provides an easy interface to knowledge inference or commonsense reasoning models such as COMET for generating inferences from a text input. The sample below loads a custom commonsense reasoning model, initializes an inference module, and generates a knowledge graph from text on the fly.

from kogito.models.bart.comet import COMETBART
from kogito.inference import CommonsenseInference

# Load pre-trained model from HuggingFace
model = COMETBART.from_pretrained("mismayil/comet-bart-ai2")

# Initialize inference module with a spacy language pipeline
csi = CommonsenseInference(language="en_core_web_sm")

# Run inference
text = "PersonX becomes a great basketball player"
kgraph = csi.infer(text, model)

# Save output knowledge graph to a JSON Lines file (one JSON object per line)
kgraph.to_jsonl("kgraph.json")

Here is an excerpt from the result of the above code sample:

{"head": "PersonX becomes a great basketball player", "relation": "Causes", "tails": [" PersonX practices every day.", " PersonX plays basketball every day", " PersonX practices every day"]}
{"head": "basketball", "relation": "ObjectUse", "tails": [" play with friends", " play basketball with", " play basketball"]}
{"head": "player", "relation": "CapableOf", "tails": [" play game", " win game", " play football"]}
{"head": "great basketball player", "relation": "HasProperty", "tails": [" good at basketball", " good at sports", " very good"]}
{"head": "become player", "relation": "isAfter", "tails": [" play game", " become coach", " play with"]}
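Because the output is written one JSON object per line, it can be post-processed with the standard library alone. The following is a minimal sketch, assuming the record layout shown in the excerpt (`head`, `relation`, `tails`); the helpers `load_records` and `tails_by_relation` are hypothetical names, not part of kogito. It uses two records copied from the excerpt rather than reading the saved file.

```python
import json
from collections import defaultdict

# Two records copied from the excerpt above; a real run would read the saved file instead.
SAMPLE_LINES = [
    '{"head": "basketball", "relation": "ObjectUse", "tails": [" play with friends", " play basketball with", " play basketball"]}',
    '{"head": "player", "relation": "CapableOf", "tails": [" play game", " win game", " play football"]}',
]


def load_records(lines):
    """Parse JSON Lines text into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def tails_by_relation(records):
    """Group tails under their relation, stripping whitespace and dropping exact duplicates."""
    grouped = defaultdict(list)
    for record in records:
        for tail in record["tails"]:
            tail = tail.strip()  # tails in the excerpt start with a space
            if tail not in grouped[record["relation"]]:
                grouped[record["relation"]].append(tail)
    return dict(grouped)


records = load_records(SAMPLE_LINES)
grouped = tails_by_relation(records)
```

To work on a saved graph, the same functions can be fed the lines of the file, for example `load_records(open("kgraph.json", encoding="utf-8"))`.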

This is just one way to generate commonsense inferences and kogito offers much more. For complete documentation, check out the kogito docs.

Development

Setup

kogito uses Poetry to manage its dependencies.

Install poetry from the official repository first:

curl -sSL https://install.python-poetry.org | python3 -

Then run the following command to install package dependencies:

poetry install

Data

If you need the ATOMIC or ATOMIC 2020 data to train your knowledge models, you can download it from AI2:

For ATOMIC:

wget https://storage.googleapis.com/ai2-mosaic/public/atomic/v1.0/atomic_data.tgz

For ATOMIC 2020:

wget https://ai2-atomic.s3-us-west-2.amazonaws.com/data/atomic2020_data-feb2021.zip
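Where wget is not available, the same archives can be fetched and unpacked from Python. This is a minimal sketch using only the standard library; the URLs are the ones from the wget commands above, while `archive_name`, `download_and_extract`, and the default `data` directory are hypothetical choices made for illustration.

```python
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import urlparse

# URLs copied from the wget commands above.
ATOMIC_URL = "https://storage.googleapis.com/ai2-mosaic/public/atomic/v1.0/atomic_data.tgz"
ATOMIC_2020_URL = "https://ai2-atomic.s3-us-west-2.amazonaws.com/data/atomic2020_data-feb2021.zip"


def archive_name(url: str) -> str:
    """Return the file name at the end of a URL path."""
    return Path(urlparse(url).path).name


def download_and_extract(url: str, dest_dir: str = "data") -> Path:
    """Download an archive into dest_dir and unpack it there (hypothetical helper)."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archive_path = dest / archive_name(url)
    if not archive_path.exists():  # skip the download if the archive is already present
        urllib.request.urlretrieve(url, archive_path)
    # unpack_archive infers the format (.tgz or .zip) from the file extension
    shutil.unpack_archive(str(archive_path), str(dest))
    return dest
```

Calling `download_and_extract(ATOMIC_2020_URL)` would place the ATOMIC 2020 files under `data/`; the layout inside the archive is whatever AI2 ships and is not assumed here.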

Paper

If you want to learn more about the library design, models and data used for this toolkit, check out our paper. The paper can be cited as:

@article{Ismayilzada2022kogito,
  title={kogito: A Commonsense Knowledge Inference Toolkit},
  author={Mete Ismayilzada and Antoine Bosselut},
  journal={ArXiv},
  volume={abs/2211.08451},
  year={2022}
}

If you work with knowledge models, consider citing the following papers:

@inproceedings{Hwang2020COMETATOMIC,
 author = {Jena D. Hwang and Chandra Bhagavatula and Ronan Le Bras and Jeff Da and Keisuke Sakaguchi and Antoine Bosselut and Yejin Choi},
 booktitle = {Proceedings of the 35th AAAI Conference on Artificial Intelligence (AAAI)},
 title = {COMET-ATOMIC 2020: On Symbolic and Neural Commonsense Knowledge Graphs},
 year = {2021}
}

@inproceedings{Bosselut2019COMETCT,
 author = {Antoine Bosselut and Hannah Rashkin and Maarten Sap and Chaitanya Malaviya and Asli Çelikyilmaz and Yejin Choi},
 booktitle = {Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL)},
 title = {COMET: Commonsense Transformers for Automatic Knowledge Graph Construction},
 year = {2019}
}

Acknowledgements

A significant portion of the model training and evaluation code has been adapted from the original codebase for the paper (Comet-) Atomic 2020: On Symbolic and Neural Commonsense Knowledge Graphs.
