mcQA : Multiple Choice Questions Answering

Answering multiple choice questions with Language Models.

Installation

With pip

pip install mcqa

From source

git clone https://github.com/mcqa-suite/mcqa.git
cd mcqa
pip install -e .

Getting started

Data preparation

To train an mcQA model, create a CSV file with n+2 columns, where n is the number of choices per question. The first column holds the context sentence, the next n columns hold the choices for that question, and the last column holds the selected answer.

Below is an example of a 3-choice question (taken from the CoS-E dataset):

Context sentence | Choice 1 | Choice 2 | Choice 3 | Label
People do what during their time off from work? | take trips | brow shorter | become hysterical | take trips
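As a minimal sketch of producing a file in this layout, the snippet below writes the example row with Python's standard `csv` module. The helper name `write_training_csv` and the output file name are hypothetical and not part of mcQA. It assumes the label is stored as the answer text (as in the example row) and writes no header row, since the text above does not say whether one is expected.

```python
import csv
import os
import tempfile


def write_training_csv(path, examples):
    """Write (context, choices, label) examples as rows of n+2 columns.

    Hypothetical helper for illustration; it is not part of mcQA.
    """
    n_choices = len(examples[0][1])
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        for context, choices, label in examples:
            # Every question must offer the same number of choices (n).
            if len(choices) != n_choices:
                raise ValueError("all questions must have the same number of choices")
            # Assumption: the label is the text of one of the choices.
            if label not in choices:
                raise ValueError(f"label {label!r} is not one of the choices")
            # Layout: context, n choices, label (no header row written).
            writer.writerow([context, *choices, label])


examples = [
    (
        "People do what during their time off from work?",
        ["take trips", "brow shorter", "become hysterical"],
        "take trips",
    ),
]

train_path = os.path.join(tempfile.mkdtemp(), "train_example.csv")
write_training_csv(train_path, examples)

# Read the row back to inspect the column layout.
with open(train_path, newline="", encoding="utf-8") as f:
    print(next(csv.reader(f)))
```

Checking that the label matches one of the choices before writing catches mislabeled rows early, before any training time is spent.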

If you have a trained mcQA model and want to run inference on a dataset, the file should have the same format as the training data, but without the label column.
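One way to obtain such an inference file from labeled data is to drop the last column of every row. The sketch below does this in memory with the standard `csv` module. The function name `drop_label_column` is hypothetical, and the sketch assumes the label is always the final column, as described above.

```python
import csv
import io


def drop_label_column(labeled_csv_text):
    """Return CSV text with the last column (the label) removed from each row.

    Hypothetical helper for illustration; it is not part of mcQA.
    """
    reader = csv.reader(io.StringIO(labeled_csv_text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        # Keep the context and the n choices, drop the trailing label.
        writer.writerow(row[:-1])
    return out.getvalue()


labeled = (
    "People do what during their time off from work?,"
    "take trips,brow shorter,become hysterical,take trips\n"
)
print(drop_label_column(labeled))
```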

See example data preparation below:

from mcqa.data import MCQAData

mcqa_data = MCQAData(bert_model="bert-base-uncased", 
                     lower_case=True, 
                     max_seq_length=256) 
                     
train_dataset = mcqa_data.read(data_file='swagaf/data/train.csv', is_training=True)
test_dataset = mcqa_data.read(data_file='swagaf/data/test.csv', is_training=False)

Model training

from mcqa.models import Model

mdl = Model(bert_model="bert-base-uncased",
            device="cuda") 
            
mdl.fit(train_dataset, 
        train_batch_size=32, 
        num_train_epochs=20)

Prediction

preds = mdl.predict(test_dataset, 
                    eval_batch_size=32)

Evaluation

from sklearn.metrics import accuracy_score
from mcqa.data import get_labels

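# Score the predictions against the labels of the same dataset that was passed to predict().
# This assumes that dataset was read from a file that includes the label column.
print(accuracy_score(get_labels(test_dataset), preds))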
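For reference, accuracy is the fraction of questions whose predicted answer matches the label, so the two sequences must describe the same questions in the same order. The dependency-free sketch below illustrates the computation on made-up values. Encoding answers as choice indices is an assumption, since the format returned by `mdl.predict` is not stated here.

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction equals the label."""
    if len(y_true) != len(y_pred):
        raise ValueError("labels and predictions must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy of an empty set")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical values: one entry per question, each the index of a choice.
labels = [0, 2, 1, 1]
predictions = [0, 2, 2, 1]

print(accuracy(labels, predictions))  # 0.75 (3 of 4 predictions match)
```

The length check makes a mismatch between datasets fail loudly, for example when test predictions are scored against training labels.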

References

Type | Title | Author | Year
Paper | Explain Yourself! Leveraging Language Models for Commonsense Reasoning | Nazneen Fatema Rajani, Bryan McCann, Caiming Xiong and Richard Socher | ACL 2019
Paper | SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference | Rowan Zellers, Yonatan Bisk, Roy Schwartz and Yejin Choi | 2018
