Answering multiple choice questions with Language Models.

mcQA : Multiple Choice Questions Answering

Installation

With pip

pip install mcqa

From source

git clone https://github.com/mcqa-suite/mcqa.git
cd mcqa
pip install -e .

Getting started

Data preparation

To train an mcQA model, create a CSV file with n+2 columns, where n is the number of choices per question. The first column holds the context sentence, the next n columns hold the choices for that question, and the last column holds the selected answer.

Below is an example of a 3-choice question (taken from the CoS-E dataset):

Context sentence	Choice 1	Choice 2	Choice 3	Label
People do what during their time off from work?	take trips	brow shorter	become hysterical	take trips

If you have a trained mcQA model and want to run inference on a dataset, the dataset should have the same format as the training data, but without the label column.
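The layout described above can be produced with Python's standard `csv` module. The sketch below is only an illustration: the helper name `write_mcqa_csv`, the file names, and the header row are assumptions rather than part of mcQA, and this page does not say whether mcQA expects a header row.

```python
import csv
import os
import tempfile


def write_mcqa_csv(path, rows, with_label=True):
    """Write (context, choices, label) tuples as an n+2 column CSV.

    Hypothetical helper, not part of mcQA. With with_label=False the
    label column is dropped, giving the n+1 column inference layout.
    """
    n_choices = len(rows[0][1])
    header = ["Context sentence"] + [f"Choice {i + 1}" for i in range(n_choices)]
    if with_label:
        header.append("Label")
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)  # assumption: the file carries a header row
        for context, choices, label in rows:
            if len(choices) != n_choices:
                raise ValueError("every question needs the same number of choices")
            writer.writerow([context, *choices] + ([label] if with_label else []))


rows = [
    ("People do what during their time off from work?",
     ["take trips", "brow shorter", "become hysterical"],
     "take trips"),
]

out_dir = tempfile.mkdtemp()
train_path = os.path.join(out_dir, "train.csv")
test_path = os.path.join(out_dir, "test.csv")
write_mcqa_csv(train_path, rows)                   # 3 choices, so 5 columns
write_mcqa_csv(test_path, rows, with_label=False)  # 3 choices, so 4 columns
```

The label is stored as the text of the chosen answer, mirroring the example row above.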

See example data preparation below:

from mcqa.data import MCQAData

mcqa_data = MCQAData(bert_model="bert-base-uncased", 
                     lower_case=True, 
                     max_seq_length=256) 
                     
train_dataset = mcqa_data.read(data_file='swagaf/data/train.csv', is_training=True)
test_dataset = mcqa_data.read(data_file='swagaf/data/test.csv', is_training=False)
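Before passing a file to `mcqa_data.read`, it may be worth checking that every row matches the n+2 column layout. The following is a minimal sketch using plain Python lists; `check_mcqa_rows` is a hypothetical helper, and the rule that the label must equal one of the choices is inferred from the example row rather than taken from mcQA itself.

```python
def check_mcqa_rows(rows, n_choices, has_label=True):
    """Return (row_index, problem) pairs for rows that break the layout.

    Hypothetical helper, not part of mcQA.
    """
    expected = n_choices + (2 if has_label else 1)
    problems = []
    for i, row in enumerate(rows):
        if len(row) != expected:
            problems.append((i, f"expected {expected} columns, got {len(row)}"))
            continue
        choices = row[1:1 + n_choices]
        # Assumption: the label column repeats the text of the chosen answer.
        if has_label and row[-1] not in choices:
            problems.append((i, "label is not one of the choices"))
    return problems


good = ["People do what during their time off from work?",
        "take trips", "brow shorter", "become hysterical", "take trips"]
bad = ["A question whose label matches no choice", "a", "b", "c", "d"]

print(check_mcqa_rows([good, bad], n_choices=3))
# prints [(1, 'label is not one of the choices')]
```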

Model training

from mcqa.models import Model

mdl = Model(bert_model="bert-base-uncased",
            device="cuda") 
            
mdl.fit(train_dataset, 
        train_batch_size=32, 
        num_train_epochs=20)

Prediction

preds = mdl.predict(test_dataset, 
                    eval_batch_size=32)

Evaluation

from sklearn.metrics import accuracy_score
from mcqa.data import get_labels

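# Score the predictions against the labels of the dataset they were computed on
# (that dataset must include the label column); accuracy_score takes y_true first.
print(accuracy_score(get_labels(test_dataset), preds))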
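Accuracy is the fraction of questions for which the predicted choice equals the labelled one. The sketch below computes it by hand on made-up values; it assumes that predictions and labels are both encoded as choice indices, which this page does not confirm, and the function name `accuracy` is illustrative.

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction equals the true label."""
    if len(y_true) != len(y_pred):
        # Labels and predictions must come from the same dataset.
        raise ValueError("labels and predictions must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty dataset")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


toy_labels = [0, 2, 1, 1]  # made-up choice indices
toy_preds = [0, 2, 0, 1]   # made-up predictions; the third one is wrong

print(accuracy(toy_labels, toy_preds))  # prints 0.75
```

The length check is the reason predictions should only be compared with labels from the dataset they were computed on.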

References

Type	Title	Author	Year
Paper	Explain Yourself! Leveraging Language Models for Commonsense Reasoning	Nazneen Fatema Rajani, Bryan McCann, Caiming Xiong and Richard Socher	ACL 2019
Paper	SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference	Rowan Zellers, Yonatan Bisk, Roy Schwartz and Yejin Choi	2018

Release history

Version	Release date
0.1.1	Aug 16, 2019
0.1.0	Jul 26, 2019
0.0.1	Jul 15, 2019
