
Question-Answering system using state-of-the-art pre-trained language models.

Project description

BERT-QA

Build question-answering systems using state-of-the-art pre-trained contextualized language models, e.g., BERT. We are working to accelerate the development of question-answering systems based on BERT and TensorFlow 2.0!

Background

This project is based on our study: Question Generation by Transformers.

Citation

To cite this work, use the following BibTeX citation.

@article{question-generation-transformers-2019,
  title={Question Generation by Transformers},
  author={Kriangchaivech, Kettip and Wangperawong, Artit},
  journal={arXiv preprint arXiv:1909.05017},
  year={2019}
}

Requirements

TensorFlow 2.0 will be installed if it is not already on your system.

Installation

pip install bert_qa

Example usage

Run the Colab demo notebook here.

download the pre-trained BERT model

wget -q https://storage.googleapis.com/cloud-tpu-checkpoints/bert/keras_bert/uncased_L-12_H-768_A-12.tar.gz
tar -xvzf uncased_L-12_H-768_A-12.tar.gz
mv -f home/hongkuny/public/pretrained_models/keras_bert/uncased_L-12_H-768_A-12 .

download SQuAD data

wget -q https://rajpurkar.github.io/SQuAD-explorer/dataset/dev-v1.1.json
wget -q https://rajpurkar.github.io/SQuAD-explorer/dataset/train-v1.1.json
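The downloaded SQuAD v1.1 files follow a nested JSON layout (articles, paragraphs, question-answer pairs). The sketch below walks a minimal hand-made example in that layout; the example text and ids are invented for illustration, but the field names match the SQuAD v1.1 format.

```python
# Minimal example in the SQuAD v1.1 layout. The real files downloaded
# above follow the same structure: data -> paragraphs -> qas -> answers.
sample = {
    "data": [{
        "title": "Normans",
        "paragraphs": [{
            "context": "The Normans were descended from Norse raiders.",
            "qas": [{
                "id": "q1",
                "question": "Who were the Normans descended from?",
                "answers": [{"text": "Norse raiders", "answer_start": 32}],
            }],
        }],
    }]
}

for article in sample["data"]:
    for para in article["paragraphs"]:
        for qa_pair in para["qas"]:
            answer = qa_pair["answers"][0]
            # answer_start is a character offset into the paragraph context
            assert para["context"][answer["answer_start"]:].startswith(answer["text"])
```

The `answer_start` character offset is what span-prediction models like BERT are trained against.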

import, initialize, pre-process data, fine-tune, and predict!

from bert_qa import squad

qa = squad.SQuAD()             # initialize with the default pre-trained model
qa.preprocess_training_data()  # convert the SQuAD training set into model features
qa.fit()                       # fine-tune on the pre-processed training data
predictions = qa.predict()     # generate answers for the dev set

evaluate

import json

with open('model/predictions.json') as f:
    pred_data = json.load(f)
with open('dev-v1.1.json') as f:
    dev_data = json.load(f)['data']

qa.evaluate(dev_data, pred_data)
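Evaluation on SQuAD is conventionally reported as exact match (EM) and token-level F1, which is presumably what `qa.evaluate` computes. Below is a minimal pure-Python sketch of those two metrics; the normalization is a simplified version of the official SQuAD evaluation script (lowercasing, article removal, punctuation stripping).

```python
import re
from collections import Counter

def normalize(text):
    # Simplified SQuAD answer normalization: lowercase, drop
    # articles, strip punctuation, collapse whitespace.
    text = text.lower()
    text = re.sub(r'\b(a|an|the)\b', ' ', text)
    text = re.sub(r'[^\w\s]', '', text)
    return ' '.join(text.split())

def exact_match(prediction, truth):
    # 1.0 if the normalized strings are identical, else 0.0
    return float(normalize(prediction) == normalize(truth))

def f1_score(prediction, truth):
    # Token-level F1 over the bag of normalized tokens
    pred_tokens = normalize(prediction).split()
    true_tokens = normalize(truth).split()
    common = Counter(pred_tokens) & Counter(true_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(true_tokens)
    return 2 * precision * recall / (precision + recall)
```

In the official script each metric is averaged over all questions, taking the maximum score over the gold answers for each question.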

Advanced usage

Model type

The default model is an uncased Bidirectional Encoder Representations from Transformers (BERT) model consisting of 12 transformer layers, 12 self-attention heads per layer, and a hidden size of 768. Other supported models can be specified with hub_module_handle, and we expect more to be added in the future. For the full list, see TensorFlow's BERT GitHub.
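The checkpoint downloaded earlier encodes this configuration in its name via the L-layers/H-hidden/A-heads convention (uncased_L-12_H-768_A-12). As a purely illustrative aside, the naming scheme can be decoded like this; `parse_bert_name` is a hypothetical helper, not part of bert_qa:

```python
import re

def parse_bert_name(name):
    """Decode the L-<layers>_H-<hidden>_A-<heads> convention used in
    BERT checkpoint names such as 'uncased_L-12_H-768_A-12'."""
    m = re.search(r'L-(\d+)_H-(\d+)_A-(\d+)', name)
    if m is None:
        raise ValueError(f"not a BERT checkpoint name: {name}")
    layers, hidden, heads = map(int, m.groups())
    return {"layers": layers, "hidden_size": hidden, "attention_heads": heads}

config = parse_bert_name("uncased_L-12_H-768_A-12")
# config describes the default model: 12 layers, hidden size 768, 12 heads
```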

Contributing

BERT-QA is an open-source project founded and maintained to better serve the machine learning and data science community. Please feel free to submit pull requests to contribute to the project. By participating, you are expected to adhere to BERT-QA's code of conduct.

Questions?

For questions or help using BERT-QA, please submit a GitHub issue.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

bert_qa-0.1.0.tar.gz (89.6 kB)

Uploaded Source

Built Distribution

bert_qa-0.1.0-py3-none-any.whl (116.7 kB)

Uploaded Python 3

File details

Details for the file bert_qa-0.1.0.tar.gz.

File metadata

  • Download URL: bert_qa-0.1.0.tar.gz
  • Upload date:
  • Size: 89.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.21.0 setuptools/44.0.0 requests-toolbelt/0.9.1 tqdm/4.28.1 CPython/3.6.9

File hashes

Hashes for bert_qa-0.1.0.tar.gz

  • SHA256: 3216610ed925587312dd5026017a0fc942c1492234fc730776e66fadbd85dffd
  • MD5: f76ebe0f065d1f297974e08ca864f209
  • BLAKE2b-256: 1061c5118490029d02252d1ce2b12f4dab54dfe7a866c6920747aa877b1b2b23

See more details on using hashes here.

File details

Details for the file bert_qa-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: bert_qa-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 116.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.21.0 setuptools/44.0.0 requests-toolbelt/0.9.1 tqdm/4.28.1 CPython/3.6.9

File hashes

Hashes for bert_qa-0.1.0-py3-none-any.whl

  • SHA256: 02676857f647375b17b4216b7ebd2469671855054297d225af14f7ed685ba8ea
  • MD5: ed2767cb161d676c4f117c8b51cb284e
  • BLAKE2b-256: 72aca384a63eb23aec4295f669514d4ac5dceea826c8e541c4c35c28141ea528

See more details on using hashes here.
