Unofficial implementation of QaNER: Prompting Question Answering Models for Few-shot Named Entity Recognition.

QaNER

You can adopt this pipeline for arbitrary BIO-markup data.
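To make "BIO-markup" concrete, here is a minimal sketch of how BIO tags map to entity spans. The helper `bio_to_spans` and the tag sequence are illustrative assumptions, not part of the qaner package, and the on-disk layout of the `.bio` files is not shown here.

```python
from typing import List, Tuple


def bio_to_spans(tags: List[str]) -> List[Tuple[str, int, int]]:
    """Convert BIO tags to (label, start, end) token spans (end is exclusive).

    Hypothetical helper for illustration; not taken from qaner.
    """
    spans = []
    start, label = None, None
    for i, tag in enumerate(tags):
        # An I- tag with the same label continues the currently open span.
        continues = tag.startswith("I-") and label == tag[2:]
        if label is not None and not continues:
            spans.append((label, start, i))
            start, label = None, None
        # B- always opens a span; a stray I- is treated leniently as an opener.
        if tag.startswith("B-") or (tag.startswith("I-") and label is None):
            start, label = i, tag[2:]
    if label is not None:
        spans.append((label, start, len(tags)))
    return spans


tokens = "EU rejects German call to boycott British lamb .".split()
# Illustrative tags for the sentence above (assumed, one tag per token).
tags = ["B-ORG", "O", "B-MISC", "O", "O", "O", "B-MISC", "O", "O"]
spans = bio_to_spans(tags)
entities = [(label, " ".join(tokens[s:e])) for label, s, e in spans]
```

Any dataset that can be expressed as one tag per token in this scheme should fit the same shape.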

Installation

pip install qaner

CoNLL-2003

Pipeline results on CoNLL-2003 dataset:

How to use

Training

Script for training a QaNER model:

qaner-train \
--bert_model_name 'bert-base-uncased' \
--path_to_prompt_mapper 'data/conll2003/prompt_mapper.json' \
--path_to_train_data 'data/conll2003/train.bio' \
--path_to_test_data 'data/conll2003/test.bio' \
--path_to_save_model 'dayyass/qaner-conll-bert-base-uncased' \
--n_epochs 2 \
--batch_size 128 \
--learning_rate 1e-5 \
--seed 42 \
--log_dir 'runs/qaner'

Required arguments:

  • --bert_model_name - base BERT model for QaNER fine-tuning
  • --path_to_prompt_mapper - path to prompt mapper json file
  • --path_to_train_data - path to train data (BIO-markup)
  • --path_to_test_data - path to test data (BIO-markup)
  • --path_to_save_model - path to save trained QaNER model
  • --n_epochs - number of epochs to fine-tune
  • --batch_size - batch size
  • --learning_rate - learning rate

Optional arguments:

  • --seed - random seed for reproducibility (default: 42)
  • --log_dir - tensorboard log_dir (default: 'runs/qaner')

Inference

Script for running inference with a trained QaNER model:

qaner-inference \
--context 'EU rejects German call to boycott British lamb .' \
--question 'What is the organization?' \
--path_to_prompt_mapper 'data/conll2003/prompt_mapper.json' \
--path_to_trained_model 'dayyass/qaner-conll-bert-base-uncased' \
--n_best_size 1 \
--max_answer_length 100 \
--seed 42

Result:

question: What is the organization?

context: EU rejects German call to boycott British lamb .

answer: [Span(token='EU', label='ORG', start_context_char_pos=0, end_context_char_pos=2)]
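The character positions in the result can be read as a slice of the context. The sketch below uses a stand-in `Span` type with the same field names as the printed output; it is not the package's own class, and the exclusive end position is inferred from this single example.

```python
from typing import NamedTuple


class Span(NamedTuple):
    """Stand-in mirroring the fields printed above (not qaner's class)."""

    token: str
    label: str
    start_context_char_pos: int
    end_context_char_pos: int


context = "EU rejects German call to boycott British lamb ."
span = Span(token="EU", label="ORG", start_context_char_pos=0, end_context_char_pos=2)

# Assumption: the end position is exclusive, as in Python slicing.
extracted = context[span.start_context_char_pos:span.end_context_char_pos]
```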

Required arguments:

  • --context - sentence to extract entities from
  • --question - question prompt with entity name to extract (examples below)
  • --path_to_prompt_mapper - path to prompt mapper json file
  • --path_to_trained_model - path to trained QaNER model
  • --n_best_size - number of best QA answers to consider

Optional arguments:

  • --max_answer_length - maximum entity length, used to discard very long entities (default: 100)
  • --seed - random seed for reproducibility (default: 42)
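As a rough illustration of what `--n_best_size` and `--max_answer_length` control, the sketch below follows the common post-processing used in extractive question answering: take the top-scoring start and end positions, pair them, and drop pairs that are inverted or too long. The function `best_spans`, the scores, and the choice to measure length in tokens are assumptions for illustration; this is not confirmed to be how qaner implements it.

```python
from typing import List, Sequence, Tuple


def best_spans(
    start_scores: Sequence[float],
    end_scores: Sequence[float],
    n_best_size: int,
    max_answer_length: int,
) -> List[Tuple[float, int, int]]:
    """Return up to n_best_size (score, start, end) candidates, best first."""
    # Indices of the n_best_size highest start and end scores.
    top_starts = sorted(
        range(len(start_scores)), key=lambda i: start_scores[i], reverse=True
    )[:n_best_size]
    top_ends = sorted(
        range(len(end_scores)), key=lambda i: end_scores[i], reverse=True
    )[:n_best_size]

    candidates = []
    for s in top_starts:
        for e in top_ends:
            length = e - s + 1  # inclusive end, measured in tokens (assumed)
            if length < 1 or length > max_answer_length:
                continue  # skip inverted or overly long spans
            candidates.append((start_scores[s] + end_scores[e], s, e))
    candidates.sort(reverse=True)
    return candidates[:n_best_size]


# Made-up scores for a four-token context.
start_scores = [5.0, 0.1, 2.0, 0.2]
end_scores = [4.0, 0.3, 0.1, 3.5]
top_two = best_spans(start_scores, end_scores, n_best_size=2, max_answer_length=2)
top_one = best_spans(start_scores, end_scores, n_best_size=1, max_answer_length=2)
```

Under this scheme a larger `n_best_size` considers more candidate spans, while `max_answer_length` bounds how long any returned entity can be.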

Possible inference questions for CoNLL-2003:

  • What is the location? (LOC)
  • What is the person? (PER)
  • What is the organization? (ORG)
  • What is the miscellaneous entity? (MISC)
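Because each question targets one entity type, extracting every type from a sentence means asking one question per label. The sketch below builds those (label, question, context) queries from the list above. The `QUESTIONS` dictionary and `build_queries` are hypothetical names, and the layout is not claimed to match the prompt mapper JSON file.

```python
from typing import Dict, List, Tuple

# Label-to-question pairs copied from the list above.
QUESTIONS: Dict[str, str] = {
    "LOC": "What is the location?",
    "PER": "What is the person?",
    "ORG": "What is the organization?",
    "MISC": "What is the miscellaneous entity?",
}


def build_queries(context: str) -> List[Tuple[str, str, str]]:
    """Pair the context with one question per entity label (hypothetical helper)."""
    return [(label, question, context) for label, question in QUESTIONS.items()]


context = "EU rejects German call to boycott British lamb ."
queries = build_queries(context)
```

Each tuple corresponds to one `qaner-inference` call with the matching `--question` and `--context` values.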

Requirements

Python >= 3.7

Citation

@misc{liu2022qaner,
    title         = {QaNER: Prompting Question Answering Models for Few-shot Named Entity Recognition},
    author        = {Andy T. Liu and Wei Xiao and Henghui Zhu and Dejiao Zhang and Shang-Wen Li and Andrew Arnold},
    year          = {2022},
    eprint        = {2203.01543},
    archivePrefix = {arXiv},
    primaryClass  = {cs.LG}
}
