Nested Named Entity Recognition

Project description

model name

xxx is our method for CBLUE (Chinese Biomedical Language Understanding Evaluation), a benchmark of Nested Named Entity Recognition. It won 2nd prize on the benchmark as of 2021/12/06.

Approach

TODO:

picture or paper

Usage

First, install PyTorch>=1.7.0. There's no restriction on GPU or CUDA.

Then, install this repo as a Python package:

$ pip install nner

API

The nner package provides the following methods:

nner.load_NNER(model_save_path='./checkpoint/macbert-large_dict.pth', maxlen=512, c_size=9, id2c=_id2c)

Returns the pretrained model, downloading the weights if necessary. The model runs on the first CUDA device if one is available, and on the CPU otherwise.

The model_save_path argument specifies the path of the pretrained model weights.

The maxlen argument specifies the maximum length of an input sentence. Sentences longer than maxlen are truncated.

The c_size argument specifies the number of entity classes, which is 9 for CBLUE.

The id2c argument specifies the mapping between id and entity class. By default, the id2c argument for CBLUE is:

_id2c = {0: 'dis', 1: 'sym', 2: 'pro', 3: 'equ', 4: 'dru', 5: 'ite', 6: 'bod', 7: 'dep', 8: 'mic'}
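When supplying a custom label set, c_size and id2c have to agree with each other. The following is a minimal sketch of a sanity check, using only the default mapping shown above. check_label_map is a hypothetical helper and not part of nner, and the requirement that ids run contiguously from 0 is an assumption drawn from the default mapping, not a documented rule.

```python
# Default CBLUE mapping, copied from the documentation above.
id2c = {0: 'dis', 1: 'sym', 2: 'pro', 3: 'equ', 4: 'dru',
        5: 'ite', 6: 'bod', 7: 'dep', 8: 'mic'}


def check_label_map(id2c, c_size):
    """Hypothetical helper (not part of nner): validate a label mapping.

    Assumes ids are expected to be exactly 0..c_size-1, as in the default.
    Returns the inverse mapping (class name -> id).
    """
    if sorted(id2c) != list(range(c_size)):
        raise ValueError("ids must be exactly 0..c_size-1")
    if len(set(id2c.values())) != c_size:
        raise ValueError("entity class names must be unique")
    return {name: idx for idx, name in id2c.items()}


c2id = check_label_map(id2c, c_size=9)
```

The inverse mapping c2id can be handy when converting annotated data with class names back to ids.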


The model returned by nner.load_NNER() supports the following methods:

model.recognize(text: str, threshold=0)

Given a sentence, returns a list of recognized entities as tuples in the form [(start_index, end_index, entity_class), ...]. Only entities with a confidence score higher than threshold are returned.
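The returned tuples carry indices rather than strings, so a small post-processing step is usually needed to see the entity text. The sketch below shows one way to do that. spans_to_text is a hypothetical helper, the entity list is hand-written for illustration and is not real model output, and whether end_index is inclusive is not stated here, so the helper exposes that as a flag (inclusive is assumed by default).

```python
def spans_to_text(text, entities, end_inclusive=True):
    """Hypothetical helper: map (start, end, cls) tuples to (string, cls) pairs.

    end_inclusive is an assumption about the index convention; set it to
    False if end_index turns out to be exclusive.
    """
    offset = 1 if end_inclusive else 0
    return [(text[start:end + offset], cls) for start, end, cls in entities]


text = "患者胸部疼痛"  # roughly: "the patient has chest pain"
# Hand-written example of nested entities: a symptom containing a body part.
entities = [(2, 5, 'sym'), (2, 3, 'bod')]
surface = spans_to_text(text, entities)
```

Here the 'bod' span lies entirely inside the 'sym' span, which is the kind of overlap that nested NER is meant to capture.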

model.predict_to_file(in_file: str, out_file: str)

Given the paths of an input and an output .json file, runs inference on in_file and saves the recognized entities to out_file. The output file can be submitted to CBLUE. The input file has the following format:

[
  {
    "text": "..."
  },
  {
    "text": "..."
  },
  ...
]
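If the sentences are held in a Python list, an input file in the format above can be produced with the standard json module. This is a minimal sketch: write_input_file is a hypothetical helper, and the sentences and file name are placeholders.

```python
import json
import os
import tempfile


def write_input_file(sentences, path):
    """Hypothetical helper: write sentences as a list of {"text": ...} records."""
    records = [{"text": s} for s in sentences]
    with open(path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps Chinese characters readable in the file.
        json.dump(records, f, ensure_ascii=False, indent=2)


# Placeholder sentences, written to a temporary directory.
tmp_dir = tempfile.mkdtemp()
in_file = os.path.join(tmp_dir, "input.json")
write_input_file(["sentence one", "sentence two"], in_file)

# Read the file back to confirm it has the expected structure.
with open(in_file, encoding="utf-8") as f:
    loaded = json.load(f)
```

The resulting path can then be passed as in_file to model.predict_to_file.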

Examples

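import nner

# Load the pretrained model with the default CBLUE settings.
NNER = nner.load_NNER()
in_file = './CMeEE_test.json'
out_file = './CMeEE_test_answer.json'
# Run inference on in_file and write the predictions to out_file.
NNER.predict_to_file(in_file, out_file)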
