Nested Named Entity Recognition
model name
xxx is the method we used in CBLUE (Chinese Biomedical Language Understanding Evaluation), a benchmark that includes Nested Named Entity Recognition. We held 2nd place on the benchmark as of 2021/12/06.
Approach
TODO:
picture or paper
Usage
First, install PyTorch>=1.7.0. There is no restriction on the GPU or CUDA version.
Then, install this repo as a Python package:
$ pip install nner
API
The nner package provides the following function:
nner.load_NNER(model_save_path='./checkpoint/macbert-large_dict.pth', maxlen=512, c_size=9, id2c=_id2c)
Returns the pretrained model, downloading the weights if necessary. The model runs on the first CUDA device if one is available; otherwise it runs on the CPU.
The model_save_path argument specifies the path to the pretrained model weights.
The maxlen argument specifies the maximum length of an input sentence. Sentences longer than maxlen are truncated.
The c_size argument specifies the number of entity classes, which is 9 for CBLUE.
The id2c argument specifies the mapping from class id to entity class. The default id2c for CBLUE is:
_id2c = {0: 'dis', 1: 'sym', 2: 'pro', 3: 'equ', 4: 'dru', 5: 'ite', 6: 'bod', 7: 'dep', 8: 'mic'}
The model returned by nner.load_NNER() supports the following methods:
model.recognize(text: str, threshold=0)
Given a sentence, returns the recognized entities as a list of tuples in the format [(start_index, end_index, entity_class), ...]. The threshold argument filters the result: only entities with a confidence score higher than threshold are returned.
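To make the tuple format concrete, here is a minimal sketch of post-processing a result list. The spans are hand-written for illustration and are not real model output, the helper spans_to_records is hypothetical, and treating end_index as inclusive is an assumption (pass end_inclusive=False if the indices turn out to be exclusive).

```python
def spans_to_records(text, spans, end_inclusive=True):
    """Hypothetical helper (not part of nner): attach the surface string
    to each (start_index, end_index, entity_class) tuple.

    end_inclusive=True assumes end_index points at the last character of
    the entity; this is an assumption about the output convention.
    """
    records = []
    for start, end, entity_class in spans:
        stop = end + 1 if end_inclusive else end
        records.append({
            "start": start,
            "end": end,
            "type": entity_class,
            "entity": text[start:stop],
        })
    # Sort by start position, then longer spans first, so an outer entity
    # comes before the entities nested inside it.
    records.sort(key=lambda r: (r["start"], -(r["end"] - r["start"])))
    return records


# Hand-written example of a nested pair: a body part inside a disease.
text = "lung cancer"
spans = [(0, 3, "bod"), (0, 10, "dis")]

records = spans_to_records(text, spans)
for record in records:
    print(record)
```

Because the entities are nested, two tuples may overlap or share a start index; downstream code should not assume the spans are disjoint.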
model.predict_to_file(in_file: str, out_file: str)
Given the paths of an input and an output .json file, the model runs inference on in_file and saves the recognized entities to out_file. The output file can be submitted to CBLUE. The input file has the following format:
[
  {
    "text": "..."
  },
  {
    "text": "..."
  },
  ...
]
Examples
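import nner

# Load the pretrained model with the default CBLUE settings.
NNER = nner.load_NNER()

# Run inference on the input file and save the recognized entities.
in_file = './CMeEE_test.json'
out_file = './CMeEE_test_answer.json'
NNER.predict_to_file(in_file, out_file)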