vi-nlp-core is a library that supports Vietnamese named-entity recognition (NER) by pattern matching.

Project description

NER for Vietnamese Medical Appointment Chatbot

Usage

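from vi_nlp_core.ner.extractor import Extractor

# The examples below call methods on an `extractor` instance.
# A no-argument constructor is assumed here.
extractor = Extractor()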

Extract person name

text = "tôi cần đặt bác sĩ tạ biên cương"
print(extractor.extract_person_name(text))

{'entities': [{'start': 19, 'end': 32, 'entity': 'person_name', 'value': 'Tạ Biên Cương', 'confidence': 1.0, 'extractor': 'pattern'}]}
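Each entity carries character offsets into the input string, so the matched span can be recovered by slicing. The following is a minimal sketch that uses the result shown above as a literal, so it runs without the library. The helper name `entity_spans` is hypothetical and not part of the package.

```python
def entity_spans(text, result):
    """Return (entity_type, surface_text, normalized_value) for each entity."""
    spans = []
    for ent in result.get("entities", []):
        # 'start' and 'end' are treated as character offsets, end exclusive.
        surface = text[ent["start"]:ent["end"]]
        spans.append((ent["entity"], surface, ent["value"]))
    return spans


text = "tôi cần đặt bác sĩ tạ biên cương"
result = {
    "entities": [
        {
            "start": 19,
            "end": 32,
            "entity": "person_name",
            "value": "Tạ Biên Cương",
            "confidence": 1.0,
            "extractor": "pattern",
        }
    ]
}

for entity_type, surface, value in entity_spans(text, result):
    print(entity_type, repr(surface), repr(value))
```

The offsets only line up if the text is sliced in the same Unicode normalization form that was passed to the extractor.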

Extract Date
The value field of each entity is a timestamp.

text = "tôi sinh vào ngày 21-3-1997"
extractor.extract_date(text)

{'entities': [{'start': 0, 'end': 5, 'entity': 'time', 'value': 1628562600.0, 'confidence': 1.0, 'extractor': 'absolute_pattern'}]}
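The description does not state the unit or time zone of the timestamp. Assuming it is in seconds since the Unix epoch, it can be converted with the standard library. The sketch below takes the value from the output above and renders it in UTC and in UTC+7; the fixed UTC+7 offset for Vietnam is an illustration, not something the library documents.

```python
from datetime import datetime, timedelta, timezone

value = 1628562600.0  # 'value' field copied from the output above

# Interpret the number as seconds since the Unix epoch, in UTC.
as_utc = datetime.fromtimestamp(value, tz=timezone.utc)

# The same instant in Indochina Time (UTC+7), a fixed offset without DST.
ict = timezone(timedelta(hours=7), name="ICT")
as_ict = as_utc.astimezone(ict)

print(as_utc.isoformat())
print(as_ict.isoformat())
```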

Extract Time

text = '14:50 ngày 7 tháng 6'
print(extractor.extract_time(text, return_value=True))  # return the value only

{'entities': [{'start': 7, 'end': 9, 'entity': 'time', 'value': 1628560800.0, 'confidence': 1.0, 'extractor': 'absolute_pattern'}]}

Map department/gender to keys

text = 'rai'
res = extractor.map_gender_to_key(text)
print(res)
# {'key': 'GEN01', 'text': 'rai', 'value': 'trai'}
text = 'tiêu hóa'
res = extractor.map_dep_to_key(text)
print(res)
# {'key': 'SP008', 'text': 'tiêu hóa', 'value': 'tiêu hóa'}

From symptoms to department

def extract_symptoms(self, utterance, input_symptoms=None, input_dep_keys=None, get_dep_keys=False, top_k=3):

utterance: the input string
input_symptoms (optional): list of symptoms to look for (e.g. ['đau bụng', 'sốt', 'ho', 'ói', 'nôn'])
input_dep_keys (optional): dict mapping each department key to its list of symptom keywords, used directly for extraction
get_dep_keys (optional): whether to return only a list of (dep_key, value) tuples
top_k (optional): number of top answers to return (symptoms or dep_keys)

Example of input_dep_keys:
    input_dict = {
        'A001': ['đau bụng', 'sốt', 'ho', 'ói', 'nôn', 'chóng mặt'],
        'A004': ['khó thở', 'đau ngực', 'sốt', 'nôn', 'ho'],
    }

Examples:

Common usage

text = "dạo này tôi thấy trong người mệt mỏi, thần kinh căng thẳng do cách ly covid quá lâu"
res = extractor.extract_symptoms(text)

{'entities': [{'start': 29, 'end': 32, 'entity': 'symptom', 'value': 'mệt', 'confidence': 1.0, 'extractor': 'fuzzy_matching'}, {'start': 39, 'end': 48, 'entity': 'symptom', 'value': 'thần kinh', 'confidence': 1.0, 'extractor': 'fuzzy_matching'}, {'start': 49, 'end': 59, 'entity': 'symptom', 'value': 'căng thẳng', 'confidence': 1.0, 'extractor': 'fuzzy_matching'}]}

Extract Department Keys directly

res = extractor.extract_symptoms(text, get_dep_keys=True, top_k=3)

[('SP012', 1.0), ('SP001', 0.0), ('SP002', 0.0)]
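With get_dep_keys=True the result is a list of (department key, value) tuples, as in the output above. Below is a small sketch of post-processing such a list, assuming the second element is a score where higher means a better match. `best_departments` is a hypothetical helper, not part of the library.

```python
def best_departments(ranked, min_score=0.0):
    """Keep department keys whose score is strictly above min_score, best first."""
    kept = [(key, score) for key, score in ranked if score > min_score]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Output copied from the example above.
ranked = [('SP012', 1.0), ('SP001', 0.0), ('SP002', 0.0)]
print(best_departments(ranked))
```

Dropping zero-score entries avoids offering departments that were returned only to fill the top_k slots.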

Extracting with given list of symptoms

text = "dạo này tôi thấy trong người mệt mỏi, thần kinh căng thẳng do cách ly covid quá lâu"
res = extractor.extract_symptoms(text, input_symptoms=['mệt mỏi', 'covid'])

{'entities': [{'start': 71, 'end': 76, 'entity': 'symptom', 'value': 'covid', 'confidence': 1.0, 'extractor': 'fuzzy_matching'}, {'start': 29, 'end': 36, 'entity': 'symptom', 'value': 'mệt mỏi', 'confidence': 1.0, 'extractor': 'fuzzy_matching'}]}

Extracting with input_dep_keys

# CASE 1:
res = extractor.extract_symptoms(text, input_dep_keys=input_dict, get_dep_keys=True)

# CASE 2:
extractor.set_dep_symp_database(input_dict)
res = extractor.extract_symptoms(text, get_dep_keys=True)

Search department from list of symptoms

text = ['ho', 'sổ mũi', 'đau họng', 'đau đầu', 'nghẹt mũi']
res = extractor.get_department_from_symptoms(text, top_k=3)

[('SP018', 1.0), ('SP006', 0.6), ('SP009', 0.4)]
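The scoring method behind these values is not described here. As an illustration only, one simple way to rank departments from a symptom list is to count keyword overlaps against a department-to-symptoms dict shaped like input_dict, then divide by the best count. The function `rank_departments` and its scoring scheme are hypothetical and may differ from what the library does.

```python
def rank_departments(symptoms, dep_symptoms, top_k=3):
    """Rank department keys by how many of the given symptoms they list.

    Scores are overlap counts divided by the highest count, so the best
    match gets 1.0. This is an illustrative scheme, not the library's.
    """
    wanted = {s.strip().lower() for s in symptoms}
    counts = {
        key: len(wanted & {k.lower() for k in keywords})
        for key, keywords in dep_symptoms.items()
    }
    best = max(counts.values(), default=0)
    if best == 0:
        # Nothing matched: return the first top_k keys with a zero score.
        return [(key, 0.0) for key in list(counts)[:top_k]]
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    return [(key, count / best) for key, count in ranked[:top_k]]


dep_symptoms = {
    'A001': ['đau bụng', 'sốt', 'ho', 'ói', 'nôn', 'chóng mặt'],
    'A004': ['khó thở', 'đau ngực', 'sốt', 'nôn', 'ho'],
}
print(rank_departments(['khó thở', 'đau ngực', 'ho'], dep_symptoms))
```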

Project details

Release history

1.1.12 (Aug 25, 2021)
1.1.11 (Aug 16, 2021)
1.1.10 (Aug 16, 2021) (this version)
1.1.9 (Aug 12, 2021)
1.1.8 (Aug 11, 2021)
1.1.6 (Aug 11, 2021)
1.1.3 (Aug 10, 2021)
1.1.2 (Aug 3, 2021)
1.1.1 (Aug 2, 2021)
1.1.0 (Aug 2, 2021)
1.0.3 (Jul 30, 2021)
1.0.2 (Jul 30, 2021)
1.0.1 (Jul 30, 2021)
1.0.0 (Jul 30, 2021)
0.3.3 (Jul 29, 2021)
0.3.2 (Jul 29, 2021)
0.3.1 (Jul 29, 2021)
0.3 (Jul 29, 2021)
0.1 (Jul 29, 2021)

Download files

Source Distribution

vi_nlp_core-1.1.10.tar.gz (121.9 kB)

Uploaded Aug 16, 2021 Source

Built Distribution

vi_nlp_core-1.1.10-py3-none-any.whl (126.9 kB)

Uploaded Aug 16, 2021 Python 3

Hashes for vi_nlp_core-1.1.10.tar.gz
Algorithm	Hash digest
SHA256	`2ba2bfddf5d97bb4dce3899d3a08629ab6b146d12d863d7cd8abbfab71c67875`
MD5	`a5af3546c2fb5b1f1dd0207f028ada68`
BLAKE2b-256	`f942ebe51090bd2660e0164a19ef3749d96832d030c61ec7f907a27f546b9af5`

Hashes for vi_nlp_core-1.1.10-py3-none-any.whl
Algorithm	Hash digest
SHA256	`bb748c55e22d6154d2dae8ccbd759534787e0a994f8b5c6523d8cc48fbefe991`
MD5	`7dbd768c9861947c9e00a070911a0084`
BLAKE2b-256	`403c5cc98c98540a6a7e1b882217a634c2ebe3441b5b6f34a45769b2978ff037`