py_vncorenlp: A Python Wrapper for VnCoreNLP
Prerequisites
Installation
To install this Python wrapper for VnCoreNLP, run the following command:
$ pip install py_vncorenlp
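After installation, it can be handy to confirm that the package is visible to the interpreter you intend to use. The following is a minimal sketch using only the standard library; the helper name `is_importable` is a hypothetical name chosen for illustration and is not part of py_vncorenlp.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be located by the current interpreter.

    Hypothetical helper for illustration; not part of py_vncorenlp.
    """
    return importlib.util.find_spec(module_name) is not None


# Prints True only if the pip install above targeted this interpreter.
print(is_importable("py_vncorenlp"))
```

If this prints `False`, the likely cause is that `pip` installed the package into a different Python environment than the one running the script.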
Example usage
import py_vncorenlp
# Automatically download the VnCoreNLP models from the original repository
# and save them in a local folder
py_vncorenlp.download_model(save_dir='./vncorenlp')
# Load VnCoreNLP from the local machine folder containing the VnCoreNLP models
model = py_vncorenlp.VnCoreNLP(annotators=["wseg", "pos", "ner", "parse"], save_dir='./vncorenlp')
# Equivalent to: model = py_vncorenlp.VnCoreNLP(save_dir='./vncorenlp')
# Annotate a raw corpus
model.annotate_file(input_file="path_to_input_file", output_file="path_to_output_file")
# Annotate a raw sentence
model.print_out(model.annotate_text("Ông Nguyễn Khắc Chúc đang làm việc tại Đại học Quốc gia Hà Nội."))
By default, the output has six columns: word index, word form, POS tag, NER label, head index of the current word, and dependency relation type:
1 Ông Nc O 4 sub
2 Nguyễn_Khắc_Chúc Np B-PER 1 nmod
3 đang R O 4 adv
4 làm_việc V O 0 root
5 tại E O 4 loc
6 Đại_học N B-ORG 5 pob
7 Quốc_gia N I-ORG 6 nmod
8 Hà_Nội Np I-ORG 6 nmod
9 . CH O 4 punct
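The six-column layout above is easy to post-process. As a rough sketch, assuming the printed output has been captured as plain text with whitespace-separated columns, the snippet below parses each line into a dictionary and then groups the `B-`/`I-` NER labels into entity spans. The column names in `COLUMNS` and the helpers `parse_rows` and `extract_entities` are hypothetical names chosen for this example; they are not part of the py_vncorenlp API.

```python
from typing import Any, Dict, List, Tuple

# Illustrative column labels for this sketch; not defined by py_vncorenlp.
COLUMNS = ["index", "form", "pos", "ner", "head", "dep"]


def parse_rows(printed: str) -> List[Dict[str, Any]]:
    """Parse whitespace-separated six-column lines into dictionaries."""
    rows = []
    for line in printed.splitlines():
        fields = line.split()
        if len(fields) != len(COLUMNS):
            continue  # skip blank or malformed lines
        row: Dict[str, Any] = dict(zip(COLUMNS, fields))
        row["index"] = int(row["index"])
        row["head"] = int(row["head"])
        rows.append(row)
    return rows


def extract_entities(rows: List[Dict[str, Any]]) -> List[Tuple[str, str]]:
    """Group B-/I- tagged words into (entity_text, entity_type) pairs."""
    entities: List[Tuple[str, str]] = []
    words: List[str] = []
    etype = None
    for row in rows:
        tag = row["ner"]
        if tag.startswith("B-"):
            if words:
                entities.append((" ".join(words), etype))
            words, etype = [row["form"]], tag[2:]
        elif tag.startswith("I-") and words and tag[2:] == etype:
            words.append(row["form"])
        else:
            # An "O" tag (or an I- tag with no matching B-) closes any open entity.
            if words:
                entities.append((" ".join(words), etype))
            words, etype = [], None
    if words:
        entities.append((" ".join(words), etype))
    return entities


# The sample output shown above, copied as plain text.
SAMPLE = """\
1 Ông Nc O 4 sub
2 Nguyễn_Khắc_Chúc Np B-PER 1 nmod
3 đang R O 4 adv
4 làm_việc V O 0 root
5 tại E O 4 loc
6 Đại_học N B-ORG 5 pob
7 Quốc_gia N I-ORG 6 nmod
8 Hà_Nội Np I-ORG 6 nmod
9 . CH O 4 punct
"""

rows = parse_rows(SAMPLE)
entities = extract_entities(rows)
print(entities)
```

On the sample, this yields one `PER` entity (`Nguyễn_Khắc_Chúc`) and one `ORG` entity (`Đại_học Quốc_gia Hà_Nội`). The word whose head index is 0 (`làm_việc`) is the root of the dependency tree.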
For users who need VnCoreNLP only for word segmentation:
rdrsegmenter = py_vncorenlp.VnCoreNLP(annotators=["wseg"], save_dir='./vncorenlp')
sentence = "Ông Nguyễn Khắc Chúc đang làm việc tại Đại học Quốc gia Hà Nội."
output = rdrsegmenter.word_segment(sentence)
print(output)
# "Ông Nguyễn_Khắc_Chúc đang làm_việc tại Đại_học Quốc_gia Hà_Nội ."
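In the segmented output, the syllables of a multi-syllable word are joined with underscores and words are separated by spaces. A small sketch of turning such a string into a list of words follows, assuming the segmented sentence is available as a single string as in the comment above; `split_segmented` is a hypothetical helper name, not a py_vncorenlp function.

```python
from typing import List


def split_segmented(sentence: str) -> List[str]:
    """Split a segmented sentence into words, restoring spaces inside each word.

    Hypothetical helper for illustration. Note that an underscore that was
    already present in the raw input text would also be turned into a space.
    """
    return [token.replace("_", " ") for token in sentence.split()]


segmented = "Ông Nguyễn_Khắc_Chúc đang làm_việc tại Đại_học Quốc_gia Hà_Nội ."
words = split_segmented(segmented)
print(words)
```

Keeping the underscores is usually preferable when the text is passed on to models that expect word-segmented input; restoring spaces is mainly useful for display.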