A Python Wrapper for VnCoreNLP

Table of contents

  1. Prerequisites
  2. Installation
  3. Example usage

py_vncorenlp: A Python Wrapper for VnCoreNLP

Prerequisites

Installation

  • To install this Python wrapper for VnCoreNLP, run the following command:

    $ pip install py_vncorenlp

Example usage

import py_vncorenlp

# Automatically download the VnCoreNLP models from the original repository
# and save them in a local folder
py_vncorenlp.download_model(save_dir='./vncorenlp')

# Load VnCoreNLP from the local folder containing the VnCoreNLP models
model = py_vncorenlp.VnCoreNLP(annotators=["wseg", "pos", "ner", "parse"], save_dir='./vncorenlp')
# Equivalent to: model = py_vncorenlp.VnCoreNLP(save_dir='./vncorenlp')

# Annotate a raw corpus
model.annotate_file(input_file="path_to_input_file", output_file="path_to_output_file")

# Annotate a raw sentence
model.print_out(model.annotate_text("Ông Nguyễn Khắc Chúc  đang làm việc tại Đại học Quốc gia Hà Nội."))

By default, the output has six columns: word index, word form, POS tag, NER label, head index of the current word, and dependency relation type. A head index of 0 marks the root of the sentence:

1       Ông     Nc      O       4       sub
2       Nguyễn_Khắc_Chúc        Np      B-PER   1       nmod
3       đang    R       O       4       adv
4       làm_việc        V       O       0       root
5       tại     E       O       4       loc
6       Đại_học N       B-ORG   5       pob
7       Quốc_gia        N       I-ORG   6       nmod
8       Hà_Nội  Np      I-ORG   6       nmod
9       .       CH      O       4       punct
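For downstream processing, the six-column text shown above can be turned into structured records. The following is a minimal sketch that parses the printed output only; it does not call py_vncorenlp, and the names `Token` and `parse_rows` are hypothetical helpers introduced here for illustration. It assumes the columns are separated by whitespace and that word forms contain no spaces (multi-syllable words are joined with underscores, as in the sample).

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    """One row of the six-column output (hypothetical helper type)."""
    index: int    # 1-based position of the word in the sentence
    form: str     # word form, syllables joined with underscores
    pos: str      # POS tag
    ner: str      # NER label (O, B-PER, I-ORG, ...)
    head: int     # index of the head word; 0 marks the root
    deprel: str   # dependency relation type


def parse_rows(text: str) -> List[Token]:
    """Parse whitespace-separated six-column rows, skipping other lines."""
    tokens = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6:
            continue  # blank or malformed line
        idx, form, pos, ner, head, deprel = fields
        tokens.append(Token(int(idx), form, pos, ner, int(head), deprel))
    return tokens


# The sample output from this page, copied as plain text.
SAMPLE = """\
1       Ông     Nc      O       4       sub
2       Nguyễn_Khắc_Chúc        Np      B-PER   1       nmod
3       đang    R       O       4       adv
4       làm_việc        V       O       0       root
5       tại     E       O       4       loc
6       Đại_học N       B-ORG   5       pob
7       Quốc_gia        N       I-ORG   6       nmod
8       Hà_Nội  Np      I-ORG   6       nmod
9       .       CH      O       4       punct
"""

tokens = parse_rows(SAMPLE)
root = next(t for t in tokens if t.head == 0)          # the root word
entities = [t.form for t in tokens if t.ner != "O"]    # words inside named entities
print(root.form)
print(entities)
```

Keeping the parser independent of the wrapper means it works equally well on files written by `annotate_file`, provided they use the same six-column layout.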

For users who need VnCoreNLP only for word segmentation:

rdrsegmenter = py_vncorenlp.VnCoreNLP(annotators=["wseg"], save_dir='./vncorenlp')
sentence = "Ông Nguyễn Khắc Chúc  đang làm việc tại Đại học Quốc gia Hà Nội."
output = rdrsegmenter.word_segment(sentence)
print(output)
# "Ông Nguyễn_Khắc_Chúc đang làm_việc tại Đại_học Quốc_gia Hà_Nội ."
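In the segmented output, spaces separate words and underscores join the syllables of a multi-syllable word. The sketch below splits such a string into a list of words; `split_segmented` is a hypothetical helper, not part of py_vncorenlp. It assumes the segmented sentence is a single string as in the comment above; if your installed version returns a list of sentences instead, apply the helper to each element.

```python
from typing import List


def split_segmented(sentence: str) -> List[str]:
    """Split a word-segmented sentence into words, restoring spaces between syllables."""
    return [word.replace("_", " ") for word in sentence.split()]


# Segmented sentence copied from the example output above.
segmented = "Ông Nguyễn_Khắc_Chúc đang làm_việc tại Đại_học Quốc_gia Hà_Nội ."
words = split_segmented(segmented)
print(words)
```

Note that replacing underscores is lossy if the raw text itself contains underscores, so keep the segmented form when you need to feed it to another tool that expects it.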
