Python Vietnamese Toolkit
Project description
This toolkit performs word tokenization and part-of-speech (POS) tagging of Vietnamese text in Python.
Algorithm: Conditional Random Field
Vietnamese tokenizer: f1_score = 0.978637686
Vietnamese POS tagging: f1_score = 0.92520656
POS TAGS:
A - Adjective
C - Coordinating conjunction
E - Preposition
I - Interjection
L - Determiner
M - Numeral
N - Common noun
Nc - Noun Classifier
Ny - Noun abbreviation
Np - Proper noun
Nu - Unit noun
P - Pronoun
R - Adverb
S - Subordinating conjunction
T - Auxiliary, modal words
V - Verb
X - Unknown
F - Filtered out (punctuation)
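The tag list above can be kept as a lookup table when post-processing tagger output. The sketch below is only an illustration: `POS_TAGS` and `describe_tag` are hypothetical names that are not part of pyvi, and falling back to "Unknown" for unlisted tags is an assumption.

```python
# Tag descriptions copied from the POS TAGS list above.
# POS_TAGS and describe_tag are illustrative names, not part of pyvi.
POS_TAGS = {
    "A": "Adjective",
    "C": "Coordinating conjunction",
    "E": "Preposition",
    "I": "Interjection",
    "L": "Determiner",
    "M": "Numeral",
    "N": "Common noun",
    "Nc": "Noun Classifier",
    "Ny": "Noun abbreviation",
    "Np": "Proper noun",
    "Nu": "Unit noun",
    "P": "Pronoun",
    "R": "Adverb",
    "S": "Subordinating conjunction",
    "T": "Auxiliary, modal words",
    "V": "Verb",
    "X": "Unknown",
    "F": "Filtered out (punctuation)",
}


def describe_tag(tag):
    """Return the description of a tag; unlisted tags fall back to 'Unknown'."""
    return POS_TAGS.get(tag, POS_TAGS["X"])


print(describe_tag("Np"))  # Proper noun
```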
Installation
At the command line with pip
$ pip install pyvi
Uninstall
$ pip uninstall pyvi
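To confirm from Python whether the package is currently installed, one option is the standard library's `importlib.metadata` (Python 3.8 or later). This is a minimal sketch; `installed_version` is a hypothetical helper name, not something pyvi provides.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the installed pyvi version, or None when pyvi is not installed.
print(installed_version("pyvi"))
```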
Usage
from pyvi.pyvi import ViTokenizer, ViPosTagger
# Segment a sentence into words
ViTokenizer.tokenize(u"Trường đại học bách khoa hà nội")
# Tokenize first, then tag each word with its part of speech
ViPosTagger.postagging(ViTokenizer.tokenize(u"Trường đại học Bách Khoa Hà Nội"))
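The calls above return raw results that usually need a little post-processing. The sketch below rests on two assumptions this page does not state: that the tokenizer returns one string in which the syllables of a multi-syllable word are joined with underscores, and that the tagger returns a sequence of words plus a parallel sequence of tags. `split_tokens` and `pair_words_with_tags` are hypothetical helpers, and the sample inputs are hand-written stand-ins, not real pyvi output.

```python
def split_tokens(tokenized):
    """Split a tokenized string into words, restoring spaces inside each word.

    Assumes multi-syllable words are joined with underscores.
    """
    return [token.replace("_", " ") for token in tokenized.split()]


def pair_words_with_tags(words, tags):
    """Zip parallel word and tag sequences into (word, tag) pairs."""
    if len(words) != len(tags):
        raise ValueError("words and tags must have the same length")
    return list(zip(words, tags))


# With pyvi installed, the inputs would come from the calls shown above, e.g.:
#   tokenized = ViTokenizer.tokenize(u"Trường đại học Bách Khoa Hà Nội")
#   words, tags = ViPosTagger.postagging(tokenized)  # assumed return shape

# Hand-written stand-ins (illustrative only, not actual tokenizer/tagger output).
sample_tokenized = "Trường đại_học Bách_Khoa Hà_Nội"
sample_tags = ["N", "N", "Np", "Np"]

words = split_tokens(sample_tokenized)
for word, tag in pair_words_with_tags(words, sample_tags):
    print(word, tag)
```

Keeping the length check in `pair_words_with_tags` makes a mismatch between the two sequences fail loudly instead of silently dropping trailing items, which is what a bare `zip` would do.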
Download files
Source Distribution
pyvi-0.0.7.5.tar.gz (3.0 MB)
Built Distribution
Hashes for pyvi-0.0.7.5-py2.py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | e6a3d93cc4222f2852c9fb2c76cf272dadd5743cee1f3cc93d0b4aef8562e850
MD5 | 8aa666c6f96fbecdc02a39f061bd3d74
BLAKE2b-256 | 001deeddaf4b564dda3fe7ce4061cd850e1acf0e92926ab9e7de2d99ae528173
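A downloaded file can be checked against the SHA256 digest listed above with the standard library's `hashlib`. This is a minimal sketch: `sha256_of_file` and `matches_sha256` are hypothetical helper names, and the local file path in the usage comment is an assumption about where the wheel was saved.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path, expected_hex):
    """Compare a file's SHA256 digest with an expected hex string (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Usage (the path is an assumed download location):
#   expected = "e6a3d93cc4222f2852c9fb2c76cf272dadd5743cee1f3cc93d0b4aef8562e850"
#   matches_sha256("pyvi-0.0.7.5-py2.py3-none-any.whl", expected)
```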