Python Vietnamese Toolkit
Project description
This toolkit provides Vietnamese word tokenization and part-of-speech (POS) tagging in Python.
Algorithm: Conditional Random Field (CRF)
Vietnamese tokenizer f1_score = 0.978637686
Vietnamese POS tagging f1_score = 0.92520656
POS TAGS:
A - Adjective
C - Coordinating conjunction
E - Preposition
I - Interjection
L - Determiner
M - Numeral
N - Common noun
Nc - Noun classifier
Ny - Noun abbreviation
Np - Proper noun
Nu - Unit noun
P - Pronoun
R - Adverb
S - Subordinating conjunction
T - Auxiliary, modal words
V - Verb
X - Unknown
F - Filtered out (punctuation)
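When working with tagger output, it can help to turn the short tag codes back into readable names. The following is a minimal sketch of such a lookup. The tag codes and names come from the list above, while `POS_TAG_NAMES` and `describe_tag` are hypothetical helper names that are not part of pyvi.

```python
# Tag inventory copied from the POS TAGS list above.
# POS_TAG_NAMES and describe_tag are illustrative names, not part of pyvi.
POS_TAG_NAMES = {
    "A": "Adjective",
    "C": "Coordinating conjunction",
    "E": "Preposition",
    "I": "Interjection",
    "L": "Determiner",
    "M": "Numeral",
    "N": "Common noun",
    "Nc": "Noun classifier",
    "Ny": "Noun abbreviation",
    "Np": "Proper noun",
    "Nu": "Unit noun",
    "P": "Pronoun",
    "R": "Adverb",
    "S": "Subordinating conjunction",
    "T": "Auxiliary, modal words",
    "V": "Verb",
    "X": "Unknown",
    "F": "Filtered out (punctuation)",
}


def describe_tag(tag):
    """Return the readable name for a tag code.

    Codes that are not in the list fall back to the name of the X tag.
    """
    return POS_TAG_NAMES.get(tag, POS_TAG_NAMES["X"])
```

For example, `describe_tag("Np")` looks up the proper-noun entry, and an unlisted code falls back to the name of the `X` tag.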
Installation
Install from the command line with pip:
$ pip install pyvi
Uninstall
$ pip uninstall pyvi
Usage
from pyvi.pyvi import ViTokenizer, ViPosTagger
ViTokenizer.tokenize(u"Trường đại học bách khoa hà nội")
ViPosTagger.postagging(ViTokenizer.tokenize(u"Trường đại học Bách Khoa Hà Nội"))
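The calls above return raw results that usually need a little post-processing. The sketch below shows one possible way to do that, under two assumptions that this page does not state: that `ViTokenizer.tokenize` returns a single string in which the syllables of a multi-syllable word are joined by underscores, and that `ViPosTagger.postagging` returns a pair of parallel lists `(tokens, tags)`. The helpers `split_tokens` and `pair_tokens_with_tags` are hypothetical names, not part of pyvi, and the sample string is hand-written rather than real tokenizer output.

```python
def split_tokens(tokenized):
    """Split a tokenized string into words.

    Assumes words are separated by whitespace and that the syllables of a
    multi-syllable word are joined by '_' (an assumption about pyvi output).
    """
    return [word.replace("_", " ") for word in tokenized.split()]


def pair_tokens_with_tags(tokens, tags):
    """Zip parallel token and tag lists into (token, tag) pairs."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    return list(zip(tokens, tags))


# Hand-written sample in the assumed output shape, NOT real pyvi output.
sample_tokenized = "Trường đại_học Bách_Khoa Hà_Nội"
words = split_tokens(sample_tokenized)

# With pyvi installed, the helpers would be used roughly like this:
#   tokens, tags = ViPosTagger.postagging(ViTokenizer.tokenize(text))
#   pairs = pair_tokens_with_tags(tokens, tags)
```

Keeping the helpers independent of pyvi means they can be checked without the library installed. If the actual return types differ from the assumptions above, only these two small functions need to change.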
Download files
Source Distribution
File details
Details for the file pyvi-0.0.8.2.tar.gz.
File metadata
- Download URL: pyvi-0.0.8.2.tar.gz
- Upload date:
- Size: 5.2 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | c11637f9b87b243c7e550715efd9ce94683a2dbfcccf235fac9f5b6c0f51b2dc |
MD5 | 7bb34121be667a3dd73096e812bf17bb |
BLAKE2b-256 | b1b389f12862018e1c7e8aa50ff055021d3a418c37cfef4aa2f9078225ad09e3 |
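The digests above can be used to check that a downloaded archive is intact. The following is a minimal sketch that uses only the standard-library `hashlib` module. The expected value is the SHA256 digest listed above, while `sha256_of_file` and `matches_expected` are hypothetical helper names, and the file path is whatever location the archive was saved to.

```python
import hashlib

# SHA256 digest listed above for pyvi-0.0.8.2.tar.gz.
EXPECTED_SHA256 = "c11637f9b87b243c7e550715efd9ce94683a2dbfcccf235fac9f5b6c0f51b2dc"


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True when the file's SHA256 digest equals the expected one."""
    return sha256_of_file(path) == expected.lower()
```

Calling `matches_expected("pyvi-0.0.8.2.tar.gz")` on the downloaded file should return `True` only when its contents match the published digest. Reading in chunks keeps memory use small for a multi-megabyte archive.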