Python Vietnamese Toolkit
Project description
Pyvi performs word tokenization and part-of-speech (POS) tagging for Vietnamese text in Python.
Algorithm: Conditional Random Field
Vietnamese tokenizer f1_score = 0.978637686
Vietnamese pos tagging f1_score = 0.92520656
POS TAGS:
A - Adjective
C - Coordinating conjunction
E - Preposition
I - Interjection
L - Determiner
M - Numeral
N - Common noun
Nc - Noun Classifier
Ny - Noun abbreviation
Np - Proper noun
Nu - Unit noun
P - Pronoun
R - Adverb
S - Subordinating conjunction
T - Auxiliary, modal words
V - Verb
X - Unknown
F - Filtered out (punctuation)
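The tag list above can be kept in code as a lookup table, which is convenient when turning tagger output into readable labels. The following is only a sketch: `POS_TAG_NAMES` and `describe_tag` are hypothetical names and are not part of pyvi; the descriptions are copied from the list above.

```python
# Hypothetical helper, not part of pyvi: maps each tag code from the
# list above to its description.
POS_TAG_NAMES = {
    "A": "Adjective",
    "C": "Coordinating conjunction",
    "E": "Preposition",
    "I": "Interjection",
    "L": "Determiner",
    "M": "Numeral",
    "N": "Common noun",
    "Nc": "Noun Classifier",
    "Ny": "Noun abbreviation",
    "Np": "Proper noun",
    "Nu": "Unit noun",
    "P": "Pronoun",
    "R": "Adverb",
    "S": "Subordinating conjunction",
    "T": "Auxiliary, modal words",
    "V": "Verb",
    "X": "Unknown",
    "F": "Filtered out (punctuation)",
}


def describe_tag(tag):
    """Return the description for a tag code, or 'Unknown' if it is not listed."""
    return POS_TAG_NAMES.get(tag, "Unknown")


print(describe_tag("Np"))  # Proper noun
```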
Installation
Install from the command line with pip:
$ pip install pyvi
Uninstall
$ pip uninstall pyvi
Usage
from pyvi import ViTokenizer, ViPosTagger
# Segment a sentence into words
ViTokenizer.tokenize(u"Trường đại học bách khoa hà nội")
# Tokenize first, then tag each word with its part of speech
ViPosTagger.postagging(ViTokenizer.tokenize(u"Trường đại học Bách Khoa Hà Nội"))
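The two calls above can be combined into a small post-processing step. This is a minimal sketch under two assumptions that the text above does not state: that `ViTokenizer.tokenize` returns a single string in which the syllables of a multi-syllable word are joined with underscores, and that `ViPosTagger.postagging` returns a pair of parallel lists `(tokens, tags)`. The helpers `split_words` and `pair_tokens_with_tags` are hypothetical names, not part of pyvi.

```python
def split_words(tokenized):
    """Split tokenizer output into words, restoring spaces inside each word.

    Assumes words are separated by spaces and the syllables of a
    multi-syllable word are joined with underscores.
    """
    return [word.replace("_", " ") for word in tokenized.split()]


def pair_tokens_with_tags(tagged):
    """Zip an assumed (tokens, tags) pair into a list of (token, tag) tuples."""
    tokens, tags = tagged
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    return list(zip(tokens, tags))


if __name__ == "__main__":
    try:
        from pyvi import ViTokenizer, ViPosTagger
    except ImportError:
        print("pyvi is not installed; run: pip install pyvi")
    else:
        tokenized = ViTokenizer.tokenize("Trường đại học Bách Khoa Hà Nội")
        print(split_words(tokenized))
        print(pair_tokens_with_tags(ViPosTagger.postagging(tokenized)))
```

Note that `split_words` replaces every underscore, so an underscore that was already present in the input text would also become a space.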
Download files
Source Distribution
pyvi-0.0.9.2.tar.gz (5.2 MB)
File details
Details for the file pyvi-0.0.9.2.tar.gz.
File metadata
- Download URL: pyvi-0.0.9.2.tar.gz
- Upload date:
- Size: 5.2 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.20.1 setuptools/38.5.1 requests-toolbelt/0.8.0 tqdm/4.19.9 CPython/3.6.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | db56ed20f39fdf820bad5a6fa7159620db9eef5d9b1a532de106f744b0d4653e |
| MD5 | 4a87cd4b5aad6952651e37e6c3966749 |
| BLAKE2b-256 | 4991ec00d4034ec22d65dfdfa807918d7ebb9e7e3fe6c26551d1d5df7ac4e410 |