Python Vietnamese Toolkit

Functionality

  • Tokenize

  • POS tag

  • Remove accents

Algorithm: Conditional Random Field (CRF)

Vietnamese tokenizer F1 score: 0.978637686

Vietnamese POS tagging F1 score: 0.92520656
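
The source does not say how these F1 scores were computed. As a rough illustration only, the sketch below shows one common way to score word segmentation: compare the word spans of a predicted segmentation against a gold one and take the harmonic mean of precision and recall. The helper names `word_spans` and `segmentation_f1` are hypothetical, and the convention that syllables of one word are joined by "_" is an assumption about the tokenizer's output format.

```python
def word_spans(tokenized):
    """Return the set of (start, end) syllable spans, one per word.

    Assumption: words are separated by spaces and the syllables of a
    multi-syllable word are joined by "_".
    """
    spans = set()
    position = 0
    for word in tokenized.split():
        length = len(word.split("_"))
        spans.add((position, position + length))
        position += length
    return spans


def segmentation_f1(gold, predicted):
    """Span-level F1 between a gold and a predicted segmentation."""
    gold_spans = word_spans(gold)
    predicted_spans = word_spans(predicted)
    correct = len(gold_spans & predicted_spans)
    if correct == 0:
        return 0.0
    precision = correct / len(predicted_spans)
    recall = correct / len(gold_spans)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: the prediction splits "bách_khoa" into two words.
gold = "Trường đại_học bách_khoa hà_nội"
predicted = "Trường đại_học bách khoa hà_nội"
score = segmentation_f1(gold, predicted)
```

Here 3 of the 5 predicted words match 3 of the 4 gold words, so precision is 0.6, recall is 0.75, and the F1 score is 2/3.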

POS TAGS:

  • A - Adjective

  • C - Coordinating conjunction

  • E - Preposition

  • I - Interjection

  • L - Determiner

  • M - Numeral

  • N - Common noun

  • Nc - Noun classifier

  • Ny - Noun abbreviation

  • Np - Proper noun

  • Nu - Unit noun

  • P - Pronoun

  • R - Adverb

  • S - Subordinating conjunction

  • T - Auxiliary, modal words

  • V - Verb

  • X - Unknown

  • F - Filtered out (punctuation)
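
For readers who want human-readable labels in their own code, the tag list above can be written as a lookup table. This is a minimal sketch: the names `POS_TAGS` and `describe_tags` are hypothetical helpers for illustration and are not part of pyvi.

```python
# Tag descriptions copied from the list above.
POS_TAGS = {
    "A": "Adjective",
    "C": "Coordinating conjunction",
    "E": "Preposition",
    "I": "Interjection",
    "L": "Determiner",
    "M": "Numeral",
    "N": "Common noun",
    "Nc": "Noun classifier",
    "Ny": "Noun abbreviation",
    "Np": "Proper noun",
    "Nu": "Unit noun",
    "P": "Pronoun",
    "R": "Adverb",
    "S": "Subordinating conjunction",
    "T": "Auxiliary, modal words",
    "V": "Verb",
    "X": "Unknown",
    "F": "Filtered out (punctuation)",
}


def describe_tags(tags):
    """Map each tag to its description; unlisted tags get a placeholder."""
    return [POS_TAGS.get(tag, "Unrecognized tag") for tag in tags]
```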

Installation

Install from the command line with pip:

$ pip install pyvi

Uninstall

$ pip uninstall pyvi

Usage

from pyvi import ViTokenizer, ViPosTagger

# Segment a sentence into words
ViTokenizer.tokenize(u"Trường đại học bách khoa hà nội")

# Tokenize first, then tag each word with its part of speech
ViPosTagger.postagging(ViTokenizer.tokenize(u"Trường đại học Bách Khoa Hà Nội"))
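
The README does not show what these calls return. The sketch below assumes that the tokenizer joins the syllables of a word with "_" and that the tagger returns two parallel lists, one of words and one of tags. Both points are assumptions to check against your installed version. The sample data is illustrative, not real library output, and `pair_words_with_tags` is a hypothetical helper.

```python
def pair_words_with_tags(words, tags):
    """Zip parallel word and tag lists into (word, tag) pairs.

    Underscores are replaced by spaces so multi-syllable words read
    naturally. Raises ValueError if the two lists differ in length.
    """
    if len(words) != len(tags):
        raise ValueError("words and tags must have the same length")
    return [(word.replace("_", " "), tag) for word, tag in zip(words, tags)]


# Illustrative data in the assumed output shape (not actual pyvi output).
sample_words = ["Trường", "đại_học", "Bách_Khoa", "Hà_Nội"]
sample_tags = ["N", "N", "Np", "Np"]

pairs = pair_words_with_tags(sample_words, sample_tags)
```

With real data, the two lists would come from unpacking the result of `ViPosTagger.postagging(...)`, provided it has the shape assumed here.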

from pyvi import ViUtils
# Strip Vietnamese diacritics from the text
ViUtils.remove_accents(u"Trường đại học bách khoa hà nội")
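
To make clear what accent removal means for Vietnamese text, here is a standard-library sketch of the general technique. It is not pyvi's implementation, and `strip_accents` is a hypothetical name. It decomposes each character with Unicode NFD, drops the combining marks, and maps "đ"/"Đ" by hand, since those letters have no Unicode decomposition.

```python
import unicodedata


def strip_accents(text):
    """Remove Vietnamese diacritics using Unicode decomposition."""
    # NFD splits a letter such as "ờ" into "o" plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Category "Mn" (nonspacing mark) covers the tone and vowel marks.
    without_marks = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    # "đ" and "Đ" do not decompose, so they are replaced explicitly.
    return without_marks.replace("đ", "d").replace("Đ", "D")


plain = strip_accents("Trường đại học bách khoa hà nội")
# plain == "Truong dai hoc bach khoa ha noi"
```

The return type of `ViUtils.remove_accents` itself may differ (for example, bytes rather than str), so check it before comparing its output with this sketch.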


Built Distribution

pyvi-0.0.9.6-py2.py3-none-any.whl (5.3 MB, Python 2 and Python 3)

