

Under The Sea - Vietnamese NLP Toolkit


Features

1. Corpus

A collection of Vietnamese corpora (18k documents, 74k words).

2. Word Segmentation

Vietnamese word segmentation using conditional random fields (F1: 97%).
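
Word segmentation with a sequence model such as a CRF is commonly framed as tagging each syllable, then joining tagged syllables back into words. The sketch below illustrates only that framing; it does not train or run a CRF, and it is not the underthesea API. The B/I tag scheme, the function names words_to_tags and tags_to_words, and the example segmentation are assumptions made for illustration.

```python
# Illustrative only: a B/I tagging scheme for syllable-level word segmentation.
# "B" marks the first syllable of a word, "I" marks a syllable that continues it.
# The tag names and helper functions are hypothetical, not taken from underthesea.


def words_to_tags(words):
    """Convert segmented words into (syllable, tag) pairs, e.g. as training labels."""
    pairs = []
    for word in words:
        for i, syllable in enumerate(word.split()):
            pairs.append((syllable, "B" if i == 0 else "I"))
    return pairs


def tags_to_words(pairs):
    """Join (syllable, tag) pairs, e.g. a tagger's predictions, back into words."""
    words = []
    for syllable, tag in pairs:
        if tag == "B" or not words:
            # Start a new word (also covers a stray leading "I" tag).
            words.append(syllable)
        else:
            words[-1] += " " + syllable
    return words


# An assumed gold segmentation of the sentence "Học sinh học sinh học".
gold = ["Học sinh", "học", "sinh học"]
pairs = words_to_tags(gold)
restored = tags_to_words(pairs)
```

In this framing, a tagger would predict the B/I tag of each syllable from features of its neighbours, and tags_to_words would turn those predictions into the final segmentation.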

Upcoming Features

History

1.0.18 (2017-05-24)

  • Fix word_sent method

  • Enhance performance

  • Add word_sent package

1.0.9 (2017-03-07)

  • Add Corpus class

  • Add Transformer classes

  • Integrate with the dictionary of Ho Ngoc Duc

  • Add Travis CI

  • Auto build with PyPI

1.0.0 (2017-03-01)

  • First release on PyPI.

  • First release on Read the Docs.


Download files

Source Distribution

underthesea-1.0.18.tar.gz (3.2 MB)

Built Distribution

underthesea-1.0.18-py2.py3-none-any.whl (2.9 MB, Python 2 and Python 3)
