Thai Natural Language Processing in Python.
Project description
PyThaiNLP 1.7
PyThaiNLP is a Python library for natural language processing (NLP) of the Thai language.
PyThaiNLP features include Thai word and subword segmentation, soundex, romanization, part-of-speech tagging, and spelling correction.
What's new in version 1.7?
- Deprecate Python 2 support
- Refactor pythainlp.tokenize.pyicu for readability
- Add Thai NER model to pythainlp.ner
- thai2vec v0.2 - larger vocab, benchmarking results on Wongnai dataset
- Sentiment classifier based on ULMFit and various product review datasets
- Add ULMFit utility to PyThaiNLP
- Add Thai romanization model thai2rom
- Retrain POS-tagging model
- Improve word_tokenize (newmm, mm) and dict_word_tokenize
- Add documentation
Install
pip install pythainlp
Note for Windows: marisa-trie wheels can be obtained from https://www.lfd.uci.edu/~gohlke/pythonlibs/#marisa-trie
Install it with pip, for example: pip install marisa_trie-0.7.5-cp36-cp36m-win32.whl
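Once the package is installed, word segmentation can be tried with a few lines of Python. The sketch below assumes the pythainlp.tokenize.word_tokenize function and the newmm engine named in the release notes above; segment_thai is a hypothetical wrapper introduced only for illustration, and the import is deferred so that a missing installation surfaces as an ImportError at call time.

```python
def segment_thai(text, engine="newmm"):
    """Split Thai text into a list of word tokens.

    Assumption: pythainlp is installed and exposes
    pythainlp.tokenize.word_tokenize with an `engine` keyword.
    """
    # Deferred import: raises ImportError here if pythainlp is missing.
    from pythainlp.tokenize import word_tokenize

    return word_tokenize(text, engine=engine)


def show_tokens(text):
    """Print the tokens separated by '|' so word boundaries are visible."""
    print("|".join(segment_thai(text)))
```

Calling show_tokens("ภาษาไทย") prints the segmented form of the input; the exact boundaries depend on the dictionary shipped with the installed version.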
Download files
Source Distributions
No source distribution files are available for this release.
Built Distribution
pythainlp-1.7.4-py3-none-any.whl (10.3 MB)
File details
Details for the file pythainlp-1.7.4-py3-none-any.whl.
File metadata
- Download URL: pythainlp-1.7.4-py3-none-any.whl
- Upload date:
- Size: 10.3 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.19.1 setuptools/39.1.0 requests-toolbelt/0.8.0 tqdm/4.26.0 CPython/3.6.7
File hashes
Algorithm | Hash digest
---|---
SHA256 | 4f1bcfc44ed76eb1edf033772f2273aeda51b2129120398bdef29fc428359770
MD5 | 6e032f121f25caec9112bf7a5d75ec00
BLAKE2b-256 | 3201c48fd12dfe7f3ccf8795c657aa8c7961784c58f71c3b7e4f895723fd88b9
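A downloaded wheel can be checked against the SHA256 digest listed above before it is installed. The following is a minimal sketch using only the standard library; the helper names sha256_of_file and matches_published_hash are hypothetical, and the wheel is assumed to have been saved locally under its published file name.

```python
import hashlib

# SHA256 digest published for pythainlp-1.7.4-py3-none-any.whl.
EXPECTED_SHA256 = "4f1bcfc44ed76eb1edf033772f2273aeda51b2129120398bdef29fc428359770"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so a large wheel is never held fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """Hypothetical helper: True if the file's digest equals `expected`."""
    return sha256_of_file(path) == expected.lower()
```

For example, matches_published_hash("pythainlp-1.7.4-py3-none-any.whl") returns True only if the local file is byte-for-byte identical to the published one.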