
Thai Natural Language Processing library



PyThaiNLP 2.0.3

PyThaiNLP is a Python library for natural language processing (NLP) of the Thai language.

PyThaiNLP includes Thai word tokenizers, transliterators, soundex converters, part-of-speech taggers, and spell checkers.

📫 Follow us on Facebook: PyThaiNLP

What's new in version 2.0?

  • New NorvigSpellChecker spell checker class, which can be initialized with a custom dictionary.
  • Dropped Python 2 support and removed all Python 2 compatibility code.
  • Removed old, obsolete, deprecated, and experimental code.
  • Thai2fit (upgraded ULMFiT-related code to fastai 1.0)
  • ThaiNER 1.0
  • Removed sentiment analysis
  • Improved word_tokenize (newmm, mm) and dict_word_tokenize
  • Improved POS-tagging
  • See examples in Get Started notebook
  • Full change log
  • Upgrading from 1.7
  • Upgrade ThaiNER from 1.7
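
To illustrate what a spell checker "initialized with a custom dictionary" means, here is a minimal, pure-Python sketch of the Norvig approach: generate every string one edit away from the input and pick the known word with the highest frequency. The class name TinySpellChecker, its methods, and the sample word frequencies are hypothetical and written only for this example; this is not PyThaiNLP's NorvigSpellChecker code, whose behavior and parameters may differ.

```python
from collections import Counter


class TinySpellChecker:
    """Hypothetical, simplified Norvig-style checker (not PyThaiNLP's class)."""

    def __init__(self, custom_dict):
        # custom_dict: iterable of (word, frequency) pairs supplied by the caller.
        self.freq = Counter(dict(custom_dict))
        # Assumption: only characters seen in the dictionary are tried as edits.
        self.alphabet = sorted({ch for word in self.freq for ch in word})

    def _edits1(self, word):
        """Return all strings exactly one edit away from `word`."""
        splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
        deletes = [a + b[1:] for a, b in splits if b]
        transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
        replaces = [a + c + b[1:] for a, b in splits if b for c in self.alphabet]
        inserts = [a + c + b for a, b in splits for c in self.alphabet]
        return set(deletes + transposes + replaces + inserts)

    def correct(self, word):
        """Return the most frequent known word within one edit, else `word`."""
        if word in self.freq:
            return word
        known = sorted(w for w in self._edits1(word) if w in self.freq)
        if not known:
            return word
        return max(known, key=lambda w: self.freq[w])


# Made-up frequencies, for illustration only.
checker = TinySpellChecker([("แมว", 10), ("หมา", 8), ("ปลา", 5)])
suggestion = checker.correct("แมวว")  # one deletion away from "แมว"
```

Swapping in a different (word, frequency) list changes which corrections are possible, which is the practical benefit of a custom dictionary.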

Install

For the stable version:

pip install pythainlp

Some advanced features, such as word vectors, need extra packages. Install them by passing these options to pip install:

pip install pythainlp[extra1,extra2,...]

where each extra can be one of:

  • artagger (to support artagger part-of-speech tagger)*
  • deepcut (to support deepcut machine-learnt tokenizer)
  • icu (for ICU support in transliteration and tokenization)
  • ipa (for International Phonetic Alphabet support in transliteration)
  • ml (to support fastai 1.0.22 ULMFiT models)
  • ner (for named-entity recognizer)
  • thai2fit (for Thai word vector)
  • thai2rom (for machine-learnt romanization)
  • full (install everything)
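
As a small sketch of how the extras above combine into one install command, the helper below validates the requested names against the list in this README and builds the command string. The function name pip_install_command and the constant KNOWN_EXTRAS are hypothetical, introduced only for this example; they are not part of PyThaiNLP or pip.

```python
# Extras as listed in this README.
KNOWN_EXTRAS = (
    "artagger", "deepcut", "icu", "ipa", "ml",
    "ner", "thai2fit", "thai2rom", "full",
)


def pip_install_command(extras=()):
    """Build a pip command for pythainlp with the given extras (hypothetical helper)."""
    unknown = [name for name in extras if name not in KNOWN_EXTRAS]
    if unknown:
        raise ValueError("unknown extras: " + ", ".join(unknown))
    if not extras:
        return "pip install pythainlp"
    # The requirement is quoted because some shells treat [ ] as glob patterns.
    return 'pip install "pythainlp[{}]"'.format(",".join(extras))


command = pip_install_command(["icu", "ner"])
```

Quoting the requirement is a defensive choice: shells such as zsh try to expand square brackets as filename patterns unless the argument is quoted.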

Note for Windows: marisa-trie wheels can be obtained from https://www.lfd.uci.edu/~gohlke/pythonlibs/#marisa-trie. Download the wheel that matches your Python version and architecture, then install it with pip, for example: pip install marisa_trie-0.7.5-cp36-cp36m-win32.whl
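
Picking the right wheel means reading the tags in its filename. The sketch below splits a wheel filename into the components defined by the wheel naming convention (name, version, optional build tag, Python tag, ABI tag, platform tag) and checks the Python tag against the running interpreter. The function names are hypothetical, and the check assumes a CPython interpreter; it does not verify the ABI or platform tag.

```python
import sys


def parse_wheel_filename(filename):
    """Split a wheel filename into its name, version, and tag components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: " + filename)
    parts = filename[: -len(".whl")].split("-")
    # Five parts normally, six when an optional build tag is present.
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel filename: " + filename)
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def matches_running_python(filename):
    """Return True if the wheel's Python tag names this CPython version."""
    wanted = "cp{}{}".format(sys.version_info.major, sys.version_info.minor)
    # A Python tag may list several versions separated by dots, e.g. "py2.py3".
    return wanted in parse_wheel_filename(filename)["python"].split(".")


info = parse_wheel_filename("marisa_trie-0.7.5-cp36-cp36m-win32.whl")
```

For the example filename, the tags say the wheel targets CPython 3.6 on 32-bit Windows, so it would only suit an interpreter matching that combination.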


