Skip to main content

Python functions for working with the Thai language

Project description

PyThai
======

Some basic python functions for working with the Thai language. For example:

```python
import pythai

pythai.split(u"การที่ได้ต้องแสดงว่างานดี")
>>> u"การ ที่ ได้ ต้อง แสดง ว่า งาน ดี"

pythai.word_count(u"การที่ได้ต้องแสดงว่างานดี")
>>> 8

pythai.contains_thai(u"hello")
>>> False

pythai.contains_thai(u"helloการที่ไ")
>>> True
```

It's meant to be fast and efficient enough to handle large documents without breaking a sweat.

Includes
------------

Currently the library supports these functions:

- Word segmentation (`split`)
- Word count (`word_count`) (faster than counting the result of `split`)
- Whether a string contains Thai or not (`contains_thai`)


Installation
------------

PyThai equires `thailib` to work. You can install it quite easily:

sudo apt-get install thailib

And then you can simply install `pythai` through **pip**:

pip install pythai

More
------------

Special thanks to Vee Satayamas for the original python bindings of libthai from C.

This library was written for use in [Gengo](http://www.gengo.com). It's free and open-source under the GNU lesser public license. Any contributions are welcome!

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Files for pythai, version 0.1.3
Filename, size File type Python version Upload date Hashes
Filename, size pythai-0.1.3.tar.gz (13.6 kB) File type Source Python version None Upload date Hashes View

Supported by

AWS AWS Cloud computing Datadog Datadog Monitoring Facebook / Instagram Facebook / Instagram PSF Sponsor Fastly Fastly CDN Google Google Object Storage and Download Analytics Huawei Huawei PSF Sponsor Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Salesforce Salesforce PSF Sponsor Sentry Sentry Error logging StatusPage StatusPage Status page