Skip to main content

Python functions for working with the Thai language

Project description

PyThai
======

Some basic python functions for working with the Thai language. For example:

```python
import pythai

pythai.split(u"การที่ได้ต้องแสดงว่างานดี")
>>> u"การ ที่ ได้ ต้อง แสดง ว่า งาน ดี"

pythai.word_count(u"การที่ได้ต้องแสดงว่างานดี")
>>> 8

pythai.contains_thai(u"hello")
>>> False

pythai.contains_thai(u"helloการที่ไ")
>>> True
```

It's meant to be fast and efficient enough to handle large documents without breaking a sweat.

Includes
------------

Currently the library supports these functions:

- Word segmentation (`split`)
- Word count (`word_count`) (faster than counting the result of `split`)
- Whether a string contains Thai or not (`contains_thai`)


Installation
------------

PyThai equires `thailib` to work. You can install it quite easily:

sudo apt-get install thailib

And then you can simply install `pythai` through **pip**:

pip install pythai

More
------------

Special thanks to Vee Satayamas for the original python bindings of libthai from C.

This library was written for use in [Gengo](http://www.gengo.com). It's free and open-source under the GNU lesser public license. Any contributions are welcome!

Project details


Release history Release notifications

This version
History Node

0.1.3

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Filename, size & hash SHA256 hash help File type Python version Upload date
pythai-0.1.3.tar.gz (13.6 kB) Copy SHA256 hash SHA256 Source None Jun 1, 2013

Supported by

Elastic Elastic Search Pingdom Pingdom Monitoring Google Google BigQuery Sentry Sentry Error logging AWS AWS Cloud computing DataDog DataDog Monitoring Fastly Fastly CDN SignalFx SignalFx Supporter DigiCert DigiCert EV certificate StatusPage StatusPage Status page