
Python functions for working with the Thai language

PyThai
======

Some basic Python functions for working with the Thai language. For example:

```python
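import pythai

# Segment a Thai string into space-separated words
pythai.split(u"การที่ได้ต้องแสดงว่างานดี")
# => u"การ ที่ ได้ ต้อง แสดง ว่า งาน ดี"

# Count the words in a Thai string
pythai.word_count(u"การที่ได้ต้องแสดงว่างานดี")
# => 8

# Check whether a string contains any Thai text
pythai.contains_thai(u"hello")
# => False

pythai.contains_thai(u"helloการที่ไ")
# => True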
```

It's meant to be fast and efficient enough to handle large documents without breaking a sweat.

Includes
------------

Currently the library supports these functions:

- Word segmentation (`split`)
- Word count (`word_count`) (faster than counting the result of `split`)
- Whether a string contains Thai or not (`contains_thai`)


Installation
------------

PyThai requires `thailib` to work. You can install it quite easily:

    sudo apt-get install thailib

And then you can simply install `pythai` through **pip**:

    pip install pythai
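
After installing, a quick way to confirm that the package can be imported is a small check like the sketch below. The helper name `pythai_available` is hypothetical. It catches any exception on the assumption that a missing native library might not always surface as an `ImportError`.

```python
import importlib


def pythai_available():
    """Return True if `import pythai` succeeds in this environment."""
    try:
        importlib.import_module("pythai")
    except Exception:
        # Assumption: a missing or broken native dependency may raise
        # something other than ImportError, so catch broadly here.
        return False
    return True


if __name__ == "__main__":
    if pythai_available():
        print("pythai imported successfully")
    else:
        print("pythai could not be imported; check the installation steps")
```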

More
------------

Special thanks to Vee Satayamas for the original Python bindings for the C library libthai.

This library was written for use at [Gengo](http://www.gengo.com). It's free and open source under the GNU Lesser General Public License (LGPL). Contributions are welcome!
