Learning Using Texts - Thai Parser
lute3-thai
A Thai parser for Lute (lute3) using the pythainlp library.
Installation
See the Lute manual.
Usage
When this parser is installed, you can add "Thai" as a language to Lute, which comes with a simple story.
Notes
Thai is tough to parse! In particular, it is sometimes hard to know where sentences are split.
Some sentence splitting characters are specified in the Thai language definition, which you can edit.
This parser also assumes that spaces are used as sentence delimiters.
In many cases this is a reasonable assumption (e.g. see the stories at Thai Reader), but sometimes it is incorrect. For example, numbers and English words are often written with spaces around them, as in this single sentence from a news story:
ออกคำสั่งในวันเสาร์ที่ 2 พ.ย. 2567 ให้ทหาร 5,000 นาย กับ ตำรวจและเจ้าหน้าที่กองกำลังป้องกันพลเรือนอีก 5,000 นาย ไปเสริมกำลังเจ้าหน้าที่ในแคว้นบาเลนเซีย .
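To see why the space assumption breaks down here, this is a minimal sketch of space-based splitting applied to a fragment of the sentence above. The helper `split_on_spaces` is hypothetical and only illustrates the assumption; it is not Lute's actual parsing code.

```python
def split_on_spaces(text: str) -> list[str]:
    """Hypothetical helper: treat every run of whitespace as a sentence boundary."""
    # str.split() with no arguments splits on any whitespace run
    # and drops empty strings.
    return text.split()


# A fragment of the news sentence: "5,000" has spaces on both sides.
fragment = "ให้ทหาร 5,000 นาย"

chunks = split_on_spaces(fragment)
print(chunks)       # ['ให้ทหาร', '5,000', 'นาย']
print(len(chunks))  # 3 "sentences" from what is really part of one sentence
```

Under this assumption, the number and the words on either side of it each become a separate "sentence", even though they belong together.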
Hopefully some smarter code will improve the parsing to handle such situations in the future, but for now, Lute can give you some support for reading in Thai.
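As one possible direction, here is a rough sketch of a heuristic that rejoins space-separated chunks whenever a chunk contains no Thai characters (numbers, English words, stray punctuation). Everything in it is an assumption for illustration: the function names are hypothetical, it is not part of lute3-thai or pythainlp, and it does not handle spaces between two Thai words.

```python
import re

# The Thai Unicode block runs from U+0E00 to U+0E7F.
THAI_CHAR = re.compile(r"[\u0E00-\u0E7F]")


def has_thai(chunk: str) -> bool:
    """Return True if the chunk contains at least one Thai character."""
    return bool(THAI_CHAR.search(chunk))


def merge_non_thai_chunks(chunks: list[str]) -> list[str]:
    """Hypothetical heuristic: undo a split when either side has no Thai text."""
    merged: list[str] = []
    prev_was_non_thai = False
    for chunk in chunks:
        is_non_thai = not has_thai(chunk)
        if merged and (is_non_thai or prev_was_non_thai):
            # The space bordered a number or foreign word, so rejoin.
            merged[-1] = merged[-1] + " " + chunk
        else:
            merged.append(chunk)
        prev_was_non_thai = is_non_thai
    return merged


print(merge_non_thai_chunks("ให้ทหาร 5,000 นาย".split()))
# ['ให้ทหาร 5,000 นาย']
print(merge_non_thai_chunks("กับ ตำรวจ".split()))
# ['กับ', 'ตำรวจ']
```

The second call shows the limit of this idea: a space between two Thai words is still treated as a sentence boundary, so a heuristic like this would only cover part of the problem.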
File details
Details for the file lute3_thai-0.0.3.tar.gz.
File metadata
- Download URL: lute3_thai-0.0.3.tar.gz
- Upload date:
- Size: 3.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: python-requests/2.31.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | 92edf2fa2b1e51d336614e565f6ca6c70b23045e5f55c0765f110fb60c83e6e1
MD5 | 916e0ddea95c85e6969413cb5cba84e2
BLAKE2b-256 | 4b300c399166be7327a497342d744e68d299cc0b6af0e3daca9ed964d1d8fbfb
File details
Details for the file lute3_thai-0.0.3-py3-none-any.whl.
File metadata
- Download URL: lute3_thai-0.0.3-py3-none-any.whl
- Upload date:
- Size: 3.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: python-requests/2.31.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | b5f4dd8fec1187e55675e179b1723ee4206144451765402cd84b8cae5524eaf6
MD5 | 8f3fe0e1781363d9e993487bc4e9f5c0
BLAKE2b-256 | 937533475b35afad5491b101a3e7ca1c3fb998d0c9ce711f82c0f56410ac5623