Basic NLP for Thai

Project description

Word token to Pseudo Morpheme Segmentation

- Not recommended for long Thai sentences. Segment the text into words first, or use it together with TokenIdentification.
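The advice above amounts to running the segmenter once per word rather than once per sentence. A minimal sketch of that pattern follows. The names `segment_words` and `segment_word` are hypothetical and not part of basicthainlp; the commented-out wrapper only reuses the calls shown in this project's example code, and the word list is assumed to come from a tokenizer of your choice.

```python
from typing import Callable, List


def segment_words(
    words: List[str],
    segment_word: Callable[[str], List[str]],
) -> List[List[str]]:
    """Apply a word-level segmenter to each non-empty word separately."""
    return [segment_word(word) for word in words if word]


# Assumed wrapper around PmSeg, built only from the calls in the example code:
#
#   from basicthainlp import PmSeg
#   ps = PmSeg()
#
#   def pm_segment(word):
#       pred = ps.dataList2pmSeg(ps.word2DataList(word))
#       return ps.pmSeg2List(list(word), pred[0])
#
#   segment_words(words, pm_segment)

# Stand-in segmenter so the sketch runs without the package installed:
# it simply splits a word into its characters.
if __name__ == "__main__":
    print(segment_words(["ab", "", "c"], list))
```

Passing the segmenter in as a callable keeps the sketch independent of how the words were produced and of which segmenter is used.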

Example code

from basicthainlp import PmSeg

# Create the pseudo-morpheme segmenter
ps = PmSeg()

textTest = 'รัฐราชการ'
# Pair each character with its label (see the first output line)
data_list = ps.word2DataList(textTest)
print(data_list)
# Predict one boundary tag per character
pred = ps.dataList2pmSeg(data_list)
print(list(textTest))
print(pred[0])
# Join the characters into pseudo-morphemes according to the tags
print(ps.pmSeg2List(list(textTest), pred[0]))

Output:

[['ร', 'Ccc'], ['ั', 'Vu'], ['ฐ', 'C'], ['ร', 'Ccc'], ['า', 'Vm'], ['ช', 'C'], ['ก', 'C'], ['า', 'Vm'], ['ร', 'Ccc']]
['ร', 'ั', 'ฐ', 'ร', 'า', 'ช', 'ก', 'า', 'ร']
['B', 'I', 'C', 'B', 'I', 'C', 'B', 'I', 'I']
['รัฐ', 'ราช', 'การ']
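
To illustrate how the tag sequence relates to the final segments, here is a small pure-Python sketch that reproduces the last output line from the two lines before it. It is not the library's implementation: `bic_to_segments` is a hypothetical helper. It assumes that a `B` tag opens a new segment and that every other tag continues the current one. The source does not state how `I` and `C` differ, so the sketch treats them alike.

```python
def bic_to_segments(chars, tags):
    """Group characters into segments, opening a new segment at every 'B' tag.

    Assumption: any tag other than 'B' ('I' or 'C' in the example output)
    continues the current segment.
    """
    if len(chars) != len(tags):
        raise ValueError("chars and tags must have the same length")
    segments = []
    for ch, tag in zip(chars, tags):
        if tag == "B" or not segments:
            segments.append(ch)   # start a new segment
        else:
            segments[-1] += ch    # extend the current segment
    return segments


if __name__ == "__main__":
    chars = list("รัฐราชการ")
    tags = ["B", "I", "C", "B", "I", "C", "B", "I", "I"]
    print(bic_to_segments(chars, tags))  # ['รัฐ', 'ราช', 'การ']
```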

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

basicthainlp-0.1.17.tar.gz (1.5 MB)

Uploaded Source

Built Distribution

basicthainlp-0.1.17-py3-none-any.whl (2.9 MB)

Uploaded Python 3

File details

Details for the file basicthainlp-0.1.17.tar.gz.

File metadata

  • Download URL: basicthainlp-0.1.17.tar.gz
  • Upload date:
  • Size: 1.5 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.8.12

File hashes

Hashes for basicthainlp-0.1.17.tar.gz
Algorithm Hash digest
SHA256 c522cd401dcd69e72bf4b7c5cb0f7280d8ae26414b458ee8b49e12c69081e057
MD5 c3d1223324c9f3c5a1cafc33baed4bde
BLAKE2b-256 fe8334fece58daee2e513484eb8aaa47758ad9d97b241af12f7160f90a0f927a

File details

Details for the file basicthainlp-0.1.17-py3-none-any.whl.

File hashes

Hashes for basicthainlp-0.1.17-py3-none-any.whl
Algorithm Hash digest
SHA256 2552386c554262cfda5fd1a686d7dfb7ef937a2e2d8ac894e433b014416d8000
MD5 2141d27efd18c6f15878756db8375060
BLAKE2b-256 cd6b6352d58fbafa3f2484da153ca6790bf74c79588cbac44945461264a18321
