Myanmar (Burmese) Syllable, Word, and Phrase

Project description

Myanmar Tokenizer

Syllable, word and phrase segmenter for Burmese (Myanmar language)

GitHub: https://github.com/ye-kyaw-thu/myWord

Install

pip install myword
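
To confirm that the install succeeded, one option is to query the installed distribution metadata from Python. This is a minimal sketch using only the standard library (Python 3.8+); the helper name `installed_version` is hypothetical and not part of myword.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string if `pip install myword` succeeded, otherwise None.
print(installed_version("myword"))
```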

Examples

from myword import SyllableTokenizer, WordTokenizer, PhraseTokenizer

syltok = SyllableTokenizer()
print(syltok.tokenize("မြန်မာနိုင်ငံ။"))
# ['မြန်', 'မာ', 'နိုင်', 'ငံ', '။']

wordtok = WordTokenizer()
print(wordtok.tokenize("မြန်မာနိုင်ငံ။"))
# ['မြန်မာ', 'နိုင်ငံ', '။']

phrtok = PhraseTokenizer()
print(phrtok.tokenize("မြန်မာနိုင်ငံသည် အရှေ့တောင်အာရှတွင် တည်ရှိသည်။"))
# ['မြန်မာ', 'နိုင်ငံ', 'သည်_အရှေ့တောင်', 'အာရှ', 'တွင်', 'တည်_ရှိ', 'သည်_။']

# Phrase segmentation with the default settings
phrtok = PhraseTokenizer()
print(phrtok.tokenize("သူဟာလက်ဝှေ့ပွဲမှာအနိုင်ရနိုင်စရာရှိတယ်"))

# Phrase segmentation with custom threshold and minfreq values
phrtok = PhraseTokenizer(threshold=0.1, minfreq=3)
print(phrtok.tokenize("သူဟာလက်ဝှေ့ပွဲမှာအနိုင်ရနိုင်စရာရှိတယ်"))
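
In the sample phrase output above, the words inside a phrase appear joined by an underscore (for example 'တည်_ရှိ'). Assuming that convention holds in general, the sketch below flattens a phrase list back into words and checks that a segmentation still covers the input text once whitespace is ignored. Both helpers, `phrases_to_words` and `covers_input`, are hypothetical names and not part of the myword API; they operate on plain lists of strings such as those returned by `tokenize`.

```python
def phrases_to_words(phrases, joiner="_"):
    """Split joiner-connected phrases into a flat list of words."""
    words = []
    for phrase in phrases:
        # Skip empty pieces that a leading or trailing joiner would produce.
        words.extend(part for part in phrase.split(joiner) if part)
    return words


def covers_input(text, tokens, joiner="_"):
    """Check that the tokens spell out the text, ignoring whitespace and joiners.

    Caveat: a literal joiner character inside the text makes this check fail.
    """
    def squeeze(s):
        return "".join(s.split())

    rebuilt = "".join(token.replace(joiner, "") for token in tokens)
    return squeeze(rebuilt) == squeeze(text)


# Phrase list copied from the sample output above.
phrases = ['မြန်မာ', 'နိုင်ငံ', 'သည်_အရှေ့တောင်', 'အာရှ', 'တွင်', 'တည်_ရှိ', 'သည်_။']
words = phrases_to_words(phrases)
print(len(phrases), "phrases ->", len(words), "words")
```

A check like `covers_input` can help when comparing different `threshold` and `minfreq` settings, since every setting should regroup the same characters rather than drop or add any.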

Download files

Source Distribution

myword-0.0.2.tar.gz (8.0 kB)

Uploaded Source

Built Distribution

myword-0.0.2-py3-none-any.whl (10.5 kB)

Uploaded Python 3

File details

Details for the file myword-0.0.2.tar.gz.

File metadata

  • Download URL: myword-0.0.2.tar.gz
  • Size: 8.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.4

File hashes

Hashes for myword-0.0.2.tar.gz
Algorithm Hash digest
SHA256 2334d3acbec0571751125224ab75006a8522f3c5447e4eed9a9349ca94c10214
MD5 56afac60d9d8478781f420884a9ac27a
BLAKE2b-256 bd1b776dcffcb18b18c1be707f605ffe9dc924edde11e02b82da780edbaa7e95

File details

Details for the file myword-0.0.2-py3-none-any.whl.

File metadata

  • Download URL: myword-0.0.2-py3-none-any.whl
  • Size: 10.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.4

File hashes

Hashes for myword-0.0.2-py3-none-any.whl
Algorithm Hash digest
SHA256 41b2da95420e66970b33c6bb8a7e67528f23eccf569fe7dfff0139b924d2761c
MD5 189b25c124de962917b931bef33e4838
BLAKE2b-256 67aed58f5b3b6d75be97cd256983a375908d0f6f3c6b5d77d757a8548d133f5a
