Myanmar (Burmese) Syllable, Word, and Phrase

Project description

Myanmar Tokenizer

Syllable, word and phrase segmenter for Burmese (Myanmar language)

GitHub: https://github.com/ye-kyaw-thu/myWord

Install

pip install myword

Examples

from myword import SyllableTokenizer, WordTokenizer, PhraseTokenizer

syltok = SyllableTokenizer()
print(syltok.tokenize("မြန်မာနိုင်ငံ။"))
# ['မြန်', 'မာ', 'နိုင်', 'ငံ', '။']

wordtok = WordTokenizer()
print(wordtok.tokenize("မြန်မာနိုင်ငံ။"))
# ['မြန်မာ', 'နိုင်ငံ', '။']

phrtok = PhraseTokenizer()
print(phrtok.tokenize("မြန်မာနိုင်ငံသည် အရှေ့တောင်အာရှတွင် တည်ရှိသည်။"))
# ['မြန်မာ', 'နိုင်ငံ', 'သည်_အရှေ့တောင်', 'အာရှ', 'တွင်', 'တည်_ရှိ', 'သည်_။']

# Reuse phrtok (default parameters) on a sentence written without spaces
print(phrtok.tokenize("သူဟာလက်ဝှေ့ပွဲမှာအနိုင်ရနိုင်စရာရှိတယ်"))

# Same sentence with non-default threshold and minfreq
phrtok = PhraseTokenizer(threshold=0.1, minfreq=3)
print(phrtok.tokenize("သူဟာလက်ဝှေ့ပွဲမှာအနိုင်ရနိုင်စရာရှိတယ်"))
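The syllable output in the first example can be approximated with a single regular expression. The sketch below is a simplified rule set based on the Unicode Myanmar block, not necessarily what `SyllableTokenizer` does internally; the names `BREAK` and `break_syllables` are hypothetical, and rarer cases (Latin text, some symbols) are not handled.

```python
import re

# Assumption: a syllable starts at a consonant (U+1000..U+1021) unless that
# consonant is stacked (preceded by virama U+1039) or closes the previous
# syllable (followed by asat U+103A or virama U+1039). Independent vowels,
# digits and punctuation (U+1023..U+102A, U+103F..U+104F) always start a unit.
BREAK = re.compile(
    "(?<!\u1039)[\u1000-\u1021](?![\u103A\u1039])"
    "|[\u1023-\u102A\u103F-\u104F]"
)


def break_syllables(text):
    """Split Burmese text into syllable-like units (simplified sketch)."""
    # Put a newline in front of every break point, then split on newlines.
    marked = BREAK.sub(lambda m: "\n" + m.group(0), text)
    return [piece.strip() for piece in marked.split("\n") if piece.strip()]


syllables = break_syllables("မြန်မာနိုင်ငံ။")
# ['မြန်', 'မာ', 'နိုင်', 'ငံ', '။']
```

The lookahead is what keeps a final consonant such as န် attached to the syllable it closes instead of starting a new one.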
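This page does not document what `threshold` and `minfreq` mean. One common reading, assumed here, is that adjacent words are joined into a phrase when the pair occurs at least `minfreq` times in training data and its collocation score exceeds `threshold`. The sketch below uses normalized pointwise mutual information (NPMI) as that score. The functions `npmi` and `join_phrases` and the English toy corpus are hypothetical illustrations, not the package's API or data. The underscore joiner mirrors the phrase output shown in the examples.

```python
import math
from collections import Counter


def npmi(pair_count, left_count, right_count, total):
    """Normalized PMI of a word pair; roughly in [-1, 1]."""
    if pair_count == 0:
        return -1.0
    p_xy = pair_count / total
    if p_xy >= 1.0:
        return 1.0
    p_x = left_count / total
    p_y = right_count / total
    return math.log(p_xy / (p_x * p_y)) / -math.log(p_xy)


def join_phrases(words, unigrams, bigrams, threshold=0.1, minfreq=2):
    """Greedily join adjacent words that pass both cutoffs (assumed semantics)."""
    total = sum(unigrams.values())
    out, i = [], 0
    while i < len(words):
        if i + 1 < len(words):
            pair = (words[i], words[i + 1])
            count = bigrams.get(pair, 0)
            score = npmi(count, unigrams.get(pair[0], 0),
                         unigrams.get(pair[1], 0), total)
            if count >= minfreq and score > threshold:
                out.append("_".join(pair))
                i += 2
                continue
        out.append(words[i])
        i += 1
    return out


# Toy training data, invented purely for illustration.
corpus = [
    ["new", "york", "is", "big"],
    ["new", "york", "is", "busy"],
    ["new", "york", "has", "parks"],
    ["the", "city", "is", "big"],
]
unigrams = Counter(w for sent in corpus for w in sent)
bigrams = Counter(p for sent in corpus for p in zip(sent, sent[1:]))

sentence = ["new", "york", "is", "big"]
loose = join_phrases(sentence, unigrams, bigrams, threshold=0.1, minfreq=2)
strict = join_phrases(sentence, unigrams, bigrams, threshold=0.1, minfreq=3)
# loose  -> ['new_york', 'is_big']
# strict -> ['new_york', 'is', 'big']
```

Under these assumptions, raising `minfreq` suppresses phrases built from rare pairs, while raising `threshold` keeps only strongly associated pairs.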


Download files

Source Distribution

myword-0.0.1.tar.gz (7.2 kB)

Uploaded Source

Built Distribution

myword-0.0.1-py3-none-any.whl (8.1 kB)

Uploaded Python 3

File details

Details for the file myword-0.0.1.tar.gz.

File metadata

  • Download URL: myword-0.0.1.tar.gz
  • Upload date:
  • Size: 7.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.4

File hashes

Hashes for myword-0.0.1.tar.gz
Algorithm Hash digest
SHA256 5d3e1df9e78c6d4a72c19c9b72fe67a75d72250f72e0c824def7c79739b96f03
MD5 916682da2af3bb9b3cc99b14730c2661
BLAKE2b-256 7275593998a58399d813f74a088a5e86942fd81e666931709c0367843412ec03
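A downloaded file can be checked against the SHA256 digest listed above with the standard library. This is a minimal sketch: the helper names `sha256_of` and `matches_published` are hypothetical, and the local file path is whatever location the archive was saved to.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, expected_hex):
    """True if the file's SHA256 equals the published digest (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches_published("myword-0.0.1.tar.gz", "5d3e1df9e78c6d4a72c19c9b72fe67a75d72250f72e0c824def7c79739b96f03")` should return `True` for an intact copy of the source distribution saved under that name.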

File details

Details for the file myword-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: myword-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 8.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.4

File hashes

Hashes for myword-0.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 a7415d08132e46446177b1c6eca1fc1a72fd747fea55665288890cb15998cc28
MD5 e194e2cb06136fd1fa1db2ce537cce87
BLAKE2b-256 ebc3c0ece21419c9d757073dabe67597929ff0d1bf46c27cfc99f99d9894890d
