Pyidaungsu
Pyidaungsu is a Python library for the Myanmar language. It is useful for natural language processing and for preprocessing Myanmar text.
Installation
pip install pyidaungsu
Usage
Zawgyi-Unicode detection
import pyidaungsu as pds
# font encoding detection
pds.detect("ထမင်းစားပြီးပြီလား")
>> "Unicode"
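To give a feel for what encoding detection involves, here is a minimal heuristic sketch. It is not how pyidaungsu implements `detect`; the function name `guess_encoding` and the three rules are assumptions chosen for illustration. The rules rely on well-known differences between the encodings: Zawgyi reuses the U+1060 to U+1097 range for glyph variants, uses U+1039 as a visible asat, and stores the vowel E (U+1031) and medial ra (U+103B in Zawgyi) before the consonant.

```python
import re

# Illustrative signals that a string is Zawgyi-encoded rather than Unicode.
# These rules are assumptions for this sketch, not pyidaungsu's implementation.
_ZAWGYI_HINTS = re.compile(
    "[\u1060-\u1097]"               # range Zawgyi reuses for glyph variants
    "|\u1039(?![\u1000-\u1021])"    # in Unicode, U+1039 must be followed by a consonant
    "|(?:^|\\s)[\u1031\u103b]"      # sign typed before its consonant at the start of a word
)


def guess_encoding(text: str) -> str:
    """Return "Zawgyi" if any hint matches, otherwise "Unicode"."""
    return "Zawgyi" if _ZAWGYI_HINTS.search(text) else "Unicode"


print(guess_encoding("ထမင်းစားပြီးပြီလား"))
print(guess_encoding("ထမင္းစားၿပီးၿပီလား"))
```

A rule set like this misfires on Unicode text in Mon, Karen, or Shan, which legitimately use letters in the U+1060 to U+1097 range, so it should be treated as a rough first pass only.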
Zawgyi-Unicode conversion
# convert to zawgyi
pds.cvt2zgi("ထမင်းစားပြီးပြီလား")
>> "ထမင္းစားၿပီးၿပီလား"
# convert to unicode
pds.cvt2uni("ထမင္းစားၿပီးၿပီလား")
>> "ထမင်းစားပြီးပြီလား"
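Detection and conversion are often combined to bring a mixed-encoding corpus into Unicode before further processing. The sketch below shows one way to do that. The helper name `normalize_to_unicode` is hypothetical, and the detector and converter are passed in as arguments so the helper does not assume anything about pyidaungsu beyond the calls shown above. It relies only on the `"Unicode"` return value documented here; what `detect` returns for other input is not assumed.

```python
from typing import Callable, Iterable, List


def normalize_to_unicode(
    lines: Iterable[str],
    detect: Callable[[str], str],
    to_unicode: Callable[[str], str],
) -> List[str]:
    """Convert every line that is not detected as Unicode; keep the rest as-is."""
    return [
        line if detect(line) == "Unicode" else to_unicode(line)
        for line in lines
    ]
```

With pyidaungsu imported as `pds`, the call would look like `normalize_to_unicode(lines, pds.detect, pds.cvt2uni)`. Lines that contain no Myanmar text at all may need separate handling, depending on what the detector reports for them.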
Syllabification
# syllabification
pds.syllabify("Alan TuringကိုArtificial Intelligenceနဲ့Computerတွေရဲ့ဖခင်ဆိုပြီးလူသိများပါတယ်")
>> ['Alan', 'Turing', 'ကို', 'Artificial', 'Intelligence', 'နဲ့', 'Computer', 'တွေ', 'ရဲ့', 'ဖ', 'ခင်', 'ဆို', 'ပြီး', 'လူ', 'သိ', 'များ', 'ပါ', 'တယ်']
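For readers curious how a syllable breaker for Unicode Myanmar text can work, here is a minimal regex-based sketch. It is not pyidaungsu's implementation; the name `split_syllables` and the exact rule are assumptions. The idea is that a consonant opens a new syllable unless it is stacked under the previous consonant (preceded by virama, U+1039) or closes the current syllable (followed by asat, U+103A, or by virama).

```python
import re
from typing import List

# A consonant (U+1000 to U+1021) starts a syllable unless it is preceded by
# virama (stacked) or followed by asat/virama (a final), optionally after
# dot-below (U+1037). Runs of ASCII letters and digits stay together.
# Simplified assumption: independent vowels and punctuation are not treated
# as syllable starts, so they attach to the preceding syllable.
_SYLLABLE_START = re.compile(
    "(?<!\u1039)[\u1000-\u1021](?!\u1037?[\u103a\u1039])"
    "|[A-Za-z0-9]+"
)


def split_syllables(text: str) -> List[str]:
    """Insert a space before every syllable start, then split on whitespace."""
    marked = _SYLLABLE_START.sub(lambda m: " " + m.group(0), text)
    return marked.split()


print(split_syllables("Alan TuringကိုArtificial Intelligenceနဲ့Computerတွေ"))
```

Because the rule only looks at one neighbor on each side, it handles common spellings but is not a full treatment of Myanmar orthography.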
Future work
- Add tokenizer