Python utilities for Myanmar language processing
PyMyaNLP
A Python library for Myanmar (Burmese) language processing and natural language processing tasks.
It was written as a general-purpose NLP toolkit for languages that use the Myanmar (Burmese) script.
Installation
pip install pymyanlp
Quick Start
import pymyanlp
# Word segmentation
segments = pymyanlp.segment_word("ရေပုံမှန်သောက်ပါ") # ['ရေ', 'ပုံမှန်', 'သောက်', 'ပါ']
# Part-of-speech tagging
tags = pymyanlp.pos_tag("မြန်မာ ဘာသာ") # [('မြန်မာ', <PartOfSpeech.Noun: 'n'>), ('ဘာသာ', <PartOfSpeech.Noun: 'n'>)]
# Text detection and validation
pymyanlp.is_burmese("မြန်မာ ဘာသာ") # True
pymyanlp.contains_burmese("Hello မြန်မာ") # True
pymyanlp.get_burmese_script("မြန်မာ") # burmese
pymyanlp.get_burmese_script("တမၟာ") # mon
pymyanlp.get_burmese_script("ၡးခွ့မဲၢ်") # sgaw_karen
pymyanlp.get_burmese_script("ႁိူဝ်းမိၼ်") # shan
# Number transliteration
pymyanlp.transliterate_numbers("2024") # ၂၀၂၄
# Text processing pipeline
processed = pymyanlp.apply_written_suite("Hello 2024, မြန်မာ!") # Normalized text
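The detection helpers above can be approximated with Unicode block checks. The following is a minimal sketch, not the library's implementation: the names `is_myanmar_char`, `contains_myanmar`, and `is_all_myanmar` are hypothetical, and the only facts it relies on are the three Unicode blocks assigned to the Myanmar script.

```python
# Unicode blocks that hold Myanmar-script characters.
MYANMAR_BLOCKS = (
    (0x1000, 0x109F),  # Myanmar
    (0xAA60, 0xAA7F),  # Myanmar Extended-A
    (0xA9E0, 0xA9FF),  # Myanmar Extended-B
)


def is_myanmar_char(ch: str) -> bool:
    """Return True if ch falls inside one of the Myanmar Unicode blocks."""
    code = ord(ch)
    return any(lo <= code <= hi for lo, hi in MYANMAR_BLOCKS)


def contains_myanmar(text: str) -> bool:
    """Return True if at least one character is Myanmar script."""
    return any(is_myanmar_char(ch) for ch in text)


def is_all_myanmar(text: str) -> bool:
    """Return True if every non-whitespace character is Myanmar script."""
    chars = [ch for ch in text if not ch.isspace()]
    return bool(chars) and all(is_myanmar_char(ch) for ch in chars)


print(contains_myanmar("Hello မြန်မာ"))
print(is_all_myanmar("Hello မြန်မာ"))
```

Block membership alone cannot tell Burmese apart from Mon, Shan, or Sgaw Karen, since those languages share the same blocks. Distinguishing them would need finer per-character tables, which this sketch does not attempt.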
Features
| Feature | Description |
|---|---|
| Word Segmentation | Multiple models (Viterbi, CRF-based) |
| POS Tagging | Part-of-speech tagging for Myanmar text |
| Text Normalization | Clean and standardize text |
| Transliteration | Convert Latin-script input to Myanmar script (for example, digits: 2024 to ၂၀၂၄) |
| Script Detection | Identify Burmese text and script variants |
| Punctuation Handling | Remove or process punctuation |
| Spacing Normalization | Handle mixed script spacing |
| Text Style Detection | Identify different Myanmar text styles |
| Sentiment Analysis* | Score-based sentiment classification |
| Grammar Analysis* | Myanmar particle and grammar detection |
| Spell Checking* | Basic spell checking functionality |
Features marked with * are not yet implemented.
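To illustrate what Viterbi-style word segmentation means in the table above, here is a minimal sketch over a toy unigram lexicon. It is an assumption-laden illustration, not the model shipped with the library: the lexicon, its probabilities, the unknown-character penalty, and the function name `viterbi_segment` are all made up for this example.

```python
import math

# Toy unigram lexicon: word -> probability. Values are invented for illustration.
LEXICON = {
    "ရေ": 0.30,
    "ပုံ": 0.10,
    "မှန်": 0.10,
    "ပုံမှန်": 0.20,
    "သောက်": 0.20,
    "ပါ": 0.10,
}
MAX_WORD_LEN = max(len(word) for word in LEXICON)
UNKNOWN_LOG_PROB = math.log(1e-8)  # penalty for one out-of-lexicon character


def viterbi_segment(text: str) -> list[str]:
    """Split text into the word sequence with the highest total log-probability."""
    n = len(text)
    best = [-math.inf] * (n + 1)  # best[i]: top score for segmenting text[:i]
    back = [0] * (n + 1)          # back[i]: start index of the last word in text[:i]
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - MAX_WORD_LEN), end):
            word = text[start:end]
            if word in LEXICON:
                score = best[start] + math.log(LEXICON[word])
            elif end - start == 1:
                # Fall back to a single unknown character so every position is reachable.
                score = best[start] + UNKNOWN_LOG_PROB
            else:
                continue
            if score > best[end]:
                best[end] = score
                back[end] = start
    # Follow the back-pointers from the end of the text to recover the words.
    words = []
    end = n
    while end > 0:
        start = back[end]
        words.append(text[start:end])
        end = start
    return words[::-1]


print(viterbi_segment("ရေပုံမှန်သောက်ပါ"))
```

The dynamic program keeps only the best-scoring segmentation for each prefix, so the cost grows with the text length times the longest lexicon entry rather than with the number of possible splits.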
Constants and Enums
# POS tag enums
pymyanlp.PartOfSpeech.Noun.value # "n"
# Built-in constants
pymyanlp.NUMBER_MAP # {'0': '၀', '1': '၁', ...}
pymyanlp.PUNCTUATION # ['။', '၊', ',', '.', ...]
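As a rough illustration of how constants shaped like these can drive text processing, the sketch below rebuilds a digit map and a short punctuation list locally and applies them with `str.translate`. The local `NUMBER_MAP` and `PUNCTUATION` only mirror the shapes shown in the comments above (the punctuation list is limited to the four entries shown), and the helper names `to_myanmar_digits` and `strip_punctuation` are hypothetical.

```python
# Local stand-ins mirroring the shapes shown above; not read from the library.
# Myanmar digits occupy U+1040 (zero) through U+1049 (nine).
NUMBER_MAP = {str(d): chr(0x1040 + d) for d in range(10)}
PUNCTUATION = ["\u104b", "\u104a", ",", "."]  # section mark, little section mark, comma, period

# Translation tables are built once and reused for every call.
_DIGIT_TABLE = str.maketrans(NUMBER_MAP)
_PUNCT_TABLE = str.maketrans("", "", "".join(PUNCTUATION))


def to_myanmar_digits(text: str) -> str:
    """Replace ASCII digits with Myanmar digits, leaving other characters alone."""
    return text.translate(_DIGIT_TABLE)


def strip_punctuation(text: str) -> str:
    """Delete every character listed in PUNCTUATION."""
    return text.translate(_PUNCT_TABLE)


print(to_myanmar_digits("2024"))
print(strip_punctuation("Hello, 2024."))
```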
Testing
Run the test suite:
# Run all tests
pytest tests/
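A new test might look like the sketch below, which pytest can collect because it uses plain `unittest` classes. The file name `tests/test_quickstart.py` is hypothetical, and the expected values are taken from the Quick Start comments, assuming `transliterate_numbers` returns a plain string.

```python
# tests/test_quickstart.py (hypothetical file name)
import importlib.util
import unittest

# Skip cleanly when the package is not installed in the current environment.
HAS_PYMYANLP = importlib.util.find_spec("pymyanlp") is not None


@unittest.skipUnless(HAS_PYMYANLP, "pymyanlp is not installed")
class TestQuickStart(unittest.TestCase):
    def setUp(self):
        # Import lazily so the module itself loads even without the package.
        import pymyanlp

        self.lib = pymyanlp

    def test_transliterate_numbers(self):
        # Expected value taken from the Quick Start comment.
        self.assertEqual(self.lib.transliterate_numbers("2024"), "၂၀၂၄")

    def test_script_checks(self):
        self.assertTrue(self.lib.is_burmese("မြန်မာ ဘာသာ"))
        self.assertTrue(self.lib.contains_burmese("Hello မြန်မာ"))
```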
Documentation
- API Reference: See module docstrings for detailed API documentation
- Test Examples: Check the tests/ directory for usage examples
Project Structure
pymyanlp/
├── text/ # Text processing modules
├── analysis/ # Analysis and NLP modules
├── utils/ # Utility functions
├── resources/ # Language resources
└── lib/ # Core algorithms and models
License
MIT License - see LICENSE.txt for details.
Contributing
- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Run the test suite
- Submit a pull request
For bug reports and feature requests, please use the GitHub issue tracker.
Download files
File details
Details for the file pymyanlp-0.1.1.tar.gz.
File metadata
- Download URL: pymyanlp-0.1.1.tar.gz
- Upload date:
- Size: 18.6 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.5.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e15340ed997d0707b978de2997d98b9d95d97b046effa594c724a12982f60afa |
| MD5 | 9627f95f390a63ef36d0b67ac403dc6b |
| BLAKE2b-256 | 003094400ddd8295b5795f8d0d65f82323094ed65ba523ffbb83a79216e82787 |
File details
Details for the file pymyanlp-0.1.1-py3-none-any.whl.
File metadata
- Download URL: pymyanlp-0.1.1-py3-none-any.whl
- Upload date:
- Size: 19.0 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.5.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f8852a7ecdd8ff86d5244463643bca99dcebb4b42dbc96817f8ccb190ae38990 |
| MD5 | 4a4710b070b989ce58a48d94a42f4825 |
| BLAKE2b-256 | e8bf2b91d97934c2a306a5772ae6cb3087048101015e2ac7172cab566bd49515 |
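A downloaded file can be checked against the SHA256 digests listed above with the standard library. This is a minimal sketch: the helper names `sha256_of` and `verify` are hypothetical, and the local path in the commented usage line assumes the source distribution was saved to the current directory.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for pymyanlp-0.1.1.tar.gz.
EXPECTED_SDIST_SHA256 = "e15340ed997d0707b978de2997d98b9d95d97b046effa594c724a12982f60afa"


def sha256_of(path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large archives are never loaded into memory at once."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex: str) -> bool:
    """Return True if the file's SHA256 digest matches expected_hex (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()


# Usage, assuming the archive sits in the current directory:
# print(verify("pymyanlp-0.1.1.tar.gz", EXPECTED_SDIST_SHA256))
```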