Python utilities for Myanmar language processing

PyMyaNLP

A Python library for Myanmar (Burmese) language processing and natural language processing tasks.

It was written as a general-purpose NLP toolkit for languages whose writing systems are based on the Myanmar script.

Installation

pip install pymyanlp

Quick Start

import pymyanlp

# Word segmentation
segments = pymyanlp.segment_word("ရေပုံမှန်သောက်ပါ") # ['ရေ', 'ပုံမှန်', 'သောက်', 'ပါ']

# Part-of-speech tagging
tags = pymyanlp.pos_tag("မြန်မာ ဘာသာ") # [('မြန်မာ', <PartOfSpeech.Noun: 'n'>), ('ဘာသာ', <PartOfSpeech.Noun: 'n'>)]

# Text detection and validation
pymyanlp.is_burmese("မြန်မာ ဘာသာ")  # True

pymyanlp.contains_burmese("Hello မြန်မာ")  # True

pymyanlp.get_burmese_script("မြန်မာ")  # burmese
pymyanlp.get_burmese_script("တမၟာ")  # mon
pymyanlp.get_burmese_script("ၡးခွ့မဲၢ်")  # sgaw_karen
pymyanlp.get_burmese_script("ႁိူဝ်းမိၼ်")  # shan

# Number transliteration
pymyanlp.transliterate_numbers("2024")  # ၂၀၂၄

# Text processing pipeline
processed = pymyanlp.apply_written_suite("Hello 2024, မြန်မာ!") # Normalized text
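The detection helpers above presumably rely on Unicode code point checks. The following is a minimal sketch of that idea, not the library's actual implementation: the function names `contains_myanmar_chars` and `is_all_myanmar_chars` are hypothetical, and the only facts assumed are the three Myanmar blocks defined by the Unicode Standard.

```python
# Sketch only: these helpers are hypothetical and are NOT part of pymyanlp.
# They test membership in the Unicode blocks used by Myanmar-script languages.
MYANMAR_BLOCKS = (
    (0x1000, 0x109F),  # Myanmar
    (0xAA60, 0xAA7F),  # Myanmar Extended-A
    (0xA9E0, 0xA9FF),  # Myanmar Extended-B
)


def _in_myanmar_block(ch: str) -> bool:
    """Return True if a single character falls inside any Myanmar block."""
    code = ord(ch)
    return any(lo <= code <= hi for lo, hi in MYANMAR_BLOCKS)


def contains_myanmar_chars(text: str) -> bool:
    """True if at least one character belongs to a Myanmar block."""
    return any(_in_myanmar_block(ch) for ch in text)


def is_all_myanmar_chars(text: str) -> bool:
    """True if every non-whitespace character belongs to a Myanmar block."""
    chars = [ch for ch in text if not ch.isspace()]
    return bool(chars) and all(_in_myanmar_block(ch) for ch in chars)
```

A block-level check like this cannot tell Burmese from Mon, Shan, or Sgaw Karen, because those languages share the same blocks. Telling them apart needs a look at language-specific letters, which is what `get_burmese_script` appears to do.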

Features

Feature                 Description
Word Segmentation       Multiple models (Viterbi, CRF-based)
POS Tagging             Part-of-speech tagging for Myanmar text
Text Normalization      Clean and standardize text
Transliteration         Convert English to Myanmar
Script Detection        Identify Burmese text and script variants
Punctuation Handling    Remove or process punctuation
Spacing Normalization   Handle mixed script spacing
Text Style Detection    Identify different Myanmar text styles
Sentiment Analysis*     Score-based sentiment classification
Grammar Analysis*       Myanmar particle and grammar detection
Spell Checking*         Basic spell checking functionality

* Not yet implemented.
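To give a feel for what Viterbi-based word segmentation involves, here is a minimal dictionary-driven sketch. It is an illustration under stated assumptions, not the model shipped with the library: `viterbi_segment`, `TOY_LEXICON`, and the probabilities in it are invented for this example, and the fallback for unknown characters is a simplification.

```python
import math

# Toy unigram probabilities, invented for illustration (NOT pymyanlp data).
TOY_LEXICON = {
    "ရေ": 0.3,
    "ပုံမှန်": 0.2,
    "ပုံ": 0.1,
    "မှန်": 0.1,
    "သောက်": 0.2,
    "ပါ": 0.1,
}


def viterbi_segment(text, word_probs, max_word_len=10, unknown_prob=1e-8):
    """Split text into the word sequence with the highest total log-probability.

    Hypothetical helper: unknown single characters are allowed with a small
    penalty probability so that every input can be segmented.
    """
    n = len(text)
    best_score = [-math.inf] * (n + 1)  # best log-prob of text[:i]
    back = [0] * (n + 1)                # start index of the last word in text[:i]
    best_score[0] = 0.0

    for end in range(1, n + 1):
        for start in range(max(0, end - max_word_len), end):
            piece = text[start:end]
            if piece in word_probs:
                score = math.log(word_probs[piece])
            elif end - start == 1:
                score = math.log(unknown_prob)  # unknown single character
            else:
                continue
            candidate = best_score[start] + score
            if candidate > best_score[end]:
                best_score[end] = candidate
                back[end] = start

    # Walk the back-pointers from the end of the text to recover the words.
    words = []
    pos = n
    while pos > 0:
        start = back[pos]
        words.append(text[start:pos])
        pos = start
    return words[::-1]
```

With this toy lexicon, the longer entry for "ပုံမှန်" outscores the split into its two shorter parts, because one moderately likely word beats the product of two less likely ones. A production model would estimate probabilities from a corpus and would usually operate on syllables rather than code points.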

Constants and Enums

# POS tag enums
pymyanlp.PartOfSpeech.Noun.value  # "n"

# Built-in constants
pymyanlp.NUMBER_MAP  # {'0': '၀', '1': '၁', ...}
pymyanlp.PUNCTUATION  # ['။', '၊', ',', '.', ...]
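A digit map of this shape is enough to implement number transliteration with `str.translate`. The sketch below rebuilds an equivalent map locally so that it runs without the library. The local `NUMBER_MAP` and the function `to_myanmar_digits` are assumptions made for illustration, and `transliterate_numbers` may be implemented differently.

```python
# Local stand-in for the library's digit map (assumed to have the same shape).
# Myanmar digits zero to nine occupy U+1040 through U+1049.
NUMBER_MAP = {str(i): chr(0x1040 + i) for i in range(10)}

# str.maketrans accepts a dict whose keys are single characters.
_DIGIT_TABLE = str.maketrans(NUMBER_MAP)


def to_myanmar_digits(text: str) -> str:
    """Replace ASCII digits with Myanmar digits; leave other characters as-is."""
    return text.translate(_DIGIT_TABLE)


print(to_myanmar_digits("2024"))  # ၂၀၂၄
```

Building the translation table once at import time keeps each call to a single pass over the string.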

Testing

Run the test suite:

# Run all tests
pytest tests/

Documentation

  • API Reference: See module docstrings for detailed API documentation
  • Test Examples: Check tests/ directory for usage examples

Project Structure

pymyanlp/
├── text/           # Text processing modules
├── analysis/       # Analysis and NLP modules
├── utils/          # Utility functions
├── resources/      # Language resources
└── lib/            # Core algorithms and models

License

MIT License - see LICENSE.txt for details.

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Add tests for new functionality
  4. Run the test suite
  5. Submit a pull request

For bug reports and feature requests, please use the GitHub issue tracker.
