Project description
The SimboloSiamese package provides a tool for syllable-based tokenization of Burmese text. It splits Burmese text into individual syllables, which supports language processing tasks such as text analysis, machine learning, and natural language processing (NLP) for Burmese.
Features
Syllable Tokenization: Tokenizes Burmese text into syllables based on Unicode rules. It helps in language segmentation and provides a clear framework for analyzing Burmese sentences in a structured manner.
Efficient Processing: Designed to handle large texts efficiently with minimal memory overhead, so it scales to big-data and large-scale text analysis tasks.
Burmese Unicode Support: Supports Burmese script and syllable rules as encoded in the Myanmar block of the Unicode Standard, so that tokenization aligns with native Burmese text structure.
Burmese-to-Romanization: Converts Burmese script into a Romanized equivalent, which gives pronunciation guidance and helps readers unfamiliar with the script to read and pronounce Burmese text. The Romanization follows standard linguistic rules for Burmese phonetic transcription. This feature is particularly useful for language learners, cross-lingual applications, and linguistic studies that require Romanized Burmese text.
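To make the idea of syllable tokenization concrete, here is a minimal, self-contained sketch of one common regex heuristic. It is not SimboloSiamese's implementation: the function name `split_syllables` and the rule itself are assumptions chosen for illustration. The rule starts a new syllable at every consonant that is neither stacked under the previous consonant (preceded by the virama, U+1039) nor acting as a syllable-final consonant (followed by the asat, U+103A, or the virama).

```python
import re

# Code points from the Myanmar block of Unicode.
CONSONANTS = "\u1000-\u1021"  # the basic consonants, KA through A
VIRAMA = "\u1039"             # stacking sign placed between stacked consonants
ASAT = "\u103A"               # vowel killer that marks a syllable-final consonant

# Illustrative heuristic (an assumption, not the library's rule set):
# a syllable starts at a consonant that is not preceded by the virama
# and not followed by the asat or the virama.
SYLLABLE_START = re.compile(
    rf"(?<!{VIRAMA})[{CONSONANTS}](?![{ASAT}{VIRAMA}])"
)


def split_syllables(text):
    """Split Burmese text into syllable-like chunks using the heuristic above."""
    starts = [m.start() for m in SYLLABLE_START.finditer(text)]
    if not starts or starts[0] != 0:
        # Keep any leading material that precedes the first detected syllable.
        starts.insert(0, 0)
    ends = starts[1:] + [len(text)]
    return [text[s:e] for s, e in zip(starts, ends) if s < e]


# The word for "university", written with escapes so the code point order is explicit.
word = "\u1010\u1000\u1039\u1000\u101E\u102D\u102F\u101C\u103A"
print(split_syllables(word))
```

For this word the sketch yields two chunks, keeping the stacked consonant pair together in the first one. A production tokenizer needs more rules than this, for example for digits, punctuation, independent vowels, and non-standard mark ordering.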
How to use (Getting Started)
# Install the SimboloSiamese package using pip:
#   pip install SimboloSiamese

# Import BurmeseConverter from the Siamese module
from Siamese import BurmeseConverter

converter = BurmeseConverter()
# Example: Zawgyi to Unicode
zawgyi_text = "ဖြွှော်"
try:
    # Convert Zawgyi text to Unicode
    unicode_output = converter.zawgyi_to_unicode(zawgyi_text)
    # Print the Unicode output
    print("Unicode Output:", unicode_output)
except Exception as e:
    # Handle any errors that occur during conversion
    print(f"Error in Zawgyi to Unicode conversion: {e}")
# Example: Tokenization of a Burmese word
tokenization_text = "တက္ကသိုလ်"
try:
    # Tokenize the Burmese word. The first argument selects the mode:
    # 1 tokenizes with the virama mark; any other number tokenizes without it.
    # If syllable_tokenization does not work, try process_text instead.
    tokenized_output = converter.syllable_tokenization(1, tokenization_text)
    print("Tokenized Output:", tokenized_output)
except Exception as e:
    # Handle any errors that occur during tokenization
    print(f"Cannot tokenize the word: {e}")
# Example: Convert Burmese text to Romanized script
burmese_text = "ကော်"
try:
    # Convert Burmese text to Romanized script
    romanized_output = converter.burmese_to_romanization(burmese_text)
    # Print the Romanized output
    print("Romanized Output:", romanized_output)
except Exception as e:
    # Handle any errors that occur during Romanization
    print(f"Error in Burmese Romanization: {e}")
# Example: Convert Romanized text back to Burmese script
romanized_text = "le kReAc: liuc:, K rI: sq a mHt, ၂ ၂ ၈ ၃, jQ, SeAF piu liu mRiu., lU ne rp kWk peAF jiu., pYk kY KL. pRI:, liuk pA lA jU, ၆ ၂, OO: s luN:, je SuN: KL. jQ // "
try:
    # Convert the Romanized text to Burmese script
    burmese_output = converter.romanization_to_burmese(romanized_text)
    print("Burmese Output:", burmese_output)
except Exception as e:
    # Handle any errors that occur during conversion
    print(f"Error in Romanization to Burmese conversion: {e}")
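When tokenizing many strings, the try/except pattern from the examples above can be wrapped in a small helper. This is a sketch only: `tokenize_many` is a hypothetical name, not part of SimboloSiamese, and it relies solely on the `syllable_tokenization(mode, text)` call shown above. The converter is passed in as an argument, so the block does not import the package itself.

```python
def tokenize_many(converter, texts, keep_virama=True):
    """Tokenize each text and collect results and errors separately.

    `converter` is expected to be a BurmeseConverter instance (or any object
    with a compatible syllable_tokenization(mode, text) method).
    """
    # Per the example above: 1 tokenizes with the virama mark,
    # any other number tokenizes without it.
    mode = 1 if keep_virama else 0
    results = {}
    errors = {}
    for text in texts:
        try:
            results[text] = converter.syllable_tokenization(mode, text)
        except Exception as exc:
            # Record the failure and continue with the remaining inputs.
            errors[text] = str(exc)
    return results, errors
```

With the `converter` created earlier, a call would look like `results, errors = tokenize_many(converter, ["တက္ကသိုလ်"])`. Collecting errors per input keeps one malformed string from aborting a whole batch.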
Download files
Source Distribution
Built Distribution
File details
Details for the file simbolosiamese-0.1.8.tar.gz.
File metadata
- Download URL: simbolosiamese-0.1.8.tar.gz
- Upload date:
- Size: 7.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.12.7
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | 4c843316b7a741ef2d69e10fee84dc84f7852cf53cd221400c79a89ae2a2661b |
MD5 | ed73692a470f5bf99115eea640f67a63 |
BLAKE2b-256 | 053d07d0617f7b5e6cdbe211bdfeda40c644ff1b414d48ddd84cf669f04317f5 |
File details
Details for the file SimboloSiamese-0.1.8-py3-none-any.whl.
File metadata
- Download URL: SimboloSiamese-0.1.8-py3-none-any.whl
- Upload date:
- Size: 6.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.12.7
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | 5bb18b5366c03d5826dd8736b0925478e6e3110b95d6a70df8c3b4db8fe0d409 |
MD5 | a8bb26d5dfc40860591788468f5f17c7 |
BLAKE2b-256 | 6fea87efe76dbc91853c941db1d6cb66e8ba0705c34e20de0ac539f721527a6d |
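A downloaded file can be checked against the SHA256 digests listed in the tables above before installing it. The following is a minimal sketch using only the Python standard library. The helper names `file_sha256` and `matches_digest` are illustrative, and the local path is wherever you saved the file.

```python
import hashlib
import hmac


def file_sha256(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_hex):
    """Compare the file's SHA256 digest with an expected hex string."""
    return hmac.compare_digest(file_sha256(path), expected_hex.strip().lower())
```

For example, `matches_digest("simbolosiamese-0.1.8.tar.gz", "4c843316b7a741ef2d69e10fee84dc84f7852cf53cd221400c79a89ae2a2661b")` should return `True` for an intact copy of the source distribution.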