A free, standalone, and open-source Khmer grapheme-to-phoneme (G2P) converter.
Khmer Phonemizer
Installation
pip install khmerphonemizer
Usage
from khmerphonemizer import phonemize
text = "នៅលើលោកនេះមិនមានមនុស្សណាម្នាក់ចេះអស់ទេ"
result = phonemize(text)
print(result)
Output
(['នៅលើ',
'លោក',
'នេះ',
'មិន',
'មាន',
'មនុស្ស',
'ណា',
'ម្នាក់',
'ចេះ',
'អស់',
'ទេ'],
[['n', 'ɨ', 'w', 'l', 'əː'],
['l', 'oː', 'k'],
['n', 'i', 'h'],
['m', 'ɨ', 'n'],
['m', 'i', 'ə', 'n'],
['m', 'ɔ', 'n', 'u', 'h'],
['n', 'aː'],
['m', 'n', 'ĕ', 'ə', 'ʔ'],
['c', 'e', 'h'],
['ʔ', 'ɑ', 'h'],
['t', 'eː']])
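The returned tuple holds two lists that appear to be aligned by index: one phoneme list per token. A minimal sketch of pairing them for display follows. The data is copied from the first three words of the output above so the snippet runs without the library; `pair_tokens` is a hypothetical helper, not part of khmerphonemizer.

```python
# First three tokens and phoneme lists, copied from the sample output above.
tokens = ['នៅលើ', 'លោក', 'នេះ']
phonemes = [['n', 'ɨ', 'w', 'l', 'əː'], ['l', 'oː', 'k'], ['n', 'i', 'h']]


def pair_tokens(tokens, phonemes, sep=" "):
    """Return (token, phoneme_string) pairs.

    Assumes both lists are aligned by index, as in the sample output.
    """
    if len(tokens) != len(phonemes):
        raise ValueError("tokens and phonemes must have the same length")
    return [(tok, sep.join(ph)) for tok, ph in zip(tokens, phonemes)]


pairs = pair_tokens(tokens, phonemes)
for tok, ph in pairs:
    print(f"{tok}\t{ph}")
```

With the library installed, the same helper could be called as `pair_tokens(*phonemize(text))`.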
See the examples/ directory for more examples.
API
- phonemize
  Tokenizes the input text into words, phonemizes each word, and returns a tuple of tokens and phonemes.
  - input_str: str
    Text with multiple words.
  - beam: int = 500
    Number of beams for the beam search.
  - min_beam: int = 100
    Minimum number of beams for the beam search.
  - beam_score: float = 0.6
    Beam search score.
  - use_lexicon: bool = True
    Use the lexicon dictionary for known words.
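One way to see what `use_lexicon` does is to run the same text with and without the lexicon and list the words whose phonemes differ. The sketch below takes the phonemizer as an argument so it also works with a stand-in callable. `compare_lexicon` is a hypothetical helper, and it assumes tokenization does not depend on `use_lexicon`, which the description above does not confirm.

```python
def compare_lexicon(text, phonemize_fn):
    """Return (token, with_lexicon, without_lexicon) for words whose phonemes differ.

    phonemize_fn is expected to behave like the documented phonemize():
    it accepts use_lexicon and returns a (tokens, phonemes) tuple.
    Assumption: both calls produce the same tokens in the same order.
    """
    tokens, with_lex = phonemize_fn(text, use_lexicon=True)
    _, without_lex = phonemize_fn(text, use_lexicon=False)
    return [
        (tok, a, b)
        for tok, a, b in zip(tokens, with_lex, without_lex)
        if a != b
    ]


try:
    from khmerphonemizer import phonemize  # import shown in the Usage section
except ImportError:
    phonemize = None  # library not installed; the helper is still usable

if phonemize is not None:
    for token, lex, model in compare_lexicon("នៅលើលោកនេះ", phonemize):
        print(token, lex, model)
```

An empty result would suggest that the lexicon and the model agree on every word in the input.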
- phonemize_single
  Phonemizes a single word.
  - word: str
    Text with a single Khmer or English word only.
  - beam: int = 500
    Number of beams for the beam search.
  - min_beam: int = 100
    Minimum number of beams for the beam search.
  - beam_score: float = 0.6
    Beam search score.
  - use_lexicon: bool = True
    Use the lexicon dictionary for known words.
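When the input is already split into words, `phonemize_single` can be applied per word. The sketch below builds a small pronunciation dictionary and skips repeated words. `build_pronunciations` is a hypothetical helper. The return type of `phonemize_single` is not documented here, so the values are stored as returned, and importing it from the package top level is an assumption.

```python
def build_pronunciations(words, phonemize_single_fn, **options):
    """Map each distinct word to the result of phonemize_single_fn(word, **options).

    options are passed through unchanged, e.g. use_lexicon=False.
    """
    pronunciations = {}
    for word in words:
        if word not in pronunciations:  # phonemize each distinct word once
            pronunciations[word] = phonemize_single_fn(word, **options)
    return pronunciations


try:
    # Assumption: phonemize_single is importable like phonemize.
    from khmerphonemizer import phonemize_single
except ImportError:
    phonemize_single = None  # library not installed or name not exported

if phonemize_single is not None:
    vocabulary = ["លោក", "នេះ", "លោក"]
    for word, result in build_pronunciations(vocabulary, phonemize_single).items():
        print(word, result)
```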
License
MIT
References
Without these awesome projects from awesome people, this wouldn't be possible.
- Khmer Word Search: Challenges, Solutions, and Semantic-Aware Search (Rina Buoy, Nguonly Taing, and Sovisal Chenda)
- CUNY-CL/wikipron (Kyle Gorman, Jackson Lee, and contributors, 2019)
- rhasspy/gruut (Michael Hansen et al., 2020)
- OpenFst (Kyle Gorman et al.)
- AdolfVonKleist/Phonetisaurus (Josef Novak et al., 2017)