
A free, standalone, open-source Khmer grapheme-to-phoneme (G2P) converter.


Khmer Phonemizer



Installation

pip install khmerphonemizer

Usage

from khmerphonemizer import phonemize

text = "នៅលើលោកនេះមិនមានមនុស្សណាម្នាក់ចេះអស់ទេ"
result = phonemize(text)

print(result)

Output

(['នៅលើ',
  'លោក',
  'នេះ',
  'មិន',
  'មាន',
  'មនុស្ស',
  'ណា',
  'ម្នាក់',
  'ចេះ',
  'អស់',
  'ទេ'],
 [['n', 'ɨ', 'w', 'l', 'əː'],
  ['l', 'oː', 'k'],
  ['n', 'i', 'h'],
  ['m', 'ɨ', 'n'],
  ['m', 'i', 'ə', 'n'],
  ['m', 'ɔ', 'n', 'u', 'h'],
  ['n', 'aː'],
  ['m', 'n', 'ĕ', 'ə', 'ʔ'],
  ['c', 'e', 'h'],
  ['ʔ', 'ɑ', 'h'],
  ['t', 'eː']])
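
The returned tuple holds two parallel lists: the tokens, and one phoneme list per token. As a minimal sketch of how the two could be paired up for display, the snippet below hard-codes the first three entries of the output above instead of calling the library. The helper name `format_pairs` is hypothetical and is not part of khmerphonemizer.

```python
# Sample data copied from the first three entries of the output above.
tokens = ['នៅលើ', 'លោក', 'នេះ']
phonemes = [['n', 'ɨ', 'w', 'l', 'əː'], ['l', 'oː', 'k'], ['n', 'i', 'h']]
result = (tokens, phonemes)


def format_pairs(result, sep=" "):
    """Pair each token with its phonemes joined into a single string.

    Hypothetical helper for illustration; not part of khmerphonemizer.
    """
    words, phoneme_lists = result
    return [(word, sep.join(ph)) for word, ph in zip(words, phoneme_lists)]


# Print one tab-separated "word<TAB>pronunciation" line per token.
for word, pronunciation in format_pairs(result):
    print(f"{word}\t{pronunciation}")
```

In real use, `result` would presumably come straight from `phonemize(text)` as in the Usage section.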

See the examples/ directory for more examples.

API

  • phonemize: Tokenizes the input text into words, phonemizes each word, and returns a tuple of (tokens, phonemes).

    • input_str: str. Text containing one or more words.
    • beam: int = 500. Beam size for the beam search.
    • min_beam: int = 100. Minimum beam size for the beam search.
    • beam_score: float = 0.6. Score parameter for the beam search.
    • use_lexicon: bool = True. Use the lexicon dictionary for known words.
  • phonemize_single: Phonemizes a single word.

    • word: str. A single Khmer or English word.
    • beam: int = 500. Beam size for the beam search.
    • min_beam: int = 100. Minimum beam size for the beam search.
    • beam_score: float = 0.6. Score parameter for the beam search.
    • use_lexicon: bool = True. Use the lexicon dictionary for known words.
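
To illustrate how the keyword arguments listed above might be passed through, here is a rough sketch that applies a single-word phonemizer to a list of words. The wrapper `phonemize_words` and the stand-in `fake_phonemize_single` are hypothetical. The sketch also assumes that `phonemize_single` returns a list of phonemes, which the description above does not state.

```python
from typing import Callable, Dict, List


def phonemize_words(
    words: List[str],
    phonemize_single_fn: Callable[..., List[str]],
    beam: int = 500,
    min_beam: int = 100,
    beam_score: float = 0.6,
    use_lexicon: bool = True,
) -> Dict[str, List[str]]:
    """Call a single-word phonemizer on each word, forwarding the
    documented keyword arguments unchanged.

    Hypothetical wrapper for illustration; not part of khmerphonemizer.
    """
    return {
        word: phonemize_single_fn(
            word,
            beam=beam,
            min_beam=min_beam,
            beam_score=beam_score,
            use_lexicon=use_lexicon,
        )
        for word in words
    }


def fake_phonemize_single(word, beam=500, min_beam=100, beam_score=0.6,
                          use_lexicon=True):
    """Stand-in with the documented signature; it only splits the word
    into characters so the sketch runs without the package installed."""
    return list(word)


print(phonemize_words(["ab", "cd"], fake_phonemize_single, use_lexicon=False))
```

With the package installed, the stand-in would be replaced by the library's `phonemize_single`, assuming it can be imported from `khmerphonemizer` the same way `phonemize` is in the Usage section.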

License

MIT


References

Without these awesome projects from awesome people, this wouldn't be possible.



