Skip to main content

Burmese text normalizer, wordbreak, converter, cleaner and phonemizer for speech related tasks.

Project description

BURMESE PHONEMIZER AND CLEANER(BPC)

Python package Total alerts Language grade: Python

Burmese Language data prepartion for speech related tasks.

Installation

$ pip install bpc

or

$ pip install git+git://github.com:1chimaruGin/Burmese_Phomizer_and_Cleaner.git

Usage

For text Cleaning

from bpc import Cleaner

cc = Cleaner()
cc.clean_text("မင်္ဂလာပါ? မင်္ဂလာပါ။ ၀န်းရံ ဝ၁၂၃၄ 5B")

# output: မင်္ဂလာပါ မင်္ဂလာပါ။ ဝန်းရံ ၀၁၂၃၄ ၅ဘီ

For phonemization

from bpc import BurmesePhoneme

bp = BurmesePhonemizer()
bp.text_to_phone("မင်္ဂလာပါ")

# output: ['m', 'ŋ', 'ɡ', 'l', 't', 's', 'p', 'ˈe']

For data preparation

from bpc.dataset import PrepareDataset

dataset = PrepareDataset()
dataset.prepare_data(path='path/to/dataset', method='kfold', save=True)

References

Citations

@inproceedings{watanabe2018espnet,
  author={Shinji Watanabe and Takaaki Hori and Shigeki Karita and Tomoki Hayashi and Jiro Nishitoba and Yuya Unno and Nelson {Enrique Yalta Soplin} and Jahn Heymann and Matthew Wiesner and Nanxin Chen and Adithya Renduchintala and Tsubasa Ochiai},
  title={{ESPnet}: End-to-End Speech Processing Toolkit},
  year={2018},
  booktitle={Proceedings of Interspeech},
  pages={2207--2211},
  doi={10.21437/Interspeech.2018-1456},
  url={http://dx.doi.org/10.21437/Interspeech.2018-1456
}

@article{Bernard2021,
  doi = {10.21105/joss.03958},
  url = {https://doi.org/10.21105/joss.03958},
  year = {2021},
  publisher = {The Open Journal},
  volume = {6},
  number = {68},
  pages = {3958},
  author = {Mathieu Bernard and Hadrien Titeux},
  title = {Phonemizer: Text to Phones Transcription for Multiple Languages in Python},
  journal = {Journal of Open Source Software}
}

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

bpc-0.1.3.tar.gz (30.4 kB view details)

Uploaded Source

File details

Details for the file bpc-0.1.3.tar.gz.

File metadata

  • Download URL: bpc-0.1.3.tar.gz
  • Upload date:
  • Size: 30.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.4.2 requests/2.22.0 setuptools/45.2.0 requests-toolbelt/0.8.0 tqdm/4.62.3 CPython/3.8.10

File hashes

Hashes for bpc-0.1.3.tar.gz
Algorithm Hash digest
SHA256 c471eb7857cae2470f323db8631fa39918dea65922612655fe3c7b5d1934ee95
MD5 8bffac2547c90808ac90cdd722dd39fa
BLAKE2b-256 77824bce4703e9f7a497b7d7d5fe40879909079b32fdca72c727ff56eda74a38

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page