Burmese text normalizer, word breaker, converter, cleaner, and phonemizer for speech-related tasks.

BURMESE PHONEMIZER AND CLEANER (BPC)

Burmese language data preparation for speech-related tasks.

Installation

$ pip install bpc

or

$ pip install git+https://github.com/1chimaruGin/Burmese_Phomizer_and_Cleaner.git
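
To confirm that the install succeeded, the distribution metadata can be queried from Python. This is a minimal sketch using only the standard library. `installed_version` is a hypothetical helper name, not part of bpc, and the sketch assumes the distribution is registered under the name `bpc`, as in the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints a version string if bpc is installed, otherwise None.
print(installed_version("bpc"))
```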

Usage

For text cleaning

from bpc import Cleaner

cc = Cleaner()
cc.clean_text("မင်္ဂလာပါ? မင်္ဂလာပါ။ ၀န်းရံ ဝ၁၂၃၄ 5B")

# output: မင်္ဂလာပါ မင်္ဂလာပါ ၀န်းရံ ဝ၁၂၃၄ 5B
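
In the example above, the question mark and the Burmese section mark are dropped while letters, digits, and spacing are kept. The sketch below reproduces only that visible behavior with the standard library, to show the idea. It is not the `Cleaner` implementation, `strip_punctuation` is a hypothetical name, and the set of characters treated as punctuation is an assumption.

```python
import string

# Assumption: ASCII punctuation plus the Burmese marks U+104A (၊) and U+104B (။)
# are the characters to drop. The real Cleaner may handle more cases.
_PUNCTUATION = string.punctuation + "\u104a\u104b"
_TO_SPACE = str.maketrans({ch: " " for ch in _PUNCTUATION})


def strip_punctuation(text: str) -> str:
    """Replace punctuation with spaces, then collapse runs of whitespace."""
    return " ".join(text.translate(_TO_SPACE).split())


print(strip_punctuation("မင်္ဂလာပါ? မင်္ဂလာပါ။ 5B"))
```

Replacing punctuation with a space before collapsing whitespace keeps words apart even when a mark sits directly between them.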

For phonemization

from bpc import BurmesePhonemizer

bp = BurmesePhonemizer()
bp.text_to_phone("မင်္ဂလာပါ")

# output: ['m', 'ŋ', 'ɡ', 'l', 't', 's', 'p', 'ˈe']
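
Speech models usually consume integer ids rather than phone strings. The following sketch shows one way to turn phone lists like the output above into ids. `build_phone_vocab`, `encode`, and the `<pad>` symbol are hypothetical names chosen for illustration and are not part of bpc.

```python
from typing import Dict, Iterable, List

PAD = "<pad>"  # hypothetical padding symbol, reserved as id 0


def build_phone_vocab(utterances: Iterable[List[str]]) -> Dict[str, int]:
    """Map every distinct phone symbol to an integer id (0 is reserved for PAD)."""
    symbols = sorted({phone for phones in utterances for phone in phones})
    vocab = {PAD: 0}
    for symbol in symbols:
        vocab[symbol] = len(vocab)
    return vocab


def encode(phones: List[str], vocab: Dict[str, int]) -> List[int]:
    """Convert a phone sequence to ids; raises KeyError on unseen symbols."""
    return [vocab[phone] for phone in phones]


# Sample data copied from the documented output above.
phones = ['m', 'ŋ', 'ɡ', 'l', 't', 's', 'p', 'ˈe']
vocab = build_phone_vocab([phones])
print(encode(phones, vocab))
```

Sorting the symbols before assigning ids keeps the mapping stable across runs, which matters when a trained model is reloaded later.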

For data preparation

from bpc.dataset import PrepareDataset

dataset = PrepareDataset()
dataset.prepare_data(path='path/to/dataset', method='kfold', save=True)
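
The `method='kfold'` argument suggests k-fold cross-validation splits. How `PrepareDataset` builds them is not documented here, so the sketch below only illustrates the general technique with the standard library. `kfold_indices` and its parameters are hypothetical and not part of bpc.

```python
import random
from typing import Iterator, List, Tuple


def kfold_indices(
    n_items: int, k: int = 5, seed: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, validation) index lists for each of k folds."""
    if not 2 <= k <= n_items:
        raise ValueError("k must be between 2 and n_items")
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at i,
    # so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i, val in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, val


# With 10 items and k=5, every fold has 8 training and 2 validation items.
for fold, (train_idx, val_idx) in enumerate(kfold_indices(10, k=5)):
    print(fold, len(train_idx), len(val_idx))
```

Each item lands in the validation set of exactly one fold, so every utterance is used for both training and evaluation across the k runs.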

References

Citations

@inproceedings{watanabe2018espnet,
  author={Shinji Watanabe and Takaaki Hori and Shigeki Karita and Tomoki Hayashi and Jiro Nishitoba and Yuya Unno and Nelson {Enrique Yalta Soplin} and Jahn Heymann and Matthew Wiesner and Nanxin Chen and Adithya Renduchintala and Tsubasa Ochiai},
  title={{ESPnet}: End-to-End Speech Processing Toolkit},
  year={2018},
  booktitle={Proceedings of Interspeech},
  pages={2207--2211},
  doi={10.21437/Interspeech.2018-1456},
  url={http://dx.doi.org/10.21437/Interspeech.2018-1456}
}

@article{Bernard2021,
  doi = {10.21105/joss.03958},
  url = {https://doi.org/10.21105/joss.03958},
  year = {2021},
  publisher = {The Open Journal},
  volume = {6},
  number = {68},
  pages = {3958},
  author = {Mathieu Bernard and Hadrien Titeux},
  title = {Phonemizer: Text to Phones Transcription for Multiple Languages in Python},
  journal = {Journal of Open Source Software}
}
