Python package for text normalization, used as a front end for text-to-speech research

Install ViNorm package

pip install vinorm

Usage in a Python script

from vinorm import TTSnorm
S = TTSnorm("Hàm này được phát triển từ 8/2019. Có phải tháng 12/2020 đã có vaccine phòng ngừa Covid-19 xmz ?")
print(S)  # show the normalized text

Options

TTSnorm(text, punc=False, unknown=True, lower=True, rule=False)
  • lower: if True, return the normalized text in lowercase.
  • rule: if True, normalize with regex rules only, without dictionary checking (this flag is not used together with the other flags).
  • punc: if True, do not replace punctuation with periods and commas.
  • unknown: if True, replace unknown words: discard undefined words that contain no vowel, and do not spell out words that contain a vowel.
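To see what each flag changes, it can help to run the same text through the normalizer once per setting. The following is a minimal sketch: `compare_flags` is a hypothetical helper, not part of vinorm, and it takes the normalizer as an argument, so it assumes only the keyword names from the signature above.

```python
def compare_flags(normalize, text):
    """Call `normalize` on `text` once per flag setting and collect the results.

    `normalize` is assumed to accept the keyword arguments listed above
    (punc, unknown, lower, rule), as TTSnorm does. This helper is
    illustrative only and is not part of vinorm.
    """
    settings = {
        "default": {},                    # library defaults
        "punc_true": {"punc": True},      # flip punc only
        "lower_false": {"lower": False},  # flip lower only
        "rule_true": {"rule": True},      # rule is passed on its own
    }
    return {name: normalize(text, **kwargs) for name, kwargs in settings.items()}
```

With vinorm installed, `compare_flags(TTSnorm, "Tháng 12/2020")` would return a dictionary with one normalized string per setting, which makes the differences easy to compare side by side.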

From version 2.0, unknown words are no longer replaced; they are left unchanged so that espeak can handle them in the phonemization step.

  • This version does not handle cases such as "Tổ chức WTO": WTO is not in the dictionary, so it is treated as unknown and kept as-is instead of being spelled out as in version 1.0. The aim is to let espeak handle such words; the drawback is that espeak's output for this case is "ve1kɛɜpte1ɔ7", which is not split into syllables.
  • New entities need to be added to the dictionary.
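The difference between the two versions for out-of-dictionary words can be illustrated with a toy example. This is only a sketch and not vinorm's implementation: `KNOWN_WORDS` is a hypothetical two-word stand-in for the real dictionary, and splitting a token into characters stands in for whatever letter-by-letter reading version 1.0 produced.

```python
import unicodedata

# Hypothetical stand-in for the real dictionary (assumption, not vinorm data).
KNOWN_WORDS = {unicodedata.normalize("NFC", w) for w in ("tổ", "chức")}


def handle_unknown(text, spell_unknown):
    """Toy model of unknown-word handling.

    spell_unknown=True  mimics the version 1.0 idea: spell unknown words
                        character by character.
    spell_unknown=False mimics the version 2.0 idea: keep unknown words
                        unchanged for a later stage such as espeak.
    """
    text = unicodedata.normalize("NFC", text)
    out = []
    for token in text.split():
        if token.lower() in KNOWN_WORDS:
            out.append(token)            # known word: keep as-is
        elif spell_unknown:
            out.append(" ".join(token))  # unknown: split into characters
        else:
            out.append(token)            # unknown: pass through untouched
    return " ".join(out)


print(handle_unknown("Tổ chức WTO", spell_unknown=True))   # Tổ chức W T O
print(handle_unknown("Tổ chức WTO", spell_unknown=False))  # Tổ chức WTO
```

In the pass-through case the token reaches the phonemizer as a single word, which is why the espeak output quoted above is not split into syllables.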

For the latest version, see: https://github.com/NoahDrisort/vinorm

For version 1.0, which spells unknown words character by character, check the previous commits.

For the C++ version: https://github.com/NoahDrisort/vinorm_cpp_version

Updating the PyPI release

python setup.py sdist bdist_wheel
twine upload dist/*
