Simple text to phones converter for multiple languages

Phonemizer -- foʊnmaɪzɚ

  • The phonemizer allows simple phonemization of words and texts in many languages.

  • Provides both the phonemize command-line tool and the Python function phonemizer.phonemize. See the package's documentation.

  • It is based on four backends: espeak, espeak-mbrola, festival and segments. The backends have different properties and capabilities, summarized in the table below. The choice of backend is left to the user.

    • espeak-ng is a text-to-speech engine that supports many languages and produces IPA (International Phonetic Alphabet) output.

    • espeak-ng-mbrola uses the SAMPA phonetic alphabet instead of IPA but does not preserve word boundaries.

    • festival is another text-to-speech engine. Its phonemizer backend currently supports only American English. It uses a custom phoneset, but it allows tokenization at the syllable level.

    • segments is a Unicode tokenizer that builds a phonemization from a grapheme-to-phoneme mapping provided as a file by the user.

    |                          | espeak | espeak-mbrola | festival   | segments     |
    |--------------------------|--------|---------------|------------|--------------|
    | phone set                | IPA    | SAMPA         | custom     | user defined |
    | supported languages      | 100+   | 35            | US English | user defined |
    | processing speed         | fast   | slow          | very slow  | fast         |
    | phone tokens             | ✔      | ✔             | ✔          | ✔            |
    | syllable tokens          | ✘      | ✘             | ✔          | ✘            |
    | word tokens              | ✔      | ✘             | ✔          | ✔            |
    | punctuation preservation | ✔      | ✘             | ✔          | ✔            |
    | stressed phones          | ✔      | ✘             | ✘          | ✘            |
    | tie                      | ✔      | ✘             | ✘          | ✘            |
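
Because the backend choice is left to the user, it can help to encode the comparison table as data and filter it by the features you need. The following is a minimal sketch, not part of the package: `BACKENDS`, `pick_backends` and `transcribe` are hypothetical helper names, and the capability keys are shorthand for the table rows. The only library call used is `phonemizer.phonemize`, and it assumes the chosen backend (espeak by default here) is installed on the system.

```python
# Capabilities copied from the comparison table (hypothetical helper data,
# not an API of the phonemizer package).
BACKENDS = {
    "espeak": {"phone_set": "IPA", "syllables": False, "words": True,
               "punctuation": True, "stress": True, "tie": True},
    "espeak-mbrola": {"phone_set": "SAMPA", "syllables": False, "words": False,
                      "punctuation": False, "stress": False, "tie": False},
    "festival": {"phone_set": "custom", "syllables": True, "words": True,
                 "punctuation": True, "stress": False, "tie": False},
    "segments": {"phone_set": "user defined", "syllables": False, "words": True,
                 "punctuation": True, "stress": False, "tie": False},
}


def pick_backends(**required):
    """Return the backend names whose capabilities match every requirement."""
    return [
        name
        for name, caps in BACKENDS.items()
        if all(caps.get(key) == value for key, value in required.items())
    ]


def transcribe(text, backend="espeak", language="en-us"):
    """Phonemize `text` with the given backend.

    The import is done lazily so that pick_backends() stays usable even
    when phonemizer (or its system backends) is not installed.
    """
    from phonemizer import phonemize

    return phonemize(text, language=language, backend=backend, strip=True)


print(pick_backends(syllables=True))  # ['festival']
print(pick_backends(stress=True))     # ['espeak']
```

With espeak installed, `transcribe("hello world")` should return the IPA transcription as a string. The segments backend takes a grapheme-to-phoneme profile rather than a language code, so this sketch would need adapting for it.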

Citation

To reference the phonemizer in your own work, please cite the following JOSS paper.

@article{Bernard2021,
  doi = {10.21105/joss.03958},
  url = {https://doi.org/10.21105/joss.03958},
  year = {2021},
  publisher = {The Open Journal},
  volume = {6},
  number = {68},
  pages = {3958},
  author = {Mathieu Bernard and Hadrien Titeux},
  title = {Phonemizer: Text to Phones Transcription for Multiple Languages in Python},
  journal = {Journal of Open Source Software}
}

Licence

Copyright 2015-2021 Mathieu Bernard

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.
