Simple text to phones converter for multiple languages
Project description
Phonemizer -- foʊnmaɪzɚ
- The phonemizer allows simple phonemization of words and texts in many languages.
- It provides both the `phonemize` command-line tool and the Python function `phonemizer.phonemize`. See the package's documentation.
- It is based on four backends: espeak, espeak-mbrola, festival and segments. The backends have different properties and capabilities, summarized in the table below. The choice of backend is left to the user.
  - espeak-ng is Text-to-Speech software that supports many languages and IPA (International Phonetic Alphabet) output.
  - espeak-ng-mbrola uses the SAMPA phonetic alphabet instead of IPA and does not preserve word boundaries.
  - festival is another Text-to-Speech engine. Its phonemizer backend currently supports only American English. It uses a custom phoneset, but it allows tokenization at the syllable level.
  - segments is a Unicode tokenizer that builds a phonemization from a grapheme-to-phoneme mapping provided as a file by the user.
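As a minimal sketch of the Python entry point, the wrapper below calls `phonemizer.phonemize` with the espeak backend. It assumes the package and the espeak-ng library are installed. `phonemize_text` is a hypothetical helper name, and the language and separator values are illustrative choices, not recommendations.

```python
def phonemize_text(text, language="en-us", backend="espeak"):
    """Return the phonemized form of `text` (hypothetical convenience wrapper)."""
    # Imported lazily so this module still loads when phonemizer is not installed.
    from phonemizer import phonemize
    from phonemizer.separator import Separator

    # Illustrative separators: a space between phones, " | " between words.
    separator = Separator(phone=" ", word=" | ")
    return phonemize(
        text,
        language=language,
        backend=backend,
        separator=separator,
        strip=True,  # drop the trailing separators from the output
    )
```

Calling `phonemize_text("hello world")` should return an IPA string. The exact transcription depends on the installed espeak-ng version.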
| | espeak | espeak-mbrola | festival | segments |
|---|---|---|---|---|
| phone set | IPA | SAMPA | custom | user defined |
| supported languages | 100+ | 35 | US English | user defined |
| processing speed | fast | slow | very slow | fast |
| phone tokens | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| syllable tokens | :x: | :x: | :heavy_check_mark: | :x: |
| word tokens | :heavy_check_mark: | :x: | :heavy_check_mark: | :heavy_check_mark: |
| punctuation preservation | :heavy_check_mark: | :x: | :heavy_check_mark: | :heavy_check_mark: |
| stressed phones | :heavy_check_mark: | :x: | :x: | :x: |
| tie | :heavy_check_mark: | :x: | :x: | :x: |
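Since the backend choice is left to the user, one way to reason about it is to encode the capability rows of the table above and filter on the features a task needs. This is only a sketch: `CAPABILITIES` and `backends_supporting` are hypothetical names, the feature keys are shorthand invented here, and nothing below is part of the phonemizer API.

```python
# Capability rows transcribed from the table above (feature keys are shorthand).
CAPABILITIES = {
    "espeak": {"phone", "word", "punctuation", "stress", "tie"},
    "espeak-mbrola": {"phone"},
    "festival": {"phone", "syllable", "word", "punctuation"},
    "segments": {"phone", "word", "punctuation"},
}


def backends_supporting(*features):
    """Return the backends, in table order, that offer every requested feature."""
    wanted = set(features)
    known = set().union(*CAPABILITIES.values())
    unknown = wanted - known
    if unknown:
        raise ValueError(f"unknown feature(s): {sorted(unknown)}")
    # A backend qualifies when the wanted features are a subset of its capabilities.
    return [name for name, caps in CAPABILITIES.items() if wanted <= caps]
```

For example, `backends_supporting("syllable")` yields only `festival`, while `backends_supporting("word", "punctuation")` excludes `espeak-mbrola`.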
Citation
To reference the `phonemizer` in your own work, please cite the following JOSS paper.
@article{Bernard2021,
doi = {10.21105/joss.03958},
url = {https://doi.org/10.21105/joss.03958},
year = {2021},
publisher = {The Open Journal},
volume = {6},
number = {68},
pages = {3958},
author = {Mathieu Bernard and Hadrien Titeux},
title = {Phonemizer: Text to Phones Transcription for Multiple Languages in Python},
journal = {Journal of Open Source Software}
}
Licence
Copyright 2015-2021 Mathieu Bernard
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.