en-tts

Web app, command-line interface and Python library for synthesizing English texts into speech.

Installation

pip install en-tts --user

Usage as web app

Visit 🤗 Hugging Face for a live demo.

You can also run it locally by executing en-tts-web in the CLI and opening http://127.0.0.1:7860 in your browser.

Usage as CLI

en-tts-cli synthesize "When the sunlight strikes raindrops in the air, they act as a prism and form a rainbow."

The output can be listened to here.

Usage as library

from pathlib import Path
from tempfile import gettempdir

from en_tts import Synthesizer, Transcriber, normalize_audio, save_audio

text = "When the sunlight strikes raindrops in the air, they act as a prism and form a rainbow."

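# Load the transcriber (text to IPA) and the synthesizer (IPA to audio)
transcriber = Transcriber()
synthesizer = Synthesizer()

# Transcribe the English text to IPA, then synthesize speech from the IPA
text_ipa = transcriber.transcribe_to_ipa(text)
audio = synthesizer.synthesize(text_ipa)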

tmp_dir = Path(gettempdir())
save_audio(audio, tmp_dir / "output.wav")

# Optional: normalize output
normalize_audio(tmp_dir / "output.wav", tmp_dir / "output_norm.wav")

Model info

The TTS model used is published here.

Evaluation results:

  • MOS naturalness: 3.55 ± 0.28 (GT: 4.17 ± 0.23)
  • MOS intelligibility: 4.44 ± 0.24 (GT: 4.63 ± 0.19)
  • Mean MCD-DTW: 29.15
  • Mean penalty: 0.1018

Phoneme set

  • Vowels: i, u, æ, ɑ, ɔ, ə, ɛ, ɪ, ʊ, ʌ
  • Diphthongs: aɪ, aʊ, eɪ, oʊ, ɔɪ
  • R-colored vowels: ɔr, ər, ɛr, ɪr, ʊr, ʌr
  • Consonants: b, d, dʒ, f, h, j, k, l, m, n, p, r, s, t, tʃ, v, w, z, ð, ŋ, ɡ, ʃ, θ
  • Breaks:
    • SIL0 (no break)
    • SIL1 (short break)
    • SIL2 (break)
    • SIL3 (long break)
  • Special characters: . ? ! , : ; - — " ' ( ) [ ]
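The inventory above can be written down as plain Python sets, which makes it easy to check whether a symbol belongs to the phoneme set before passing it on. This is only an illustrative sketch: the set names and the `classify` helper are hypothetical and are not part of the en_tts API.

```python
# Phoneme inventory copied from the list above.
# All names here are hypothetical; they are not provided by en_tts.
VOWELS = {"i", "u", "æ", "ɑ", "ɔ", "ə", "ɛ", "ɪ", "ʊ", "ʌ"}
DIPHTHONGS = {"aɪ", "aʊ", "eɪ", "oʊ", "ɔɪ"}
R_COLORED_VOWELS = {"ɔr", "ər", "ɛr", "ɪr", "ʊr", "ʌr"}
CONSONANTS = {
    "b", "d", "dʒ", "f", "h", "j", "k", "l", "m", "n", "p", "r",
    "s", "t", "tʃ", "v", "w", "z", "ð", "ŋ", "ɡ", "ʃ", "θ",
}
BREAKS = {"SIL0", "SIL1", "SIL2", "SIL3"}
SPECIAL_CHARACTERS = set('.?!,:;-—"\'()[]')

# Checked in this order; the sets do not overlap, so order does not matter.
_CATEGORIES = [
    ("vowel", VOWELS),
    ("diphthong", DIPHTHONGS),
    ("r-colored vowel", R_COLORED_VOWELS),
    ("consonant", CONSONANTS),
    ("break", BREAKS),
    ("special character", SPECIAL_CHARACTERS),
]


def classify(symbol: str) -> str:
    """Return the category of a bare symbol, or "unknown" if it is not listed."""
    for name, members in _CATEGORIES:
        if symbol in members:
            return name
    return "unknown"


if __name__ == "__main__":
    for s in ["oʊ", "dʒ", "SIL2", "x"]:
        print(s, "->", classify(s))
```

The helper expects bare symbols, i.e. without any stress or duration markers attached.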

Each vowel, diphthong, r-colored vowel and consonant can have one of these duration markers:

  • ˘ -> very short, e.g., oʊ˘
  • nothing -> normal, e.g., oʊ
  • ˑ -> half long, e.g., oʊˑ
  • ː -> long, e.g., oʊː

Furthermore, each vowel, diphthong and r-colored vowel can have a leading stress symbol attached:

  • ˈ -> primary stress, e.g., ˈoʊ
  • ˌ -> secondary stress, e.g., ˌoʊ
  • nothing -> no stress, e.g., oʊ

Stress and duration markers can be combined, e.g., ˌoʊː
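To illustrate how the markers combine, the following sketch splits a marked symbol into its stress, phoneme and duration parts. It assumes, as in the examples above, that the stress symbol comes first and the duration marker last. `ParsedSymbol` and `parse_symbol` are hypothetical names, not part of the en_tts API, and the sketch does not check that stress is only attached to vowels.

```python
from typing import NamedTuple

# Marker tables taken from the lists above.
STRESS_MARKERS = {"ˈ": "primary", "ˌ": "secondary"}
DURATION_MARKERS = {"˘": "very short", "ˑ": "half long", "ː": "long"}


class ParsedSymbol(NamedTuple):
    """Hypothetical container for the three parts of a marked symbol."""
    stress: str
    phoneme: str
    duration: str


def parse_symbol(symbol: str) -> ParsedSymbol:
    """Split a symbol such as "ˌoʊː" into stress, phoneme and duration."""
    stress = "none"
    duration = "normal"
    # A leading stress symbol is optional.
    if symbol and symbol[0] in STRESS_MARKERS:
        stress = STRESS_MARKERS[symbol[0]]
        symbol = symbol[1:]
    # A trailing duration marker is optional.
    if symbol and symbol[-1] in DURATION_MARKERS:
        duration = DURATION_MARKERS[symbol[-1]]
        symbol = symbol[:-1]
    return ParsedSymbol(stress, symbol, duration)


if __name__ == "__main__":
    for s in ["oʊ", "ˈoʊ", "oʊ˘", "ˌoʊː"]:
        print(s, "->", parse_symbol(s))
```

For example, `parse_symbol("ˌoʊː")` separates the secondary stress and the long marker from the diphthong oʊ.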

Citation

If you want to cite this repo, you can use the BibTeX entry generated by GitHub (see About => Cite this repository).

Acknowledgments

Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – Project-ID 416228727 – CRC 1410

The authors gratefully acknowledge the GWK support for funding this project by providing computing time through the Center for Information Services and HPC (ZIH) at TU Dresden.

The authors are grateful to the Center for Information Services and High Performance Computing [Zentrum für Informationsdienste und Hochleistungsrechnen (ZIH)] at TU Dresden for providing its facilities for high throughput calculations.
