
Ready-to-use Multilingual Text-To-Speech (TTS) package.


PyPI version GitHub Issues Contributions welcome License: MIT

EasyTTS is an open-source and ready-to-use Multilingual Text-To-Speech (TTS) package.

The goal is to simplify the use of state-of-the-art text-to-speech models for a variety of languages (French, English, ...).

⚠️ EasyTTS is currently in beta. ⚠️

Quick installation

EasyTTS is constantly evolving: new features, tutorials, and documentation will appear over time. Install EasyTTS from PyPI to start using the standard library quickly, or install it locally from GitHub if you want to run experiments or customize the toolkit. EasyTTS supports both CPU and GPU computation. To use a GPU, CUDA must be properly installed.
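As a rough pre-flight check for the GPU requirement, the sketch below tests whether the NVIDIA driver tool `nvidia-smi` is on the PATH and runs successfully. The helper name `nvidia_driver_visible` is hypothetical and not part of EasyTTS. A positive result only suggests that the driver is installed; it does not prove that EasyTTS or its backend will actually use the GPU.

```python
import shutil
import subprocess


def nvidia_driver_visible():
    """Return True if `nvidia-smi` is on PATH and exits successfully.

    Hypothetical helper, not part of EasyTTS. It checks the driver only,
    not the CUDA toolkit or the deep-learning backend.
    """
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False
    try:
        result = subprocess.run(
            [exe],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


print("NVIDIA driver visible:", nvidia_driver_visible())
```

If this prints False, EasyTTS can still be used on CPU, as stated above.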

Anaconda setup

conda create --name EasyTTS python=3.7 -y
conda activate EasyTTS
pip install git+https://github.com/repodiac/german_transliterate

More information on managing environments with Anaconda can be found in the conda cheat sheet.

Install via PyPI

Once you have created your Python environment (Python 3.7+) you can simply type:

pip install EasyTTS
pip install git+https://github.com/repodiac/german_transliterate
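To confirm that the installation worked, one option is to query the installed distribution's version from Python. This is a minimal sketch using the standard `importlib.metadata` module (Python 3.8+). On Python 3.7 it assumes the `importlib_metadata` backport is available. The helper name `installed_version` is hypothetical.

```python
try:
    from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
except ImportError:
    # Assumption: on Python 3.7 the importlib_metadata backport is installed
    from importlib_metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints the version if EasyTTS is installed, otherwise None
print("EasyTTS version:", installed_version("EasyTTS"))
```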

Install with GitHub

Once you have created your Python environment (Python 3.7+) you can simply type:

git clone https://github.com/qanastek/EasyTTS.git
cd EasyTTS
pip install -r requirements.txt
pip install --editable .

Because the package is installed with the --editable flag, any change you make to the EasyTTS source code takes effect immediately, without reinstalling.
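To check that Python really imports the package from your local clone rather than from a copy in site-packages, you can inspect where the import system finds it. The sketch below uses only the standard library. `package_origin` is a hypothetical helper name.

```python
import importlib.util


def package_origin(name):
    """Return the file a top-level package is loaded from, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


# With an editable install, this path should point inside the cloned repository
print(package_origin("EasyTTS"))
```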

Example Usage

import soundfile as sf
from EasyTTS.inference.TTS import TTS

tts = TTS(lang="fr") # Instantiate the model for your language
audio = tts.predict(text="Bonjour à tous") # Make a prediction

sf.write('./audio_pip.wav', audio, 22050, "PCM_16") # Save output in .WAV file
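The same pattern extends to several sentences. The sketch below is one possible way to do it, under two assumptions that the example above does not confirm: that `predict` returns a one-dimensional sequence of float samples in the range [-1.0, 1.0], and that the output rate is 22050 Hz as in the `sf.write` call. The helpers `save_wav` and `synthesize_all` are hypothetical and not part of EasyTTS. They take the prediction function as an argument and write 16-bit PCM files with the standard `wave` module.

```python
import struct
import wave
from pathlib import Path

SAMPLE_RATE = 22050  # same rate as in the sf.write call above


def save_wav(path, samples, sample_rate=SAMPLE_RATE):
    """Write float samples (assumed to lie in [-1.0, 1.0]) as 16-bit mono PCM."""
    frames = bytearray()
    for sample in samples:
        clipped = max(-1.0, min(1.0, float(sample)))  # guard against overflow
        frames += struct.pack("<h", int(clipped * 32767))
    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))


def synthesize_all(predict, sentences, out_dir):
    """Call predict(text=...) for each sentence and save one WAV file per sentence."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, sentence in enumerate(sentences):
        audio = predict(text=sentence)
        path = out_dir / f"sentence_{index:03d}.wav"
        save_wav(path, audio)
        paths.append(path)
    return paths
```

With the `tts` object from the example above, a call would look like `synthesize_all(tts.predict, ["Bonjour à tous", "Merci beaucoup"], "./out")`. If the assumptions about the returned audio do not hold for your version, keep using `sf.write` as shown above.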

Audio Samples

Sentence | Language | Audio File
Comme le capitaine prononçait ces mots, un éclair illumina les ondes de l'Atlantique, puis une détonation se fit entendre et deux boulets ramés balayèrent le pont de l'Alcyon. | FR | audio_fr.wav
We shall not flag or fail. We shall go on to the end... we shall never surrender. | EN | audio_en.wav

Model architectures

  1. Tacotron 2 (from Google Research & University of California, Berkeley) released with the paper NATURAL TTS SYNTHESIS BY CONDITIONING WAVENET ON MEL SPECTROGRAM PREDICTIONS, by Jonathan Shen, Ruoming Pang, Ron J. Weiss, Mike Schuster, Navdeep Jaitly, Zongheng Yang, Zhifeng Chen, Yu Zhang, Yuxuan Wang, RJ Skerry-Ryan, Rif A. Saurous, Yannis Agiomyrgiannakis and Yonghui Wu.

Datasets used

  1. SynPaFlex (from IRISA, LLF (Laboratoire de Linguistique Formelle de Nantes), LIUM (Le Mans Université) and ATILF (Analyse et Traitement Informatique de la Langue Française)) released with the paper SynPaFlex-Corpus: An Expressive French Audiobooks Corpus Dedicated to Expressive Speech Synthesis, by Aghilas Sini, Damien Lolive, Gaëlle Vidal, Marie Tahon and Élisabeth Delais-Roussarie.

Build PyPI package

Build: python setup.py sdist bdist_wheel

Upload: twine upload dist/*
