A demo of zh/Chinese Text to Speech system run on CPU


A demo of zh/Chinese Text to Speech system run on CPU in real time. (fastspeech2 + mbmelgan)

RTF (real-time factor): 0.2 with FastSpeech2 and 1.6 with Tacotron2, measured on an Intel(R) Core(TM) i5-7200U CPU @ 2.50GHz generating 24 kHz audio.
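
RTF is the ratio of synthesis time to the duration of the audio produced, so a value below 1 means synthesis runs faster than real time. A minimal sketch of the arithmetic, assuming the 24 kHz sample rate quoted above; the function name and the timing figures are illustrative, not measurements from this repo:

```python
def real_time_factor(synthesis_seconds, num_samples, sample_rate=24000):
    """Return synthesis time divided by the duration of the generated audio."""
    audio_seconds = num_samples / sample_rate
    return synthesis_seconds / audio_seconds


# Hypothetical numbers: 10 s of audio (240,000 samples at 24 kHz)
# synthesized in 2 s of wall-clock time.
rtf = real_time_factor(2.0, 240_000)
print(f"RTF = {rtf:.2f}")
```

In practice the synthesis time would be measured around the synthesis call with a timer such as time.perf_counter().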

This repo is mainly based on TensorFlowTTS, with minor improvements.

demo wav

text = "2020年,这是一个开源的端到端中文语音合成系统"


pip install zhtts

Alternatively, clone this repo and run pip install . from its root directory.


import zhtts

text = "2020年,这是一个开源的端到端中文语音合成系统"
tts = zhtts.TTS() # use fastspeech2 by default

tts.text2wav(text, "demo.wav")
>>> Save wav to demo.wav

>>> ('二零二零年,这是一个开源的端到端中文语音合成系统', 'sil ^ er4 #0 l ing2 #0 ^ er4 #0 l ing2 #0 n ian2 #0 #3 zh e4 #0 sh iii4 #0 ^ i2 #0 g e4 #0 k ai1 #0 ^ van2 #0 d e5 #0 d uan1 #0 d ao4 #0 d uan1 #0 zh ong1 #0 ^ uen2 #0 ^ v3 #0 ^ in1 #0 h e2 #0 ch eng2 #0 x i4 #0 t ong3 sil')

>>> array([0., 0., 0., ..., 0., 0., 0.], dtype=float32)
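
The float32 array shown above is raw waveform data. If you want to encode such samples as WAV yourself rather than through text2wav, the following is a sketch using only the standard library. It assumes mono samples in the range [-1.0, 1.0] at 24 kHz; float_to_pcm16_wav is a hypothetical helper and not part of zhtts:

```python
import io
import struct
import wave


def float_to_pcm16_wav(samples, sample_rate=24000):
    """Encode float samples (assumed in [-1.0, 1.0]) as mono 16-bit PCM WAV bytes."""
    # Clip to the valid range, then scale to signed 16-bit integers.
    ints = [int(round(max(-1.0, min(1.0, float(s))) * 32767)) for s in samples]
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(1)        # mono
        wf.setsampwidth(2)        # 2 bytes per sample = 16-bit PCM
        wf.setframerate(sample_rate)
        wf.writeframes(struct.pack("<%dh" % len(ints), *ints))
    return buf.getvalue()


# Usage with a few placeholder samples; a real waveform would be passed instead.
wav_bytes = float_to_pcm16_wav([0.0, 0.5, -0.5, 0.0])
with open("manual_demo.wav", "wb") as f:
    f.write(wav_bytes)
```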

web api demo

Clone this repo and pip install flask first. Once the demo server is running on localhost:5000, request audio with curl:

$ curl -o "helloworld.wav" "http://localhost:5000/api/tts?text=%E4%BD%A0%E5%A5%BD%E4%B8%96%E7%95%8C"

%E4%BD%A0%E5%A5%BD%E4%B8%96%E7%95%8C is the URL-encoded (percent-encoded UTF-8) form of "你好世界" ("Hello, world").
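
To build such a request URL for arbitrary text, the standard library can do the percent-encoding. A small sketch, assuming the same host, port, and /api/tts endpoint as the curl command above; build_tts_url is a hypothetical helper name:

```python
from urllib.parse import quote


def build_tts_url(text, base="http://localhost:5000/api/tts"):
    """Return the request URL with `text` percent-encoded as UTF-8."""
    return f"{base}?text={quote(text)}"


url = build_tts_url("你好世界")
print(url)
# http://localhost:5000/api/tts?text=%E4%BD%A0%E5%A5%BD%E4%B8%96%E7%95%8C
```

The resulting URL can be passed to curl or to any HTTP client.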

Use tacotron2 instead of fastspeech2

Audio generated by the Tacotron2 model sounds better than FastSpeech2 output, but Tacotron2 is much slower. To use Tacotron2, change the code to:

import zhtts
tts = zhtts.TTS(text2mel_name="TACOTRON")
# tts = zhtts.TTS(text2mel_name="FASTSPEECH2")
