zho-tts
Web app, command-line interface and Python library for synthesizing Chinese texts into speech.
Installation
pip install zho-tts --user
Usage as web app
Visit 🤗 Hugging Face for a live demo.
You can also run it locally by executing zho-tts-web
in the CLI and opening your browser at http://127.0.0.1:7860.
Usage as CLI
zho-tts-cli synthesize "长江 航务 管理局 和 长江 轮船 总公司 最近 决定 安排 一百三十三 艘 客轮 迎接 长江 干线 春运。"
The output can be listened to here.
# Same example using IPA input
zho-tts-cli synthesize-ipa "ʈʂː|a˧˩˧˘|ŋ|tɕ˘|j|a˥˘|ŋ˘|SIL0|x|a˧˥˘|ŋ|u˥˩|SIL0|k|w|a˧˩˧|n|l˘|i˧˩˧|tɕː|y˧˥ˑ|SIL0|x|ɤ˧˥|SIL0|ʈʂː|a˧˩˧˘|ŋ|tɕ˘|j|a˥˘|ŋ|SIL0|l|w|ə˧˥|n|ʈʂʰ˘|w|a˧˥|n|SIL0|ts˘|ʊ˧˩˧|ŋ˘|kː|ʊ˥|ŋ|s|ɹ̩˥ˑ|SIL0|ts|w˘|ei̯˥˩|tɕ|i˥˩˘|n|SIL0|tɕ|ɥ|e˧˥|t|i˥˩|ŋ|SIL3|a˥|n|pʰ|ai̯˧˥|SIL0|i˥ˑ|p|ai̯˧˩˧|s|a˥˘|n|ʂ˘|ɻ̩˧˥|s|a˥|n|SIL0|s˘|ou̯˥|SIL0|kʰˑ|ɤ˥˩|lː|wˑ|ə˧˥ˑ|n|SIL0|i˧˥ː|ŋ|tɕ˘|j˘|e˥|SIL0|ʈʂː|a˧˩˧|ŋ|tɕ˘|j|a˥˘|ŋ|SIL0|k˘|a˥˩|n|ɕ|j˘|ɛ˥˩|n˘|SIL0|ʈʂʰˑ|w˘|ə˥˘|nː|y˥˩ˑ|nː|。"
The output can be listened to here.
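To inspect an IPA input string like the one above before passing it to the CLI, it can help to split it into its symbols. The following is a minimal sketch in plain Python that does not use zho-tts itself. It assumes, based only on the example above, that symbols are separated by `|`; the helper names `split_ipa` and `group_by_breaks` are hypothetical and not part of the package.

```python
def split_ipa(ipa: str, separator: str = "|") -> list[str]:
    """Split a separator-delimited IPA string into symbols, dropping empty pieces."""
    return [symbol for symbol in ipa.split(separator) if symbol]


def group_by_breaks(symbols: list[str]) -> list[list[str]]:
    """Group symbols into chunks, starting a new chunk at every SIL marker."""
    groups: list[list[str]] = []
    current: list[str] = []
    for symbol in symbols:
        if symbol.startswith("SIL"):
            # A break marker closes the current chunk (if it has any symbols)
            if current:
                groups.append(current)
            current = []
        else:
            current.append(symbol)
    if current:
        groups.append(current)
    return groups


# The first symbols of the example above
sample = "ʈʂː|a˧˩˧˘|ŋ|tɕ˘|j|a˥˘|ŋ˘|SIL0|x|a˧˥˘|ŋ|u˥˩|SIL0"
symbols = split_ipa(sample)
groups = group_by_breaks(symbols)
print(len(symbols), [len(group) for group in groups])
```

Note that this sketch splits at every SIL marker, including SIL0, which in the example appears to sit between words rather than at audible pauses.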
Usage as library
from pathlib import Path
from tempfile import gettempdir

from zho_tts import Synthesizer, Transcriber, normalize_audio, save_audio

text = "长江 航务 管理局 和 长江 轮船 总公司 最近 决定 安排 一百三十三 艘 客轮 迎接 长江 干线 春运。"

# Create the transcriber and the synthesizer
transcriber = Transcriber()
synthesizer = Synthesizer()

# Transcribe the text to IPA, then synthesize audio from the IPA
text_ipa = transcriber.transcribe_to_ipa(text)
audio = synthesizer.synthesize(text_ipa)

# Save the audio to the system's temporary directory
tmp_dir = Path(gettempdir())
save_audio(audio, tmp_dir / "output.wav")

# Optional: normalize output
normalize_audio(tmp_dir / "output.wav", tmp_dir / "output_norm.wav")
Model info
The TTS model used is published here.
Phoneme set
- Vowels: a ɛ e ə ɚ ɤ i o u ʊ y
- Diphthongs: ai̯ au̯ ei̯ ou̯
- Consonants: f j k kʰ l m n p pʰ ɹ̩¹ ɻ¹ ɻ̩¹ s t ts tsʰ tɕ tɕʰ tʰ w x ŋ ɕ ɥ ʂ ʈʂ ʈʂʰ
- Breaks:
  - SIL0 (no break)
  - SIL1 (short break)
  - SIL2 (break)
  - SIL3 (long break)
- Special characters: 。 ?
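The inventory above can be written down as plain Python sets, for example to check custom IPA input for unknown symbols before synthesis. This is only a sketch built from the lists in this section; it is not taken from the zho-tts source code, and the names `VOWELS`, `DIPHTHONGS`, `CONSONANTS`, `BREAKS`, `SPECIAL` and `classify` are hypothetical. It expects bare symbols without tone or duration markers.

```python
# Symbol inventory copied from the lists above (footnote marks removed)
VOWELS = set("a ɛ e ə ɚ ɤ i o u ʊ y".split())
DIPHTHONGS = set("ai̯ au̯ ei̯ ou̯".split())
CONSONANTS = set(
    "f j k kʰ l m n p pʰ ɹ̩ ɻ ɻ̩ s t ts tsʰ tɕ tɕʰ tʰ w x ŋ ɕ ɥ ʂ ʈʂ ʈʂʰ".split()
)
BREAKS = {
    "SIL0": "no break",
    "SIL1": "short break",
    "SIL2": "break",
    "SIL3": "long break",
}
SPECIAL = {"。", "?"}


def classify(symbol: str) -> str:
    """Return the category of a bare symbol (no tone or duration marker)."""
    if symbol in BREAKS:
        return "break"
    if symbol in SPECIAL:
        return "special"
    if symbol in DIPHTHONGS:
        return "diphthong"
    if symbol in VOWELS:
        return "vowel"
    if symbol in CONSONANTS:
        return "consonant"
    return "unknown"


for symbol in ["a", "x", "SIL3", "。", "q"]:
    print(symbol, "->", classify(symbol))
```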
Vowels and diphthongs contain one of these tones:
- ˥ (first tone)
- ˧˥ (second tone)
- ˧˩˧ (third tone)
- ˥˩ (fourth tone)
- (none)
¹ These consonants can also carry tones.
Vowels, diphthongs and consonants contain one of these duration markers:
- ˘ -> very short, e.g., ou̯˘
- nothing -> normal, e.g., ou̯
- ˑ -> half long, e.g., ou̯ˑ
- ː -> long, e.g., ou̯ː
Tones and duration markers can be combined, e.g., ə˧˥ː
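The marker scheme above can be illustrated with a small parser that splits a symbol into its base, tone and duration. This is a sketch under two assumptions drawn from the examples in this document: the tone marker precedes the duration marker, and both sit at the end of the symbol. The names `Symbol` and `parse_symbol` are hypothetical and not part of zho-tts.

```python
from typing import NamedTuple

# Tone markers; "˧˥" is listed before "˥" so the longer marker is tried first
TONES = {
    "˧˩˧": "third tone",
    "˧˥": "second tone",
    "˥˩": "fourth tone",
    "˥": "first tone",
}
DURATIONS = {"˘": "very short", "ˑ": "half long", "ː": "long"}


class Symbol(NamedTuple):
    base: str
    tone: str
    duration: str


def parse_symbol(symbol: str) -> Symbol:
    """Split a symbol into base, tone and duration (assumes tone comes before duration)."""
    duration = "normal"
    for marker, name in DURATIONS.items():
        if symbol.endswith(marker):
            duration = name
            symbol = symbol[: -len(marker)]
            break
    tone = "none"
    for marker, name in TONES.items():
        if symbol.endswith(marker):
            tone = name
            symbol = symbol[: -len(marker)]
            break
    return Symbol(symbol, tone, duration)


print(parse_symbol("ə˧˥ː"))
print(parse_symbol("ou̯˘"))
```

Stripping the duration marker first and the tone marker second mirrors the order in which they appear at the end of a symbol, so a combined symbol such as ə˧˥ː is decomposed in two steps.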
Speakers
Citation
If you want to cite this repo, you can use the BibTeX entry generated by GitHub (see About => Cite this repository).
- Taubert, S. (2024). zho-tts (Version 0.0.2) [Computer software]. https://doi.org/10.5281/zenodo.11048515
Acknowledgments
Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – Project-ID 416228727 – CRC 1410
The authors gratefully acknowledge the GWK support for funding this project by providing computing time through the Center for Information Services and HPC (ZIH) at TU Dresden.
The authors are grateful to the Center for Information Services and High Performance Computing [Zentrum für Informationsdienste und Hochleistungsrechnen (ZIH)] at TU Dresden for providing its facilities for high throughput calculations.