Parallel WaveGAN implementation
Parallel WaveGAN implementation with PyTorch
This repository provides an UNOFFICIAL implementation of Parallel WaveGAN in PyTorch.
The goal of this repository is to provide a real-time neural vocoder that is compatible with ESPnet-TTS.
Source of the figure: https://arxiv.org/pdf/1910.11480.pdf
Requirements
This repository is tested on Ubuntu 16.04 with a Titan V GPU.
- Python 3.6+
- CUDA 10.0
- cuDNN 7+
All of the code is tested on PyTorch 1.0.1, 1.1, 1.2, and 1.3.
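Before installing, it can help to compare the local environment against the requirements above. The following is a minimal sketch, not part of this repository: the function name `check_environment` and the report keys are made up for illustration, and it only reports the Python version, the installed PyTorch version, and whether PyTorch can see a CUDA device. It does not check the CUDA or cuDNN versions.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 6)  # minimum Python version from the requirements list


def check_environment():
    """Return a small report comparing this environment to the requirements.

    Hypothetical helper for illustration; it is not provided by the repository.
    """
    report = {"python_ok": sys.version_info[:2] >= MIN_PYTHON}
    # Import torch only if it is installed, so the check never raises ImportError.
    if importlib.util.find_spec("torch") is None:
        report["torch_version"] = None
        report["cuda_available"] = None
    else:
        import torch

        report["torch_version"] = torch.__version__
        report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If `torch_version` is reported as `None`, PyTorch is not installed in the active environment.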
Setup
You can choose between two installation methods.
A. Use pip
$ git clone https://github.com/kan-bayashi/ParallelWaveGAN.git
$ cd ParallelWaveGAN
$ pip install -e .
B. Make virtualenv
$ git clone https://github.com/kan-bayashi/ParallelWaveGAN.git
$ cd ParallelWaveGAN/tools
$ make
$ source venv/bin/activate
Run
This repository provides Kaldi-style recipes, in the same manner as ESPnet.
Currently, the following four recipes are supported.
- CMU Arctic: English speakers
- LJSpeech: English female speaker
- JSUT: Japanese female speaker
- CSMSC: Mandarin female speaker
To run a recipe, please follow the instructions below.
# Move to the recipe directory
$ cd egs/ljspeech/voc1
# Run the recipe from scratch
$ ./run.sh
# You can select the stage to start and stop
$ ./run.sh --stage 2 --stop_stage 2
All of the hyperparameters are written in a single YAML configuration file.
Please check the example in the LJSpeech recipe.
Training takes about 3 days on a single GPU (TITAN V).
The training speed is about 0.5 seconds per iteration, for a total of roughly 200,000 seconds (about 2.31 days).
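The training-time figures can be checked with a little arithmetic. The sketch below uses only the two numbers stated above (0.5 seconds per iteration and 200,000 seconds in total). The iteration count it prints is derived from those numbers and is not stated directly in this document, so treat it as an inference. The function name `training_time` is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def training_time(seconds_per_iteration, total_seconds):
    """Return (implied iteration count, duration in days)."""
    iterations = total_seconds / seconds_per_iteration
    days = total_seconds / SECONDS_PER_DAY
    return iterations, days


# Numbers taken from the text above.
iterations, days = training_time(seconds_per_iteration=0.5, total_seconds=200000)
print(f"{iterations:.0f} iterations, {days:.2f} days")
# prints "400000 iterations, 2.31 days"
```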
You can monitor the training progress via tensorboard.
$ tensorboard --logdir exp
The decoding speed is RTF = 0.016 on a TITAN V, which is much faster than real time.
[decode]: 100%|██████████| 250/250 [00:30<00:00, 8.31it/s, RTF=0.0156]
2019-11-03 09:07:40,480 (decode:127) INFO: finished generation of 250 utterances (RTF = 0.016).
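To put the RTF value in perspective, the sketch below assumes the usual definition of the real-time factor: generation time divided by the duration of the generated audio. This document does not show how the repository measures it, so the definition, the function names, and the 1.6 s / 100 s example numbers are assumptions chosen to reproduce the RTF of 0.016 reported in the log.

```python
def real_time_factor(processing_seconds, audio_seconds):
    """RTF under the assumed definition: processing time / audio duration."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def speedup_over_real_time(rtf):
    """How many seconds of audio are generated per second of compute."""
    if rtf <= 0:
        raise ValueError("rtf must be positive")
    return 1.0 / rtf


# Hypothetical example: 1.6 s of compute for 100 s of audio gives RTF = 0.016.
rtf = real_time_factor(processing_seconds=1.6, audio_seconds=100.0)
print(f"RTF = {rtf:.3f}, about {speedup_over_real_time(rtf):.1f}x faster than real time")
```

Under this definition, any RTF below 1.0 means audio is generated faster than it plays back.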
Results
You can listen to the samples and download pretrained models on our Google Drive.
Training is still ongoing. Please check the latest progress at https://github.com/kan-bayashi/ParallelWaveGAN/issues/1.
Acknowledgement
The author would like to thank Ryuichi Yamamoto (@r9y9) for his great repository, paper and valuable discussions.
Author
Tomoki Hayashi (@kan-bayashi)
E-mail: hayashi.tomoki<at>g.sp.m.is.nagoya-u.ac.jp
Download files
Source Distribution
Hashes for parallel_wavegan-0.2.1.post3.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 29f21bebe78f1bc1782818306dfdc6e182bdee6a178b57de6a18761961798432
MD5 | 03d9946e8dbddcc73ce006ac1a3ee51a
BLAKE2b-256 | bb13fd16ce23b8e17d784e939ad8dae8fe16e17f10bf191ba22f3e3083c1a089
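A downloaded source distribution can be checked against the SHA256 digest listed above. The following is a minimal sketch using Python's standard `hashlib`. The helper names `sha256_of_file` and `matches` are made up for illustration, and the local file path in the commented usage is an assumption about where the archive was saved.

```python
import hashlib

# SHA256 digest listed above for parallel_wavegan-0.2.1.post3.tar.gz
EXPECTED_SHA256 = "29f21bebe78f1bc1782818306dfdc6e182bdee6a178b57de6a18761961798432"


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Return True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of_file(path) == expected_hex.lower()


# Usage (assumes the archive was saved in the current directory):
# print(matches("parallel_wavegan-0.2.1.post3.tar.gz", EXPECTED_SHA256))
```

Reading in chunks keeps memory use constant regardless of the archive size.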