A vocoder that converts audio to a Mel spectrogram and back with WaveGlow, all on the GPU (if available).
WaveGlow Vocoder
Most of the code is extracted from NVIDIA's Tacotron2 and WaveGlow.
Install
pip install waveglow-vocoder
Example
Performance
CPU(Intel i5):
GPU(GTX 1080Ti):
Usage
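Load a wav file as usual
import librosa
import torch
# example_audio_file() is not available in recent librosa releases; pass the path to your own wav file there.
y, sr = librosa.load(librosa.util.example_audio_file(), sr=22050, mono=True, duration=10, offset=30)
y_tensor = torch.from_numpy(y).to(device='cuda', dtype=torch.float32)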
Apply the mel transform; this runs on the GPU if available.
from waveglow_vocoder import WaveGlowVocoder
WV = WaveGlowVocoder()
mel = WV.wav2mel(y_tensor)
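To sanity-check the result of wav2mel, it can help to estimate how many frames to expect. The sketch below assumes the vocoder keeps the defaults from NVIDIA's Tacotron2 (filter length 1024, hop length 256, 80 mel channels) and that the STFT pads the signal by half a filter length on each side. These values and the helper name expected_mel_frames are assumptions, not part of this package's documented API.

```python
# Assumed Tacotron2-style STFT defaults; not confirmed by this package's docs.
FILTER_LENGTH = 1024
HOP_LENGTH = 256
N_MEL_CHANNELS = 80


def expected_mel_frames(num_samples, filter_length=FILTER_LENGTH, hop_length=HOP_LENGTH):
    """Estimate the frame count for a signal padded by filter_length // 2 on both sides."""
    padded = num_samples + 2 * (filter_length // 2)
    return (padded - filter_length) // hop_length + 1


num_samples = 10 * 22050  # the 10-second clip loaded above at 22050 Hz
frames = expected_mel_frames(num_samples)
print(N_MEL_CHANNELS, frames)
```

If the assumptions hold, comparing these two numbers with mel.shape is a quick way to confirm that the transform ran with the expected settings.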
Decode it with WaveGlow.
NOTE:
Because the parameters of the pre-trained model are aligned with Tacotron2, you may get pure noise if the Mel spectrogram comes from a function other than wav2mel (an alias for TacotronSTFT.mel_spectrogram).
Support for librosa and torchaudio is under development.
wav = WV.mel2wav(mel)
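The note above is easier to see with numbers. NVIDIA's Tacotron2 code compresses mel magnitudes with a natural log after clamping at 1e-5, whereas a common librosa recipe converts a power spectrogram to decibels. The sketch below applies both scalings to the same magnitudes. The helper names are hypothetical, and power_db only mimics the basic formula behind librosa.power_to_db (without its top_db clipping), so treat this as an illustration rather than either library's exact code.

```python
import math


def tacotron_style_log(magnitude, clip_val=1e-5):
    """Natural-log compression of a mel magnitude, clamped from below."""
    return math.log(max(magnitude, clip_val))


def power_db(magnitude, amin=1e-10):
    """Decibel scaling of the corresponding power value (magnitude squared)."""
    return 10.0 * math.log10(max(magnitude ** 2, amin))


# Print both scalings side by side for a few sample magnitudes.
for magnitude in (1.0, 0.5, 0.01):
    print(magnitude, round(tacotron_style_log(magnitude), 3), round(power_db(magnitude), 3))
```

For the same input the two scales differ by a factor of roughly 8.7, which is one reason a mel computed with a different recipe can decode to noise.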
Other pretrained models / Training with your own data
The vocoder downloads the pre-trained model from PyTorch Hub the first time it is initialized.
You can also download the latest model from WaveGlow, or train one on your own data, and pass its path to the vocoder.
config_path = "your_config_of_model_training.json"
waveglow_path = "your_model_path.pt"
WV = WaveGlowVocoder(waveglow_path=waveglow_path, config_path=config_path)
Then use it as usual.
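Before passing a custom config_path, it may be worth checking that its STFT settings match the ones used to compute your mel spectrograms. The sketch below assumes the JSON follows the layout of config.json in NVIDIA's WaveGlow repository, with a data_config section. The helper name read_stft_settings is hypothetical, and this package may read the file differently.

```python
import json

# Keys assumed to live under "data_config", as in NVIDIA's WaveGlow config.json.
STFT_KEYS = ("sampling_rate", "filter_length", "hop_length", "win_length", "mel_fmin", "mel_fmax")


def read_stft_settings(config_path):
    """Return the STFT-related entries found in the config's data_config section."""
    with open(config_path, "r", encoding="utf-8") as f:
        config = json.load(f)
    data_config = config.get("data_config", {})
    return {key: data_config[key] for key in STFT_KEYS if key in data_config}
```

Calling read_stft_settings(config_path) returns a plain dict, so a mismatch in sampling_rate or hop_length is easy to spot before loading the model.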
TODO
- pip
- WaveRNN Vocoder
- MelGAN Vocoder
- examples
- performance
- support librosa Mel input
- CPU support
Reference