A vocoder that converts audio to a Mel spectrogram and back with WaveGlow, all on the GPU (if available).

WaveGlow Vocoder

A vocoder that converts audio to a Mel spectrogram and back with WaveGlow, all on the GPU.
Most of the code comes from Tacotron2 and WaveGlow.

Install

pip install waveglow-vocoder

Example

[Images: original vs. vocoded audio, shown for a music sample and a speech sample.]

Usage

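Load a wav file as a torch tensor, on the GPU if one is available.

import librosa
import torch

# example_audio_file() is deprecated since librosa 0.8; newer releases provide librosa.example(...) instead.
device = 'cuda' if torch.cuda.is_available() else 'cpu'
y, sr = librosa.load(librosa.util.example_audio_file(), sr=22050, mono=True, duration=10, offset=30)
y_tensor = torch.from_numpy(y).to(device=device, dtype=torch.float32)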

Apply the mel transform. This runs on the GPU if one is available.

from waveglow_vocoder import WaveGlowVocoder

WV = WaveGlowVocoder()
mel = WV.wav2mel(y_tensor)
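
As a quick sanity check on the result of wav2mel, the number of frames can be estimated from the input length. The sketch below assumes the Tacotron2 default hop length of 256 samples and a centered STFT (input padded by half a window on each side); the helper name expected_mel_frames is hypothetical and not part of this package.

```python
def expected_mel_frames(num_samples: int, hop_length: int = 256) -> int:
    """Estimate the frame count of a centered STFT.

    Assumes the signal is padded by half a window on each side, so the
    window length cancels out and only the hop length matters.
    """
    if num_samples < 0 or hop_length <= 0:
        raise ValueError("num_samples must be >= 0 and hop_length must be > 0")
    return num_samples // hop_length + 1


# 10 seconds of audio at 22050 Hz, as loaded in the example above.
frames = expected_mel_frames(22050 * 10)
print(frames)  # 862
```

If the time dimension of mel differs noticeably from this estimate, the sample rate or hop length in use is probably not the one assumed here.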

Decode it with WaveGlow.

NOTE:
Because the hyperparameters of the pre-trained model are aligned with Tacotron2, you may get pure noise if the Mel spectrogram comes from a function other than wav2mel (an alias for TacotronSTFT.mel_spectrogram).
Support for Mel spectrograms from librosa and torchaudio is under development.
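
To illustrate the kind of mismatch the note warns about, here is a minimal sketch that compares a set of STFT settings against the defaults used in the Tacotron2 reference code. The values are listed as an assumption about the pre-trained model, and the helper find_mismatches is hypothetical. Matching these values alone may still not be enough, since scaling and log compression can also differ between libraries.

```python
# Tacotron2 reference defaults (assumed to match the pre-trained WaveGlow model).
TACOTRON2_STFT_DEFAULTS = {
    "sampling_rate": 22050,
    "filter_length": 1024,
    "hop_length": 256,
    "win_length": 1024,
    "n_mel_channels": 80,
    "mel_fmin": 0.0,
    "mel_fmax": 8000.0,
}


def find_mismatches(settings, reference=TACOTRON2_STFT_DEFAULTS):
    """Return {name: (given, expected)} for each setting that differs from the reference."""
    return {
        name: (settings.get(name), expected)
        for name, expected in reference.items()
        if settings.get(name) != expected
    }


# Settings resembling librosa's melspectrogram defaults at 22050 Hz.
librosa_like = {
    "sampling_rate": 22050,
    "filter_length": 2048,
    "hop_length": 512,
    "win_length": 2048,
    "n_mel_channels": 128,
    "mel_fmin": 0.0,
    "mel_fmax": 11025.0,
}
mismatches = find_mismatches(librosa_like)
for name, (given, expected) in mismatches.items():
    print(f"{name}: got {given}, expected {expected}")
```
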

wav = WV.mel2wav(mel)
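
To listen to the result, the samples can be written to a WAV file. The sketch below uses only the Python standard library; the helper write_wav_int16 is hypothetical, and it assumes the output holds mono float samples in the range [-1.0, 1.0] at 22050 Hz.

```python
import struct
import wave


def write_wav_int16(path, samples, sample_rate=22050):
    """Write mono float samples in [-1.0, 1.0] to a 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        # Clip out-of-range values so they do not overflow 16-bit integers.
        clipped = max(-1.0, min(1.0, float(s)))
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)        # mono
        wf.setsampwidth(2)        # 2 bytes per sample = 16 bit
        wf.setframerate(sample_rate)
        wf.writeframes(bytes(frames))
```

Assuming wav is a torch tensor holding a single utterance, a call might look like write_wav_int16("vocoded.wav", wav.flatten().cpu().tolist()).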

Other pretrained model / Train with your own data

The vocoder downloads a pre-trained model from PyTorch Hub the first time it is initialized.
You can also download the latest model from WaveGlow, or train one on your own data, and pass its path to the WaveGlow vocoder.

config_path = "your_config_of_model_training.json"
waveglow_path = "your_model_path.pt"
WV = WaveGlowVocoder(waveglow_path=waveglow_path, config_path=config_path)

Then use it as usual.
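
When using a custom model, it may help to check which sample rate its config expects before loading audio. The sketch below assumes the JSON config follows the layout of the WaveGlow repository's config.json, with the audio settings under a "data_config" key; the helper read_data_config is hypothetical, and the demo values are illustrative only.

```python
import json
import os
import tempfile


def read_data_config(config_path):
    """Return the 'data_config' section of a WaveGlow-style JSON config (empty dict if absent)."""
    with open(config_path, encoding="utf-8") as f:
        return json.load(f).get("data_config", {})


# Demo with a throwaway config file standing in for a real training config.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "config.json")
    with open(demo_path, "w", encoding="utf-8") as f:
        json.dump({"data_config": {"sampling_rate": 22050, "hop_length": 256}}, f)
    data_config = read_data_config(demo_path)

print(data_config.get("sampling_rate"))  # 22050
```

The sampling_rate found this way is the value to pass as sr when loading audio with librosa.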

TODO

  • WaveRNN Vocoder
  • MelGAN Vocoder
  • Performance
  • Support librosa Mel input
