PyTorch implementation of the neural homomorphic vocoder
neural-homomorphic-vocoder
A neural vocoder based on the source-filter model, called the neural homomorphic vocoder.
Install
$ cd tools
$ make
Usage
Usage for the NeuralHomomorphicVocoder class
- Input
  - x: mel-filterbank features
  - cf0: continuous F0
  - uv: voiced/unvoiced (u/v) symbol
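The `cf0` and `uv` inputs are usually derived from a frame-level F0 contour in which unvoiced frames are marked with 0. The following is a minimal sketch of one common way to do that, not the feature extraction used by this package: the function name `continuous_f0` is hypothetical, and both the linear (rather than log-domain) interpolation and the convention that 1 means voiced are assumptions.

```python
def continuous_f0(f0):
    """Return (cf0, uv) from a frame-level F0 contour where 0 marks unvoiced frames."""
    # Assumed convention: 1.0 = voiced, 0.0 = unvoiced.
    uv = [1.0 if v > 0 else 0.0 for v in f0]
    voiced = [i for i, v in enumerate(f0) if v > 0]
    if not voiced:
        # Fully unvoiced input: there is nothing to interpolate from.
        return [0.0] * len(f0), uv

    cf0 = [float(v) for v in f0]
    # Extend the first and last voiced values out to the edges.
    for i in range(voiced[0]):
        cf0[i] = float(f0[voiced[0]])
    for i in range(voiced[-1] + 1, len(f0)):
        cf0[i] = float(f0[voiced[-1]])
    # Linearly interpolate across each unvoiced gap between voiced frames.
    for left, right in zip(voiced, voiced[1:]):
        gap = right - left
        for i in range(left + 1, right):
            w = (i - left) / gap
            cf0[i] = (1.0 - w) * f0[left] + w * f0[right]
    return cf0, uv


f0 = [0, 0, 100, 0, 0, 130, 120, 0]
cf0, uv = continuous_f0(f0)
```

Each returned list has one value per frame, so it can be turned into a tensor of shape (B, T, 1) before being passed to the model.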
import torch
from nhv import NeuralHomomorphicVocoder
net = NeuralHomomorphicVocoder(
    fs=24000,                       # sampling frequency
    fft_size=1024,                  # size of the impulse response of the LTV filter
    hop_size=256,                   # hop size in each mel-filterbank frame
    in_channels=80,                 # input channels (i.e., dimension of mel-filterbank)
    conv_channels=256,              # channel size of LTV filter
    ltv_out_channels=222,           # output size of LTV filter
    kernel_size=3,                  # kernel size of LTV filter
    group_size=8,                   # group size of LTV filter
    dilation_size=1,                # dilation size of LTV filter
    fmin=80,                        # min freq. of melspc calculation
    fmax=7600,                      # max freq. of melspc calculation
    roll_size=24,                   # roll size to calculate logspc from melspc
    use_causal=False,               # use causal conv LTV filter
    use_conv_postfilter=False,      # use causal conv postfilter for NHV output
    use_ltv_conv_postfilter=False,  # use causal conv postfilter for LTV output
    use_reference_mag=False,        # use reference logspc calculated from melspc
    use_quefrency_norm=True,        # enable ccep normalized by quefrency index
    scaler_file=None,               # internal scaling of melspc
                                    # (Dict -> key="mlfb" = StandardScaler)
)
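hop_size, in_channels = 256, 80  # must match the values passed to the constructor
B, T, D = 3, 100, in_channels  # batch size, number of frames, n_mels
z = torch.randn(B, 1, T * hop_size)  # noise input: (B, 1, T * hop_size)
x = torch.randn(B, T, D)  # mel-filterbank: (B, T, D)
cf0 = torch.randn(B, T, 1)  # continuous F0: (B, T, 1)
uv = torch.randn(B, T, 1)  # u/v symbol: (B, T, 1)
# c is documented as (B, D+2, T); note that this concatenation produces (B, T, D+2)
y = net(z, torch.cat([x, cf0, uv], dim=-1))
y = net._forward(z, cf0, uv)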
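The `fft_size` and `use_quefrency_norm` comments above refer to an impulse response and a cepstrum (ccep). As background, the homomorphic step described by Liu et al. maps a cepstrum to an impulse response by taking a DFT, exponentiating, and taking the inverse DFT. The sketch below illustrates only that generic step and is not the code of this package: the names `dft` and `cepstrum_to_impulse_response` are hypothetical, a naive O(n^2) DFT stands in for an FFT, and quefrency normalization is left out.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(n^2) DFT (or inverse DFT); adequate for a small illustration."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def cepstrum_to_impulse_response(ccep):
    """Map a real cepstrum to an impulse response: DFT -> exp -> inverse DFT."""
    log_spectrum = dft(ccep)
    spectrum = [cmath.exp(v) for v in log_spectrum]
    # For a real cepstrum the result is real up to rounding error.
    return [v.real for v in dft(spectrum, inverse=True)]


# An all-zero cepstrum gives a flat spectrum, i.e. approximately a unit impulse.
h = cepstrum_to_impulse_response([0.0] * 8)
```

In this sketch the length of the cepstrum plays the role that `fft_size` plays for the impulse response length.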
Features
- (2021/05/21): Works well; training is in progress
- (2021/05/21): Follows the same input as ParallelWaveGANGenerator in kan-bayashi/ParallelWaveGAN, but with continuous F0 and uv symbols
- (2021/05/24): Final FIR filter is implemented by a 1D causal conv layer
- (2021/05/24): GAN training is not stable
- (2021/05/25): Implement reference log magnitude from melspc
- (2021/05/27): Implement internal scaler and ltv conv postfilter
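One of the entries above mentions a final FIR filter implemented as a 1D causal convolution. As a rough illustration of what "causal" means there, the sketch below applies an FIR filter so that each output sample depends only on the current and past input samples. It is plain Python rather than the package's layer, and the name `causal_fir` is hypothetical.

```python
def causal_fir(x, h):
    """Compute y[n] = sum_k h[k] * x[n - k], treating x[n] as 0 for n < 0."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, hk in enumerate(h):
            if n - k >= 0:  # only current and past samples contribute
                acc += hk * x[n - k]
        y.append(acc)
    return y


# Filtering a unit impulse returns the filter taps, padded to the input length.
y = causal_fir([1.0, 0.0, 0.0, 0.0], [0.5, 0.25])
```

A 1D convolution layer with left-only padding of `len(h) - 1` samples computes the same kind of output, up to the kernel order, since deep learning frameworks typically implement cross-correlation rather than flipped-kernel convolution.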
References
@article{liu20,
title={Neural Homomorphic Vocoder},
author={Z.~Liu and K.~Chen and K.~Yu},
journal={Proc. Interspeech 2020},
pages={240--244},
year={2020}
}