PyTorch implementation of the neural homomorphic vocoder

neural-homomorphic-vocoder

A neural vocoder based on the source-filter model, called the neural homomorphic vocoder (NHV).

Install

pip install neural-homomorphic-vocoder

Usage

Usage of the NeuralHomomorphicVocoder class

  • Input
    • x: mel-filterbank
    • cf0: continuous F0
    • uv: voiced/unvoiced (u/v) symbol

import torch
from nhv import NeuralHomomorphicVocoder

net = NeuralHomomorphicVocoder(
        fs=24000,             # sampling frequency
        fft_size=1024,        # size for impulse response of LTV
        hop_size=256,         # hop size in each mel-filterbank frame
        in_channels=80,       # input channels (i.e., dimension of mel-filterbank)
        conv_channels=256,    # channel size of LTV filter
        ltv_out_channels=222, # output size of LTV filter
        out_channels=1,       # output size of network
        kernel_size=3,        # kernel size of LTV filter
        group_size=8,         # group size of LTV filter
        dilation_size=1,      # dilation size of LTV filter
        fmin=80,              # min freq. of melspc calculation
        fmax=7600,            # max freq. of melspc calculation (full band recommended)
        roll_size=24,         # frame size to get median to estimate logspc from melspc
        look_ahead=32,        # number of look-ahead samples (if use_causal=True)
        use_causal=False,     # use causal conv LTV filter
        use_ddsconv=False,    # use ddsconv instead of normal conv for LTV network
        use_tanh=False,       # apply tanh to output else linear
        use_conv_postfilter=False,     # use causal conv postfilter for NHV output
        use_ddsconv_pf=True,           # use ddsconv postfilter instead of conv1d
        use_ltv_conv_postfilter=False, # use causal conv postfilter for LTV output
        use_reference_mag=False,       # use reference logspc calculated from melspc
        use_quefrency_norm=True,       # enable ccep normalized by quefrency index
        use_weight_norm=False,         # apply weight norm to conv1d layer
        use_clip_grad_norm=False,      # use clip_grad_norm (norm_value=3)
        scaler_file=None      # path to .pkl for internal scaling of melspc
                              # (dict["mlfb"] = sklearn.preprocessing.StandardScaler)
)

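in_channels, hop_size = 80, 256       # must match the values passed to the constructor above
B, T, D = 3, 100, in_channels         # batch_size, number of frames, n_mels
z = torch.randn(B, 1, T * hop_size)   # random input signal, one sample per output sample
x = torch.randn(B, T, D)              # mel-filterbank (random placeholder)
cf0 = torch.randn(B, T, 1)            # continuous F0 (random placeholder)
uv = torch.randn(B, T, 1)             # u/v symbol (random placeholder)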
y = net(z, torch.cat([x, cf0, uv], dim=-1))   # z: (B, 1, T * hop_size), c: (B, D+2, T)
y = net._forward(z, cf0, uv)
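
The `cf0` and `uv` inputs are usually derived from a frame-level F0 contour. This page does not say how the package expects them to be computed, so the following is only a minimal sketch. It assumes that unvoiced frames are marked with F0 = 0, that `uv` is 1 for voiced frames, and that continuous F0 is obtained by linear interpolation across unvoiced gaps. The helper name `f0_to_cf0_uv` is hypothetical and not part of `nhv`.

```python
def f0_to_cf0_uv(f0):
    """Split a frame-level F0 contour into (continuous F0, voiced/unvoiced flags).

    Assumption (not taken from nhv): unvoiced frames are marked with f0 == 0.
    """
    uv = [1.0 if v > 0 else 0.0 for v in f0]
    voiced = [i for i, v in enumerate(f0) if v > 0]
    cf0 = [float(v) for v in f0]
    if not voiced:
        # No voiced frame to interpolate from: return the contour unchanged.
        return cf0, uv

    first, last = voiced[0], voiced[-1]
    # Hold the nearest voiced value at the leading and trailing edges.
    for i in range(first):
        cf0[i] = float(f0[first])
    for i in range(last + 1, len(f0)):
        cf0[i] = float(f0[last])
    # Linearly interpolate across each interior unvoiced gap.
    for left, right in zip(voiced, voiced[1:]):
        gap = right - left
        for i in range(left + 1, right):
            w = (i - left) / gap
            cf0[i] = (1.0 - w) * f0[left] + w * f0[right]
    return cf0, uv


f0 = [0, 100, 0, 0, 130, 0]   # Hz per frame, 0 = unvoiced
cf0, uv = f0_to_cf0_uv(f0)
```

Each returned list can then be turned into the `(B, T, 1)` layout used above, for example with `torch.tensor(cf0).view(1, -1, 1)`.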

Features

  • (2021/05/21): Train using kan-bayashi/ParallelWaveGAN with continuous F0 and uv symbols
  • (2021/05/24): Final FIR filter is implemented by 1D causal conv
  • (2021/06/17): Implement depth-wise separable convolution
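
To illustrate what "causal" means for the final FIR filter, here is a library-free sketch of causal FIR filtering. Each output sample depends only on the current and past input samples, which is equivalent to left-padding the input with `len(h) - 1` zeros before a 1D convolution. This is not the package's implementation, and the function name `causal_fir` is hypothetical.

```python
def causal_fir(x, h):
    """Causal FIR filtering: y[n] = sum_k h[k] * x[n - k], with x[m] = 0 for m < 0."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, coeff in enumerate(h):
            if n - k >= 0:  # skip taps that would read before the signal start
                acc += coeff * x[n - k]
        y.append(acc)
    return y


# An impulse at n = 2 reproduces the filter taps starting at n = 2.
x = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
h = [0.5, 0.3, 0.2]
y = causal_fir(x, h)
```

Because no tap reads `x[n + 1]` or later, changing a future input sample leaves all earlier output samples unchanged, which is the property a streaming vocoder relies on.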

References

@article{liu20,
  title={Neural Homomorphic Vocoder},
  author={Z.~Liu and K.~Chen and K.~Yu},
  journal={Proc. Interspeech 2020},
  pages={240--244},
  year={2020}
}
