PyTorch implementation of the neural homomorphic vocoder

neural-homomorphic-vocoder

A neural vocoder based on the source-filter model, called the neural homomorphic vocoder.

Install

pip install neural-homomorphic-vocoder

Usage

Usage for NeuralHomomorphicVocoder class

Input
- x: mel-filterbank features
- cf0: continuous F0
- uv: voiced/unvoiced (U/V) symbol
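
The cf0 and uv inputs listed above are usually derived from a single frame-level F0 contour. The sketch below shows one common way to do that, assuming unvoiced frames are marked with 0 and gaps are filled by linear interpolation. The helper name `f0_to_cf0_uv` is hypothetical and not part of this package, and whether the model expects F0 in Hz or on a log scale is not stated here.

```python
import numpy as np


def f0_to_cf0_uv(f0):
    """Split a frame-level F0 contour (0 = unvoiced) into continuous F0 and a U/V flag.

    Hypothetical helper for illustration; not part of the nhv package.
    """
    f0 = np.asarray(f0, dtype=np.float64)
    uv = (f0 > 0).astype(np.float64)      # 1.0 for voiced frames, 0.0 otherwise
    if not uv.any():                      # fully unvoiced: nothing to interpolate from
        return np.zeros_like(f0), uv
    voiced_idx = np.flatnonzero(uv)
    # np.interp holds the first/last voiced value at the edges and
    # interpolates linearly across the unvoiced gaps in between.
    cf0 = np.interp(np.arange(len(f0)), voiced_idx, f0[voiced_idx])
    return cf0, uv


f0 = [0.0, 100.0, 0.0, 0.0, 130.0, 0.0]
cf0, uv = f0_to_cf0_uv(f0)
# cf0 -> [100, 100, 110, 120, 130, 130]; uv -> [0, 1, 0, 0, 1, 0]
```

Each array has one value per frame, so reshaping to (1, T, 1) gives the per-utterance layout used for cf0 and uv in the usage example.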

import torch
from nhv import NeuralHomomorphicVocoder

hop_size = 256      # hop size in each mel-filterbank frame
in_channels = 80    # input channels (i.e., dimension of mel-filterbank)

net = NeuralHomomorphicVocoder(
        fs=24000,             # sampling frequency
        fft_size=1024,        # size for impulse response of LTV
        hop_size=hop_size,    # hop size in each mel-filterbank frame
        in_channels=in_channels,  # input channels (i.e., dimension of mel-filterbank)
        conv_channels=256,    # channel size of LTV filter
        ccep_size=222,        # output ccep size of LTV filter
        out_channels=1,       # output size of network
        kernel_size=3,        # kernel size of LTV filter
        dilation_size=1,      # dilation size of LTV filter
        group_size=8,         # group size of LTV filter
        fmin=80,              # min freq. for melspc
        fmax=7600,            # max freq. for melspc (recommend to use full-band)
        roll_size=24,         # frame size to get median to estimate logspc from melspc
        look_ahead=32,        # number of look-ahead samples (if use_causal=True)
        n_ltv_layers=3,       # number of layers for LTV ccep generator
        n_postfilter_layers=4,   # number of layers for output postfilter
        use_causal=False,        # use causal conv LTV filter
        use_reference_mag=False, # use reference logspc calculated from melspc
        use_tanh=False,       # apply tanh to output else linear
        use_uvmask=False,     # apply uv-based mask to harmonic
        use_weight_norm=True, # apply weight norm to conv1d layer
        conv_type="original", # ltv generator network type ["original", "ddsconv"]
        postfilter_type=None, # postfilter network type ["None", "normal", "ddsconv"]
        ltv_postfilter_type="conv",  # ltv postfilter network type
                                     # ["None", "normal", "ddsconv"]
        scaler_file=None      # path to .pkl for internal scaling of melspc
                              # (dict["mlfb"] = sklearn.preprocessing.StandardScaler)
)

B, T, D = 3, 100, in_channels   # batch_size, frame_size, n_mels
z = torch.randn(B, 1, T * hop_size)
x = torch.randn(B, T, D)
cf0 = torch.randn(B, T, 1)
uv = torch.randn(B, T, 1)
# z: (B, 1, T * hop_size); the conditioning tensor built here is (B, T, D+2)
y = net(z, torch.cat([x, cf0, uv], dim=-1))
y = net._forward(z, cf0, uv)
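
The `scaler_file` argument is described as a path to a .pkl file holding a dict whose "mlfb" entry is a `sklearn.preprocessing.StandardScaler`. A minimal sketch of producing such a file follows. It assumes the file is a plain pickle (how the package actually loads it is not stated here), and it uses random data and a temporary output path as placeholders for real mel-filterbank features.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.preprocessing import StandardScaler

n_mels = 80                                # matches in_channels in the usage example
rng = np.random.default_rng(0)
mlfb = rng.normal(size=(1000, n_mels))     # placeholder for real (frames, n_mels) features

scaler = StandardScaler().fit(mlfb)        # per-dimension mean and variance

# Hypothetical output path; any writable location works.
scaler_path = os.path.join(tempfile.mkdtemp(), "scaler.pkl")
with open(scaler_path, "wb") as f:
    pickle.dump({"mlfb": scaler}, f)

# Round-trip check: the file holds a dict with a fitted scaler under "mlfb".
with open(scaler_path, "rb") as f:
    loaded = pickle.load(f)
normalized = loaded["mlfb"].transform(mlfb)
```

The resulting path would then be passed as `scaler_file=scaler_path` when constructing the network.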

Features

(2021/05/21): Train using kan-bayashi/ParallelWaveGAN with continuous F0 and uv symbols
(2021/05/24): Final FIR filter is implemented by 1D causal conv
(2021/06/17): Implement depth-wise separable convolution
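
To illustrate the two convolution variants named in the list above, here is a generic PyTorch sketch of a causal 1D convolution and a depth-wise separable convolution built on top of it. The class names `CausalConv1d` and `DepthwiseSeparableConv1d` are chosen for this example and are assumptions; the layers inside this package may be structured differently.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CausalConv1d(nn.Module):
    """Conv1d whose output at time t depends only on inputs at times <= t."""

    def __init__(self, in_channels, out_channels, kernel_size, dilation=1, groups=1):
        super().__init__()
        self.left_pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(in_channels, out_channels, kernel_size,
                              dilation=dilation, groups=groups)

    def forward(self, x):                  # x: (B, C, T)
        # Pad only on the past side so the output keeps length T.
        return self.conv(F.pad(x, (self.left_pad, 0)))


class DepthwiseSeparableConv1d(nn.Module):
    """Per-channel (depth-wise) causal conv followed by a 1x1 point-wise conv."""

    def __init__(self, in_channels, out_channels, kernel_size):
        super().__init__()
        self.depthwise = CausalConv1d(in_channels, in_channels, kernel_size,
                                      groups=in_channels)
        self.pointwise = nn.Conv1d(in_channels, out_channels, kernel_size=1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))


torch.manual_seed(0)
layer = DepthwiseSeparableConv1d(8, 16, kernel_size=3)
x = torch.randn(2, 8, 50)
y = layer(x)                               # shape (2, 16, 50)

# Causality check: perturbing samples from t=30 onward leaves earlier outputs unchanged.
x2 = x.clone()
x2[:, :, 30:] += 1.0
y2 = layer(x2)
```

Splitting the convolution this way reduces the weight count from roughly in_channels * out_channels * kernel_size to in_channels * kernel_size + in_channels * out_channels.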

References

@article{liu20,
  title={Neural Homomorphic Vocoder},
  author={Z.~Liu and K.~Chen and K.~Yu},
  journal={Proc. Interspeech 2020},
  pages={240--244},
  year={2020}
}
