SincNet - Tensorflow
SincNet in TensorFlow
An implementation of SincNet using TensorFlow 2.x.
- Models are converted from the original PyTorch networks.
- The original implementation of the sinc_conv layer is not optimal because it uses loops in its call section. This version replaces those loops with matrix multiplication and a few programming tricks that let the hardware run more efficiently (25 times faster).
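The loop-versus-matrix idea can be sketched with NumPy. This is not the package's code: the function names, the cutoff values, and the use of NumPy instead of TensorFlow are assumptions made for illustration. It builds the same bank of un-windowed sinc band-pass filters twice, once with a Python loop over filters and once with a single matrix product of cutoffs and time indices.

```python
import numpy as np


def bandpass_bank_loop(f1, f2, n):
    """Build the filter bank with one Python-level iteration per filter."""
    bank = np.empty((len(f1), len(n)))
    for k in range(len(f1)):
        # np.sinc(x) is sin(pi*x)/(pi*x), so np.sinc(2*f*n) = sin(2*pi*f*n)/(2*pi*f*n)
        low = 2 * f1[k] * np.sinc(2 * f1[k] * n)
        high = 2 * f2[k] * np.sinc(2 * f2[k] * n)
        bank[k] = high - low
    return bank


def bandpass_bank_matmul(f1, f2, n):
    """Build all filters at once: (filters, 1) @ (1, taps) -> (filters, taps)."""
    n_row = n[None, :]
    arg1 = f1[:, None] @ n_row  # every f1 * n product in one matrix multiply
    arg2 = f2[:, None] @ n_row
    return 2 * f2[:, None] * np.sinc(2 * arg2) - 2 * f1[:, None] * np.sinc(2 * arg1)


# Hypothetical settings that mirror the usage example (64 filters, 129 taps, 16 kHz)
fs = 16000
num_filters, filt_dim = 64, 129
f1 = np.linspace(30.0, 7000.0, num_filters) / fs  # lower cutoffs, cycles per sample
f2 = f1 + 100.0 / fs                              # upper cutoffs, 100 Hz above
n = np.arange(filt_dim, dtype=float) - (filt_dim - 1) // 2  # centered time index

loop_bank = bandpass_bank_loop(f1, f2, n)
fast_bank = bandpass_bank_matmul(f1, f2, n)
print(fast_bank.shape, np.allclose(loop_bank, fast_bank))
```

Both functions produce the same array; the second hands the whole computation to a few array operations, which is the kind of restructuring that lets an accelerator do the work in bulk.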
SincNet
SincNet is a neural architecture for processing raw audio samples. It is a novel Convolutional Neural Network (CNN) that encourages the first convolutional layer to discover more meaningful filters. SincNet is based on parametrized sinc functions, which implement band-pass filters. Arxiv
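To make the band-pass idea concrete, here is a minimal sketch of one such filter in plain Python. It follows the formula from the SincNet paper: the difference of two sinc low-pass filters with cutoffs f1 and f2, smoothed with a Hamming window. The function names (`sinc_bandpass`, `gain`) and the 1000 to 2000 Hz band are illustrative assumptions and are not part of this package.

```python
import math


def sinc(x):
    """Unnormalized sinc: sin(x)/x, with sinc(0) defined as 1."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def sinc_bandpass(f1_hz, f2_hz, fs, length):
    """Return band-pass FIR taps: low-pass(f2) minus low-pass(f1), Hamming-windowed."""
    if length % 2 == 0:
        raise ValueError("length must be odd so the filter is symmetric")
    f1, f2 = f1_hz / fs, f2_hz / fs  # normalized cutoffs in cycles per sample
    half = (length - 1) // 2
    taps = []
    for i in range(length):
        n = i - half  # time index centered on zero
        ideal = (2 * f2 * sinc(2 * math.pi * f2 * n)
                 - 2 * f1 * sinc(2 * math.pi * f1 * n))
        window = 0.54 - 0.46 * math.cos(2 * math.pi * i / (length - 1))
        taps.append(ideal * window)
    return taps


def gain(taps, freq_hz, fs):
    """Magnitude of the filter's frequency response at freq_hz."""
    w = 2 * math.pi * freq_hz / fs
    re = sum(t * math.cos(w * k) for k, t in enumerate(taps))
    im = sum(t * math.sin(w * k) for k, t in enumerate(taps))
    return math.hypot(re, im)


taps = sinc_bandpass(f1_hz=1000.0, f2_hz=2000.0, fs=16000, length=129)
for freq in (100.0, 1500.0, 4000.0):
    print(freq, round(gain(taps, freq, 16000), 4))
```

The gain at 1500 Hz, inside the band, comes out close to 1, while 100 Hz and 4000 Hz are strongly attenuated. Each filter is fully described by its two cutoffs, which is what SincNet learns in place of every individual tap.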
Install
$ pip install sincnet-tensorflow
Usage
Demo
Training on a dummy database to check for error-free execution
A layer for the Keras Functional API
import tensorflow as tf
from tensorflow.keras.layers import Dense, Conv1D
from tensorflow.keras.layers import LeakyReLU, BatchNormalization, Flatten, MaxPooling1D, Input
from sincnet_tensorflow import SincConv1D, LayerNorm
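out_dim = 50  # number of classes

# Sinc-based band-pass filters applied directly to the raw waveform
sinc_layer = SincConv1D(N_filt=64,
                        Filt_dim=129,
                        fs=16000,
                        stride=16,
                        padding="SAME")

# Raw audio input: 32000 samples (2 s at 16 kHz), one channel
inputs = Input((32000, 1))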
x = sinc_layer(inputs)
x = LayerNorm()(x)
x = LeakyReLU(alpha=0.2)(x)
x = MaxPooling1D(pool_size=2)(x)
x = Conv1D(64, 3, strides=1, padding='valid')(x)
x = BatchNormalization(momentum=0.05)(x)
x = LeakyReLU(alpha=0.2)(x)
x = MaxPooling1D(pool_size=2)(x)
x = Conv1D(64, 3, strides=1, padding='valid')(x)
x = BatchNormalization(momentum=0.05)(x)
x = LeakyReLU(alpha=0.2)(x)
x = MaxPooling1D(pool_size=2)(x)
x = Conv1D(128, 3, strides=1, padding='valid')(x)
x = BatchNormalization(momentum=0.05)(x)
x = LeakyReLU(alpha=0.2)(x)
x = MaxPooling1D(pool_size=2)(x)
x = Conv1D(128, 3, strides=1, padding='valid')(x)
x = BatchNormalization(momentum=0.05)(x)
x = LeakyReLU(alpha=0.2)(x)
x = MaxPooling1D(pool_size=2)(x)
x = Flatten()(x)
x = Dense(256)(x)
x = BatchNormalization(momentum=0.05, epsilon=1e-5)(x)
x = LeakyReLU(alpha=0.2)(x)
x = Dense(256)(x)
x = BatchNormalization(momentum=0.05, epsilon=1e-5)(x)
x = LeakyReLU(alpha=0.2)(x)
prediction = Dense(out_dim, activation='softmax')(x)
model = tf.keras.models.Model(inputs=inputs, outputs=prediction)
model.summary()
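As a quick check against the output of model.summary(), the time dimension of this network can be traced by hand. The sketch below is plain Python with hypothetical helper names (`conv_len`, `pool_len`). It assumes that SincConv1D with padding="SAME" follows TensorFlow's usual rule (output length is the input length divided by the stride, rounded up) and that MaxPooling1D uses its default stride equal to the pool size.

```python
import math


def conv_len(length, kernel, stride=1, padding="valid"):
    """Output length of a 1-D convolution under TensorFlow's padding rules."""
    if padding.lower() == "same":
        return math.ceil(length / stride)
    return (length - kernel) // stride + 1


def pool_len(length, pool_size=2):
    """Output length of max pooling with stride == pool_size and 'valid' padding."""
    return (length - pool_size) // pool_size + 1


# Sinc layer (assumed to behave like a standard SAME-padded convolution), then pooling
length = pool_len(conv_len(32000, kernel=129, stride=16, padding="same"))
channels = 64

# Four Conv1D(kernel 3, 'valid') + MaxPooling1D(2) stages with 64, 64, 128, 128 filters
for filters in (64, 64, 128, 128):
    length = pool_len(conv_len(length, kernel=3))
    channels = filters

flat_features = length * channels  # size of the vector produced by Flatten()
print(length, channels, flat_features)
```

Under these assumptions the Flatten layer sees 60 time steps of 128 channels, that is 7680 features going into the first Dense(256) layer.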
References
@inproceedings{ravanelli2018speaker,
title={Speaker recognition from raw waveform with sincnet},
author={Ravanelli, Mirco and Bengio, Yoshua},
booktitle={2018 IEEE Spoken Language Technology Workshop (SLT)},
pages={1021--1028},
year={2018},
organization={IEEE}
}
@misc{SincNet,
title = {SincNet},
author = {Mirco Ravanelli (mravanelli)},
year = {2018},
url = {https://github.com/mravanelli/SincNet},
publisher = {Github},
}