
A TensorFlow-based implementation of Spectral Gating, an algorithm for denoising audio signals


TensorFlowSpectralGating

TFSpectralGate is a TensorFlow-based implementation of Spectral Gating, an algorithm for denoising audio signals. It inherits from the tf.keras.Model class, so it can be used either as a standalone module or as a layer inside a larger neural network architecture.

The algorithm was originally proposed by Sainburg and Gentner [1] and was previously implemented in the noisereduce GitHub repository [2]. The current implementation was developed in TensorFlow to improve computational efficiency and reduce run time.

[1] Sainburg, Tim, and Timothy Q. Gentner. “Toward a Computational Neuroethology of Vocal Communication: From Bioacoustics to Neurophysiology, Emerging Tools and Future Directions.” Frontiers in Behavioral Neuroscience, vol. 15, 2021. Frontiers, https://www.frontiersin.org/articles/10.3389/fnbeh.2021.811737.

[2] Sainburg, T. (2019). noise-reduction. GitHub. Retrieved from https://github.com/timsainb/noisereduce.


Installation

pip install tfgating

Environment

Tested on:

Python 3.10
matplotlib==3.7.1
numpy==1.24.2
soundfile==0.11.0
tensorflow==2.15.0

Please note that TensorFlowSpectralGating may work on other versions of the above dependencies, but these are the versions that were tested.


TensorFlowGating Class

Class for performing parallel spectral gating.

Usage

import tensorflow as tf
from tfgating import TFGating as TG

# Use the first GPU if one is visible, otherwise fall back to the CPU
gpus = tf.config.list_physical_devices('GPU')
device = "/GPU:0" if gpus else "/CPU:0"

# Create TensorFlowGating instance
tg = TG(sr=8000, nonstationary=True)

# Apply Spectral Gate to noisy speech signal
noisy_speech = tf.random.normal(shape=(3, 32000), dtype=tf.float32, seed=42)
with tf.device(device):
    enhanced_speech = tg(noisy_speech).numpy()

Parameters

  • sr: Sample rate of the input signal.
  • n_fft: The size of the FFT.
  • hop_length: The number of samples between adjacent STFT columns.
  • win_length: The window size for the STFT. If None, defaults to n_fft.
  • freq_mask_smooth_hz: The frequency smoothing width in Hz for the masking filter. If None, no frequency masking is applied.
  • time_mask_smooth_ms: The time smoothing width in milliseconds for the masking filter. If None, no time masking is applied.
  • n_std_thresh_stationary: The number of standard deviations above the noise mean to consider as signal for stationary noise.
  • nonstationary: Whether to use non-stationary noise masking.
  • n_movemean_nonstationary: The number of frames to use for the moving average in the non-stationary noise mask.
  • n_thresh_nonstationary: The multiplier to apply to the sigmoid function in the non-stationary noise mask.
  • temp_coeff_nonstationary: The temperature coefficient to apply to the sigmoid function in the non-stationary noise mask.
  • prop_decrease: The proportion of decrease to apply to the mask.
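
To make the units of these parameters concrete, the following sketch converts them into STFT bins and frames. It is plain arithmetic, not code from tfgating: the helper name stft_geometry, the values n_fft=512 and hop_length=128, and the assumption of a centered (padded) STFT are all illustrative and may not match the library's defaults.

```python
def stft_geometry(sr, n_fft, hop_length, n_samples,
                  freq_mask_smooth_hz=None, time_mask_smooth_ms=None):
    """Derive bin/frame counts from the documented parameters (illustrative only)."""
    hz_per_bin = sr / n_fft                   # frequency resolution of one STFT bin
    ms_per_frame = 1000.0 * hop_length / sr   # time step between adjacent frames
    out = {
        "n_bins": n_fft // 2 + 1,             # one-sided spectrum
        "n_frames": 1 + n_samples // hop_length,  # assumes a centered, padded STFT
        "hz_per_bin": hz_per_bin,
        "ms_per_frame": ms_per_frame,
    }
    # Smoothing widths expressed in bins / frames (at least one each)
    if freq_mask_smooth_hz is not None:
        out["smooth_bins"] = max(1, round(freq_mask_smooth_hz / hz_per_bin))
    if time_mask_smooth_ms is not None:
        out["smooth_frames"] = max(1, round(time_mask_smooth_ms / ms_per_frame))
    return out


# Hypothetical settings for a 4-second signal sampled at 8 kHz
geom = stft_geometry(sr=8000, n_fft=512, hop_length=128, n_samples=32000,
                     freq_mask_smooth_hz=500, time_mask_smooth_ms=50)
print(geom)
```

With these assumed settings, one bin spans 15.625 Hz and one frame step spans 16 ms, so a 500 Hz smoothing width covers 32 bins and a 50 ms width covers about 3 frames.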

Command-Line Interface

The "run.py" script provides a command-line interface for applying the SpectralGate algorithm to audio files. It processes either every audio file in the input directory or the single audio file given by 'input', and saves the processed files in the output directory. If the --graphs option is enabled, it also displays the input and output spectrograms as plots.

Usage

Here is an example of how to use the command line interface:

tfgating <input_path> --output <output_path> --nonstationary --verbose --norm --graphs --subdirs

Arguments

The script takes the following arguments:

  • input: Path to a directory containing audio files or to a single audio file.
  • --output: Path to a directory to save the output files (default: 'output').
  • --nonstationary: Whether to use non-stationary or stationary masking (default: False).
  • --verbose: Flag indicating whether verbose mode is enabled (default: False).
  • --cpu: Flag indicating whether to run the algorithm on CPU instead of GPU (default: False).
  • --subdirs: Whether to create a subdirectory for stationary or non-stationary outputs (default: False).
  • --norm: Whether to normalize the signals (default: False).
  • --graphs: Flag indicating whether to display input and output spectrograms as plots (default: False).
  • --figsize: Figure size for the spectrogram plots in inches (default: (10, 6)).
  • --figformat: Output format for the spectrogram figures (default: png).
  • --vmin: Minimum value for the color scale in dB (default: -80).
  • --vmax: Maximum value for the color scale in dB (default: None).
  • --cmap: Name of the colormap to use for the spectrogram plots (default: 'magma').
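
The sketch below shows how flags with the defaults listed above could be declared with argparse, mainly to illustrate that the boolean options are off unless passed. It is a hypothetical reconstruction, not the actual run.py: the function name build_parser and the choice of nargs=2 for --figsize are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented flags and defaults."""
    p = argparse.ArgumentParser(prog="tfgating")
    p.add_argument("input", help="directory of audio files, or a single audio file")
    p.add_argument("--output", default="output", help="directory for processed files")
    # Boolean switches: False unless the flag is present on the command line
    for flag in ("--nonstationary", "--verbose", "--cpu",
                 "--subdirs", "--norm", "--graphs"):
        p.add_argument(flag, action="store_true")
    # Plot options (only meaningful together with --graphs)
    p.add_argument("--figsize", type=float, nargs=2, default=(10, 6))
    p.add_argument("--figformat", default="png")
    p.add_argument("--vmin", type=float, default=-80)
    p.add_argument("--vmax", type=float, default=None)
    p.add_argument("--cmap", default="magma")
    return p


# Parse an example command line instead of sys.argv
args = build_parser().parse_args(["noisy/", "--nonstationary", "--graphs"])
print(args.nonstationary, args.cpu, args.output)
```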

Implementation Scheme

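TensorFlowSpectralGate supports both stationary and non-stationary noise reduction. To enable parallel computation, a few modifications were made to the original algorithm. In non-stationary spectral gating, the IIR filter was replaced with an FIR filter: an FIR output depends only on the inputs, so all frames can be computed at once, whereas an IIR filter must be evaluated sequentially because each output depends on the previous one.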
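
The difference between the two filter types can be sketched in NumPy. This is an illustration of the general idea, not the filters used in tfgating or noisereduce: the one-pole IIR smoother, the moving-average FIR kernel, and the values alpha=0.9 and n_taps=20 are all assumptions chosen for the example.

```python
import numpy as np


def iir_smooth(x, alpha):
    """One-pole IIR smoother: each output depends on the previous output."""
    y = np.empty_like(x)
    y[0] = x[0]
    for t in range(1, len(x)):  # recursion forces a sequential loop
        y[t] = alpha * y[t - 1] + (1.0 - alpha) * x[t]
    return y


def fir_smooth(x, n_taps):
    """Moving-average FIR smoother: each output depends only on the inputs."""
    kernel = np.ones(n_taps) / n_taps
    # A single convolution; every output sample can be computed independently
    return np.convolve(x, kernel, mode="same")


rng = np.random.default_rng(0)
frames = np.abs(rng.normal(size=200))  # stand-in for one frequency bin over time
smooth_iir = iir_smooth(frames, alpha=0.9)
smooth_fir = fir_smooth(frames, n_taps=20)
```

Both functions smooth the magnitude track, but only the FIR version maps directly onto a convolution that a GPU can evaluate for all frames and frequency bins in one call.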

Spectral Gating

The time-frequency (TF) mask can be estimated using either a stationary or a non-stationary method.

[Figure: Spectral Gating]

Stationary Mask Estimation

[Figure: Stationary Mask]
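
A minimal NumPy sketch of a stationary mask follows, based only on the parameter descriptions in this document (a threshold of n_std_thresh_stationary standard deviations above the noise mean, and prop_decrease to soften the mask). The function name, the use of dB magnitudes, the per-frequency statistics, and the default values are assumptions; the actual tfgating implementation may differ.

```python
import numpy as np


def stationary_mask(spec_db, noise_db, n_std_thresh=1.5, prop_decrease=1.0):
    """Estimate a stationary mask (illustrative sketch).

    spec_db, noise_db: arrays of shape (n_freq, n_frames), magnitudes in dB.
    """
    # Per-frequency noise statistics, computed over time
    mean = noise_db.mean(axis=1, keepdims=True)
    std = noise_db.std(axis=1, keepdims=True)
    thresh = mean + n_std_thresh * std
    # 1 where the cell exceeds the noise threshold (signal), 0 otherwise (noise)
    keep = (spec_db > thresh).astype(np.float32)
    # With prop_decrease < 1, noise cells are attenuated rather than zeroed
    return keep * prop_decrease + (1.0 - prop_decrease)


rng = np.random.default_rng(0)
noise_db = rng.normal(-60.0, 2.0, size=(4, 100))  # synthetic noise floor
spec_db = noise_db.copy()
spec_db[1, 10:20] = -20.0                         # a loud synthetic "signal" burst
mask = stationary_mask(spec_db, noise_db, n_std_thresh=1.5, prop_decrease=0.8)
```

Multiplying the complex STFT by such a mask and inverting it would give the denoised signal; with prop_decrease=0.8, cells classified as noise are scaled by 0.2 instead of being removed.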

Non-Stationary Mask Estimation

[Figure: Non-Stationary Mask]
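
The non-stationary case can be sketched in the same way, again inferred from the parameter descriptions rather than taken from the tfgating source. The sketch assumes that the noise floor is a moving average over n_movemean_nonstationary frames, and that the mask is a sigmoid of the relative excess over that floor, shifted by n_thresh_nonstationary and scaled by temp_coeff_nonstationary. The function name and the default values are illustrative.

```python
import numpy as np


def nonstationary_mask(mag, n_movemean=20, n_thresh=2.0,
                       temp_coeff=0.1, prop_decrease=1.0):
    """Estimate a non-stationary mask (illustrative sketch).

    mag: array of shape (n_freq, n_frames) holding STFT magnitudes.
    """
    kernel = np.ones(n_movemean) / n_movemean
    # FIR moving average along time, applied to each frequency bin
    smooth = np.stack([np.convolve(row, kernel, mode="same") for row in mag])
    # Relative excess of each cell over its local (slowly varying) average
    ratio = (mag - smooth) / (smooth + 1e-8)
    # Soft decision: close to 1 when the excess is well above n_thresh
    mask = 1.0 / (1.0 + np.exp(-(ratio - n_thresh) / temp_coeff))
    return mask * prop_decrease + (1.0 - prop_decrease)


mag = np.full((3, 200), 0.1)   # constant synthetic background
mag[1, 100:103] = 10.0         # short transient that should pass the gate
mask = nonstationary_mask(mag)
```

Because the floor tracks the local average, a steady background is suppressed wherever it occurs, while short events that rise well above their surroundings are kept.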


Run Time Comparison

TBD

Example Results

For the evaluation, a speech utterance was taken from the NOIZEUS database [3], a noisy speech corpus.

The sentence 'sp09.wav' was degraded with car noise. The noise was added at signal-to-noise ratios ranging from 0 to 15 dB, using method B of ITU-T P.56.
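
For readers who want to build similar test material, the sketch below mixes noise into a clean signal at a target SNR. It is a simplification and not the NOIZEUS procedure: it uses plain mean power over the whole signal, whereas ITU-T P.56 method B measures the active speech level. The function name mix_at_snr and the synthetic signals are illustrative.

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db):
    """Scale noise so that 10*log10(P_speech / P_noise) equals snr_db, then add it."""
    noise = noise[: len(speech)]
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    # Gain that brings the noise power to p_speech / 10^(snr_db / 10)
    gain = np.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    return speech + gain * noise


rng = np.random.default_rng(0)
speech = np.sin(2 * np.pi * 200 * np.arange(8000) / 8000)  # stand-in for speech
noise = rng.normal(size=8000)                               # stand-in for car noise
noisy = mix_at_snr(speech, noise, snr_db=5.0)
```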

[3] Hu, Y. and Loizou, P. (2007). “Subjective evaluation and comparison of speech enhancement algorithms,” Speech Communication, 49, 588-601.

Stationary Spectral Gating

Non-Stationary Spectral Gating
