Computes Short-Time Objective Intelligibility (STOI) in PyTorch

PyTorch implementation of STOI

Implementation of the classical [1, 2] and extended [3] Short-Time Objective Intelligibility (STOI) measures in PyTorch. See also Cees Taal's website and the Python implementation.

Install

pip install torch_stoi

Important warning

This implementation is intended to be used as a loss function only.
It does not exactly replicate the behavior of the original metrics, but the results should be close enough for it to serve as a training loss. See the Notes in the NegSTOILoss class.
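
To illustrate why a negated intelligibility score can serve as a loss, here is a minimal sketch in plain Python of the per-band step of classic STOI: the processed envelope is scaled to the energy of the clean one, clipped at a -15 dB signal-to-distortion bound, and correlated with the clean envelope. The names `band_correlation` and `neg_score` are hypothetical. This is not the code in NegSTOILoss, and it leaves out the third-octave filterbank, the segmentation and the VAD.

```python
import math

BETA_DB = -15.0  # lower signal-to-distortion bound used by classic STOI


def band_correlation(clean_env, proc_env):
    """Correlation between a clean and a processed envelope segment (one band)."""
    # Scale the processed envelope to the energy of the clean one.
    norm_clean = math.sqrt(sum(x * x for x in clean_env))
    norm_proc = math.sqrt(sum(y * y for y in proc_env))
    alpha = norm_clean / (norm_proc + 1e-12)
    # Clip the scaled envelope relative to the clean one.
    clip = 1.0 + 10.0 ** (-BETA_DB / 20.0)
    clipped = [min(alpha * y, x * clip) for x, y in zip(clean_env, proc_env)]
    # Sample correlation coefficient between clean and clipped envelopes.
    n = len(clean_env)
    mean_x = sum(clean_env) / n
    mean_y = sum(clipped) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(clean_env, clipped))
    var_x = sum((x - mean_x) ** 2 for x in clean_env)
    var_y = sum((y - mean_y) ** 2 for y in clipped)
    return cov / (math.sqrt(var_x * var_y) + 1e-12)


def neg_score(clean_env, proc_env):
    """Negated score: minimizing it maximizes the correlation."""
    return -band_correlation(clean_env, proc_env)


clean = [0.2, 0.9, 0.4, 1.0, 0.3, 0.7]
same = [2.0 * x for x in clean]     # scaled copy of the clean envelope
flipped = [1.2 - x for x in clean]  # envelope that moves the opposite way

print(neg_score(clean, same))     # close to -1 (best case)
print(neg_score(clean, flipped))  # positive (poor match)
```

A gradient-based optimizer minimizes its objective, so negating the score turns "higher is better" into "lower is better".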

A quantitative comparison is hopefully coming soon.

Usage

import torch
from torch import nn
from torch_stoi import NegSTOILoss

sample_rate = 16000
loss_func = NegSTOILoss(sample_rate=sample_rate)
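# Your nnet and optimizer definition here.
# A bare nn.Module() cannot be called, so a single 1-D convolution
# stands in for a real model.
nnet = nn.Conv1d(1, 1, kernel_size=3, padding=1)

noisy_speech = torch.randn(2, 16000)
clean_speech = torch.randn(2, 16000)
# Estimate clean speech (Conv1d expects a channel dimension)
est_speech = nnet(noisy_speech.unsqueeze(1)).squeeze(1)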
# Compute loss and backward (then step etc...)
loss_batch = loss_func(est_speech, clean_speech)
loss_batch.mean().backward()

Comparing NumPy and PyTorch versions: the static test

Values obtained with the NumPy version (commit 84b1bd8) are compared to the PyTorch version in the following graphs.

8 kHz

[Figure: classic STOI measure]

[Figure: extended STOI measure]

16 kHz

[Figure: classic STOI measure]

[Figure: extended STOI measure]

The 16 kHz signals used to compare the two versions contained a lot of silence, which explains why the match is very poor when VAD is disabled.
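
The effect of silence can be illustrated with a minimal sketch of the energy-based silent-frame detection that STOI relies on: frames whose energy falls more than 40 dB below the loudest frame are treated as silence. The code below is plain Python with hypothetical names (`frame_energies_db`, `active_frame_mask`), applies no analysis window for brevity, and is not the code used by either the NumPy or the PyTorch version.

```python
import math
import random

DYN_RANGE_DB = 40.0  # frames this far below the loudest frame count as silence


def frame_energies_db(signal, frame_len=256, hop=128):
    """Energy of each frame in dB (no window applied, for brevity)."""
    energies = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        energy = sum(s * s for s in frame)
        energies.append(10.0 * math.log10(energy + 1e-12))
    return energies


def active_frame_mask(signal, frame_len=256, hop=128, dyn_range_db=DYN_RANGE_DB):
    """True for frames within `dyn_range_db` of the loudest frame."""
    energies = frame_energies_db(signal, frame_len, hop)
    threshold = max(energies) - dyn_range_db
    return [e > threshold for e in energies]


# Synthetic signal: a short loud burst followed by a long, very quiet tail.
random.seed(0)
speech_like = [random.gauss(0.0, 1.0) for _ in range(2048)]
silence = [random.gauss(0.0, 1e-4) for _ in range(6144)]
signal = speech_like + silence

mask = active_frame_mask(signal)
print(f"{sum(mask)} active frames out of {len(mask)}")
```

When most frames are silent, a score computed over all frames and one computed only over active frames are based on very different material, which is consistent with the mismatch reported above.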

Comparing NumPy and PyTorch versions: training a DNN

Coming in the near future.

References

  • [1] C. H. Taal, R. C. Hendriks, R. Heusdens, and J. Jensen, 'A Short-Time Objective Intelligibility Measure for Time-Frequency Weighted Noisy Speech', ICASSP 2010, Dallas, Texas.
  • [2] C. H. Taal, R. C. Hendriks, R. Heusdens, and J. Jensen, 'An Algorithm for Intelligibility Prediction of Time-Frequency Weighted Noisy Speech', IEEE Transactions on Audio, Speech, and Language Processing, 2011.
  • [3] J. Jensen and C. H. Taal, 'An Algorithm for Predicting the Intelligibility of Speech Masked by Modulated Noise Maskers', IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2016.
