Spectra Extraction based on PyTorch

Project description

spectra_torch

Now that pytorch-kaldi is available, it is the more practical choice for real work. SpeechBrain, a PyTorch-based speech toolkit, is also on the way, and I look forward to it as a step forward for speech processing. This package is therefore mainly a way to learn how the spectral features of a signal are computed, and it remains valuable for that purpose.

This library extracts common spectral features from an audio signal, including MFCCs and filter bank energies. It mirrors the python_speech_features library, but operates on PyTorch tensors.

This library also provides energy-based voice activity detection (VAD). That part mirrors the VAD-python library, again in PyTorch style.

Citation: Rui Wang. (2020, March 14). mechanicalsea/spectra: release v0.4.0 (Version 0.4.0).

Installation

This library is available on pypi.org.

To install from PyPI:

pip install --upgrade spectra-torch

Requirements:

  • python: 3.7.3
  • torch: 1.4.0
  • torchaudio: 0.4.0

Usage

Supported features:

  • Mel Frequency Cepstral Coefficients (MFCC)
  • Filterbank Energies
  • Log Filterbank Energies
  • Voice Activity Detection (VAD)

Here are examples.

Easy demo:

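# Requires a CUDA-capable GPU, because the signal is moved to the GPU below.
import spectra_torch.base as mm
import torchaudio as ta

sig, sr = ta.load_wav('piece_20_32k.wav')  # waveform tensor and sample rate
sig = sig[0].cuda()  # take the first channel and move it to the GPU
mfcc = mm.mfcc(sig, sr)  # MFCC
starts, detection = mm.is_speech(sig, sr, speechlen=0.5)  # VAD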

Tutorial

Tutorials for MFCC and VAD are provided in the notebooks.

Each notebook walks through the computation step by step.

Performance

The difference between spectra_torch and python_speech_features:

  • Precision bias: 1e-4
  • Speed-up: 0.1 s per MFCC

MFCC

def mfcc(signal, samplerate=16000, winlen=0.025, hoplen=0.01, 
         numcep=13, nfilt=26, nfft=None, lowfreq=0, highfreq=None, 
         preemph=0.97, ceplifter=22, plusEnergy=True)
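
To make the window parameters concrete, the sketch below converts winlen and hoplen into sample counts, counts the resulting frames, and picks an FFT size when nfft is None. It is written in plain Python for clarity and is not the library's code. It assumes the conventions of python_speech_features (pad the last frame, use the smallest power of two that covers one window); frame_params is a hypothetical helper name.

```python
import math


def frame_params(num_samples, samplerate=16000, winlen=0.025, hoplen=0.01):
    """Return (frame_len, frame_step, num_frames, nfft) for a signal.

    Hypothetical helper; assumes python_speech_features-style framing.
    """
    frame_len = int(round(winlen * samplerate))   # window length in samples
    frame_step = int(round(hoplen * samplerate))  # hop length in samples

    # A signal shorter than one window still yields a single (padded) frame.
    if num_samples <= frame_len:
        num_frames = 1
    else:
        num_frames = 1 + math.ceil((num_samples - frame_len) / frame_step)

    # Assumed default when nfft is None: smallest power of two >= frame_len.
    nfft = 1
    while nfft < frame_len:
        nfft *= 2

    return frame_len, frame_step, num_frames, nfft
```

For one second of 16 kHz audio with the default arguments, this gives 400-sample windows, a 160-sample hop, 99 frames, and an FFT size of 512.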

Filterbank

def fbank(signal, samplerate=16000, winlen=0.025, hoplen=0.01, 
          nfilt=26, nfft=512, lowfreq=0, highfreq=None, preemph=0.97)
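
The nfilt, lowfreq, and highfreq arguments describe a bank of triangular filters spaced evenly on the mel scale. The sketch below shows how the filter edges could be placed on FFT bins. It is plain Python rather than the library's PyTorch code, and it assumes the mel formula and bin rounding used by python_speech_features; the function names are illustrative.

```python
import math


def hz2mel(hz):
    """Convert a frequency in Hz to mels (common 2595 * log10 formula)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel2hz(mel):
    """Convert a value in mels back to Hz."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_bin_edges(nfilt=26, nfft=512, samplerate=16000, lowfreq=0, highfreq=None):
    """Return the nfilt + 2 FFT bin indices that bound the triangular filters.

    Filter i rises from edge i to edge i + 1 and falls to edge i + 2.
    Assumption: highfreq defaults to the Nyquist frequency.
    """
    highfreq = highfreq or samplerate / 2
    low_mel, high_mel = hz2mel(lowfreq), hz2mel(highfreq)
    step = (high_mel - low_mel) / (nfilt + 1)
    # Points evenly spaced in mel, then mapped back to Hz and on to FFT bins.
    mel_points = [low_mel + i * step for i in range(nfilt + 2)]
    return [math.floor((nfft + 1) * mel2hz(m) / samplerate) for m in mel_points]
```

Because the points are evenly spaced in mel rather than in Hz, the filters are narrow at low frequencies and widen toward highfreq.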

VAD

def is_speech(signal, samplerate=16000, winlen=0.02, hoplen=0.01, 
              thresEnergy=0.6, speechlen=0.5, lowfreq=300, highfreq=3000, 
              preemph=0.97)
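
One plausible reading of these parameters is that a frame counts as speech when the share of its energy between lowfreq and highfreq exceeds thresEnergy. The sketch below illustrates that idea in plain Python with a naive DFT. It is an assumption about the approach, not the library's implementation: the names band_energy_ratio, energy_vad, and thres_energy are hypothetical, and pre-emphasis and any smoothing over speechlen are left out.

```python
import cmath
import math


def band_energy_ratio(frame, samplerate, lowfreq=300, highfreq=3000):
    """Fraction of a frame's spectral energy inside [lowfreq, highfreq]."""
    n = len(frame)
    total = 0.0
    band = 0.0
    for k in range(1, n // 2 + 1):  # skip the DC bin
        # Naive DFT coefficient for bin k (slow, but dependency-free).
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))
        power = abs(coeff) ** 2
        total += power
        if lowfreq <= k * samplerate / n <= highfreq:
            band += power
    return band / total if total > 0 else 0.0


def energy_vad(signal, samplerate=16000, winlen=0.02, hoplen=0.01,
               thres_energy=0.6):
    """Return one boolean per frame: True where the frame looks like speech."""
    frame_len = int(round(winlen * samplerate))
    frame_step = int(round(hoplen * samplerate))
    decisions = []
    for start in range(0, len(signal) - frame_len + 1, frame_step):
        frame = signal[start:start + frame_len]
        decisions.append(band_energy_ratio(frame, samplerate) > thres_energy)
    return decisions
```

With this rule, a pure tone inside the 300 to 3000 Hz band is flagged in every frame, while a tone above 3000 Hz is not. A practical detector would then smooth the per-frame decisions over a longer window.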

Reference

Thanks for your attention.

Feel free to email me with questions (rwang@tongji.edu.cn).
