Spectra Extraction based on PyTorch
spectra_torch
Now that pytorch-kaldi is available, it is more practical to use it for production work. SpeechBrain, a PyTorch-based speech toolkit, is also on its way, and I look forward to it as a solid step forward for speech processing. This package is therefore mainly intended for learning how the spectra of a signal are computed, and it remains useful for that purpose.
This library provides common spectral features of an audio signal, including MFCCs and filter bank energies. It mimics the python_speech_features library, but in PyTorch style.
It also provides energy-based voice activity detection (VAD). This part mimics the VAD-python library, but in PyTorch style.
Citation: Rui Wang. (2020, March 14). mechanicalsea/spectra: release v0.4.0 (Version 0.4.0).
Installation
This library is available on pypi.org.
To install from PyPI:
pip install --upgrade spectra-torch
Requirements:
- python: 3.7.3
- torch: 1.4.0
- torchaudio: 0.4.0
Usage
Supported features:
- Mel Frequency Cepstral Coefficients (MFCC)
- Filterbank Energies
- Log Filterbank Energies
- Voice Activity Detection (VAD)
Here are examples.
Easy demo:
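# This demo moves the signal to the GPU, so make sure CUDA is available.
import spectra_torch.base as mm
import torchaudio as ta
# Load the waveform and its sample rate.
sig, sr = ta.load_wav('piece_20_32k.wav')
# Take the first channel and move it to the GPU.
sig = sig[0].cuda()
mfcc = mm.mfcc(sig, sr) # MFCC
starts, detection = mm.is_speech(sig, sr, speechlen=0.5) # VAD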
Tutorial
Tutorials on MFCC and VAD are provided in the notebooks.
Each one walks through the computation step by step.
Performance
The difference between spectra_torch and python_speech_features:
- Precision bias: 1e-4
- Speed up: 0.1s/mfcc
MFCC
def mfcc(signal, samplerate=16000, winlen=0.025, hoplen=0.01,
numcep=13, nfilt=26, nfft=None, lowfreq=0, highfreq=None,
preemph=0.97, ceplifter=22, plusEnergy=True)
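The winlen, hoplen and preemph parameters above are easier to reason about in samples. The following is a minimal sketch in plain Python (no torch, so the arithmetic stays visible), assuming the framing and pre-emphasis conventions of python_speech_features, which this library mimics. The helper names frame_layout and preemphasis are hypothetical and are not part of spectra_torch, and the library's own framing may differ in edge cases.

```python
import math


def frame_layout(num_samples, samplerate=16000, winlen=0.025, hoplen=0.01):
    """Return (frame_len, frame_step, num_frames), all counted in samples.

    Assumption: the last partial frame is zero-padded rather than dropped,
    as python_speech_features does.
    """
    frame_len = int(math.floor(winlen * samplerate + 0.5))   # round half up
    frame_step = int(math.floor(hoplen * samplerate + 0.5))
    if num_samples <= frame_len:
        num_frames = 1
    else:
        num_frames = 1 + math.ceil((num_samples - frame_len) / frame_step)
    return frame_len, frame_step, num_frames


def preemphasis(signal, coeff=0.97):
    """Apply y[0] = x[0], y[n] = x[n] - coeff * x[n - 1]."""
    if not signal:
        return []
    head = [signal[0]]
    tail = [signal[n] - coeff * signal[n - 1] for n in range(1, len(signal))]
    return head + tail


# One second of audio at 16 kHz with the default window and hop.
print(frame_layout(16000))  # (400, 160, 99)
```

With the defaults, each frame covers 400 samples and consecutive frames start 160 samples apart, so under these assumptions one second of audio yields 99 rows of numcep coefficients.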
Filterbank
def fbank(signal, samplerate=16000, winlen=0.025, hoplen=0.01,
nfilt=26, nfft=512, lowfreq=0, highfreq=None, preemph=0.97)
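To show how nfilt, nfft, lowfreq and highfreq interact, here is a small sketch of where the triangular mel filters are placed. It assumes the mel formula and bin rounding used by python_speech_features; the function names are illustrative only, and spectra_torch may compute the same points with tensor operations instead.

```python
import math


def hz2mel(hz):
    """Convert a frequency in Hz to mels."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel2hz(mel):
    """Convert a value in mels back to Hz."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filter_bins(nfilt=26, nfft=512, samplerate=16000, lowfreq=0, highfreq=None):
    """Return the FFT-bin indices of the nfilt + 2 filter edge points."""
    highfreq = highfreq or samplerate / 2   # default to the Nyquist frequency
    low_mel, high_mel = hz2mel(lowfreq), hz2mel(highfreq)
    step = (high_mel - low_mel) / (nfilt + 1)
    # Edge points are evenly spaced on the mel scale, not in Hz.
    mel_points = [low_mel + i * step for i in range(nfilt + 2)]
    return [int(math.floor((nfft + 1) * mel2hz(m) / samplerate)) for m in mel_points]


bins = mel_filter_bins()
print(len(bins), bins[0], bins[-1])  # 28 0 256
```

Filter k rises from bins[k] to a peak at bins[k + 1] and falls back to zero at bins[k + 2], so neighbouring filters overlap by half and become wider toward high frequencies.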
VAD
def is_speech(signal, samplerate=16000, winlen=0.02, hoplen=0.01,
thresEnergy=0.6, speechlen=0.5, lowfreq=300, highfreq=3000,
preemph=0.97)
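The idea behind energy-based VAD can be sketched as follows: for every frame, compare the spectral energy between lowfreq and highfreq with the total energy, and mark the frame as speech when the ratio exceeds the threshold. This is a simplified illustration in plain Python with a naive DFT, not the library's implementation. It omits pre-emphasis and the smoothing over speechlen, and the names band_energy_ratio, is_speech_sketch and thres_energy are hypothetical.

```python
import cmath
import math


def band_energy_ratio(frame, samplerate, lowfreq=300, highfreq=3000):
    """Fraction of the frame's spectral energy that lies in [lowfreq, highfreq]."""
    n = len(frame)
    total = band = 0.0
    for k in range(1, n // 2 + 1):   # skip the DC bin
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(frame))
        energy = abs(coeff) ** 2
        total += energy
        if lowfreq <= k * samplerate / n <= highfreq:
            band += energy
    return band / total if total > 0 else 0.0


def is_speech_sketch(signal, samplerate=16000, winlen=0.02, hoplen=0.01,
                     thres_energy=0.6, lowfreq=300, highfreq=3000):
    """Return one boolean per full frame: True where the speech band dominates."""
    frame_len = int(round(winlen * samplerate))
    frame_step = int(round(hoplen * samplerate))
    flags = []
    for start in range(0, len(signal) - frame_len + 1, frame_step):
        frame = signal[start:start + frame_len]
        ratio = band_energy_ratio(frame, samplerate, lowfreq, highfreq)
        flags.append(ratio > thres_energy)
    return flags


sr = 16000
in_band = [math.sin(2 * math.pi * 1000 * t / sr) for t in range(800)]
out_of_band = [math.sin(2 * math.pi * 6000 * t / sr) for t in range(800)]
print(is_speech_sketch(in_band, sr))      # [True, True, True, True]
print(is_speech_sketch(out_of_band, sr))  # [False, False, False, False]
```

A 1000 Hz tone falls inside the 300 to 3000 Hz band and is flagged in every frame, while a 6000 Hz tone is not. A real detector would additionally smooth these per-frame decisions over a window such as speechlen so that short gaps do not split an utterance.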
Reference
- python_speech_features: https://github.com/jameslyons/python_speech_features
- VAD-python: https://github.com/marsbroshok/VAD-python
- pythonaudio: https://pytorch.org/audio/_modules/torchaudio/functional.html
Thanks for your attention.
Feel free to send questions to my email (rwang@tongji.edu.cn).