A module for audio feature extraction from Techmo
Project description
Techmo Sp. z o.o. module for audio feature extraction
How to use
:warning: Prefix the command with the `!` character if you install the module from a Jupyter notebook.
```shell
pip install techmo-wavelet
```
```python
# Import the feature extraction functions
from techmo.feature_extraction import calculate_wavelet_fft, calculate_fft_wavelet
# Install numpy first if it is not already available in your environment
import numpy as np

# The signal must be a 1-D array read from a WAV file, e.g. with SoundFile.
# Here we generate a random signal instead.
signal = np.random.uniform(-1.0, 1.0, 16000)

# Example use of the `calculate_wavelet_fft` function
features = calculate_wavelet_fft(signal)

# Example use of the `calculate_fft_wavelet` function
features = calculate_fft_wavelet(signal)
```
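The example above uses a random signal. As a minimal sketch of how a real recording could be turned into the required 1-D array, the block below reads a 16-bit PCM WAV file with the Python standard library `wave` module and numpy. The helper name `read_wav_mono` is hypothetical and not part of `techmo-wavelet`; scaling to the range [-1.0, 1.0) and averaging channels to mono are assumptions made for illustration, not requirements stated by the package.

```python
import wave

import numpy as np


def read_wav_mono(path):
    """Read a 16-bit PCM WAV file and return (signal, sample_rate).

    Hypothetical helper for illustration only. The returned signal is a
    1-D float64 array scaled to the range [-1.0, 1.0).
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM files")
        n_channels = wav.getnchannels()
        sample_rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())

    # WAV stores 16-bit samples as little-endian signed integers.
    samples = np.frombuffer(raw, dtype="<i2").astype(np.float64) / 32768.0
    if n_channels > 1:
        # Assumption: average the interleaved channels down to one channel.
        samples = samples.reshape(-1, n_channels).mean(axis=1)
    return samples, sample_rate
```

The returned `samples` array could then be passed to `calculate_wavelet_fft` or `calculate_fft_wavelet` in place of the random signal.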
The code implements 2 functions to extract features:
The `calculate_wavelet_fft` function implements an algorithm consisting of the following stages:
- If the number of samples N is greater than or equal to 4800, the signal is divided into int(N/2400) segments, each containing int(N/int(N/2400)) samples. 60 features are computed for each segment, so the result holds 60*int(N/2400) values in total,
- Segments are processed by the Hann window,
- Segments are normalized separately,
- Each segment is processed by the Wavelet Transform (WT),
- Each WT subband is subjected to the Fast Fourier Transform (FFT),
- The FFT spectra are passed through the triangular filters,
- The logarithms of the filter outputs are computed to obtain a feature sub-vector of length 60 for each segment,
- The sub-vectors are concatenated into the final feature matrix, a numpy ndarray of shape (int(N/2400), 60).
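The first three stages (segmentation, Hann windowing, per-segment normalization) can be sketched in numpy as follows. This is an illustration under stated assumptions, not the package's implementation: the function name `split_window_normalize` is hypothetical, short signals are assumed to form a single segment, leftover samples at the end are assumed to be dropped, and peak normalization is assumed because the description does not say which normalization is used.

```python
import numpy as np


def split_window_normalize(signal, base_len=2400):
    """Split a 1-D signal into segments, apply a Hann window, normalize.

    Hypothetical sketch; base_len=2400 mirrors the thresholds listed
    for calculate_wavelet_fft.
    """
    signal = np.asarray(signal, dtype=np.float64)
    n = signal.shape[0]

    # int(N / base_len) segments when N >= 2 * base_len.
    # Assumption: shorter signals are treated as one segment.
    n_segments = n // base_len if n >= 2 * base_len else 1
    seg_len = n // n_segments  # int(N / int(N / base_len))

    # Assumption: samples that do not fill a whole segment are dropped.
    segments = signal[: n_segments * seg_len].reshape(n_segments, seg_len)

    # Hann window applied to every segment.
    segments = segments * np.hanning(seg_len)

    # Assumption: peak normalization, done separately for each segment.
    peak = np.max(np.abs(segments), axis=1, keepdims=True)
    peak[peak == 0] = 1.0  # avoid dividing silent segments by zero
    return segments / peak


segments = split_window_normalize(np.random.uniform(-1.0, 1.0, 16000))
print(segments.shape)  # (6, 2666)
```

For the 16000-sample signal from the usage example, int(16000/2400) = 6 segments of int(16000/6) = 2666 samples are produced, which matches a final feature matrix of shape (6, 60).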
The `calculate_fft_wavelet` function implements an algorithm consisting of the following stages:
- If the number of samples N is greater than or equal to 9600, the signal is divided into int(N/4800) segments, each containing int(N/int(N/4800)) samples. 60 features are computed for each segment, so the result holds 60*int(N/4800) values in total,
- Segments are processed by the Hann window,
- Segments are normalized separately,
- Speech segments are processed by the Fast Fourier Transform (FFT),
- The complex spectra are subjected to Wavelet Transform (WT),
- Absolute values of WT are calculated,
- The computed absolute values (moduli) are passed through the triangular filters,
- The logarithms of the filter outputs are computed to obtain a feature sub-vector of length 60 for each segment,
- The sub-vectors are concatenated into the final feature matrix, a numpy ndarray of shape (int(N/4800), 60).
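Both functions end with triangular filtration followed by a logarithm. The sketch below shows that final stage in numpy for an arbitrary array of non-negative magnitudes. It is only an illustration: the names `triangular_filterbank` and `log_filter_outputs` are hypothetical, the 60 filters are assumed to be evenly spaced and overlapping, and the small constant `eps` is an assumption added to keep the logarithm finite. The actual filter layout is defined by the package and the paper cited below.

```python
import numpy as np


def triangular_filterbank(n_filters, n_bins):
    """Return an (n_filters, n_bins) matrix of triangular filters.

    Assumption: filter edges are evenly spaced over the bin indices and
    neighbouring filters overlap by half their width.
    """
    if n_bins < n_filters + 2:
        raise ValueError("need at least n_filters + 2 input bins")
    edges = np.linspace(0, n_bins - 1, n_filters + 2)
    bins = np.arange(n_bins)
    bank = np.zeros((n_filters, n_bins))
    for i in range(n_filters):
        left, center, right = edges[i], edges[i + 1], edges[i + 2]
        rising = (bins - left) / (center - left)
        falling = (right - bins) / (right - center)
        # The triangle is the lower of the two slopes, clipped at zero.
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank


def log_filter_outputs(magnitudes, n_filters=60, eps=1e-10):
    """Map magnitudes of shape (n_segments, n_bins) to (n_segments, n_filters)."""
    magnitudes = np.asarray(magnitudes, dtype=np.float64)
    bank = triangular_filterbank(n_filters, magnitudes.shape[-1])
    # eps (an assumption) guards against log(0) for empty filter outputs.
    return np.log(magnitudes @ bank.T + eps)
```

Applied to one row of magnitudes per segment, this yields one sub-vector of length 60 per segment, so stacking the rows gives the (number of segments, 60) matrix described in the lists above.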
The algorithm is described in detail in the paper: M.Ziołko, M.Kucharski, S.Pałka, B.Ziołko, K.Kaminski, I.Kowalska, A.Szpakowicz, J.Jamiołkowski, M.Chlabicz, M.Witkowski: Fourier-Wavelet Voice Analysis Applied to Medical Screening Tests. Proceedings of INTERSPEECH 2021 (under review).
File details
Details for the file techmo-wavelet-0.3.1.tar.gz.
File metadata
- Download URL: techmo-wavelet-0.3.1.tar.gz
- Upload date:
- Size: 5.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/4.3.1 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.61.0 CPython/3.9.5
File hashes
Algorithm | Hash digest
---|---
SHA256 | 85ae582f8f9a19043cc2ef0199d2eee88181d368e5d7d094f1a1807a79ed3c19
MD5 | 48a48c9f8c10a7cb931968846b806242
BLAKE2b-256 | 7fe11304971d9f591324cd8dc2e2a3be536d8faa1394c21055e0bd24e16ac70f
File details
Details for the file techmo_wavelet-0.3.1-py3-none-any.whl.
File metadata
- Download URL: techmo_wavelet-0.3.1-py3-none-any.whl
- Upload date:
- Size: 6.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/4.3.1 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.61.0 CPython/3.9.5
File hashes
Algorithm | Hash digest
---|---
SHA256 | 94903d112c48a414480b15172e466728737b975c2566ed47471fec4500b90ab3
MD5 | 624461e023aed0a933a3671ba4525b50
BLAKE2b-256 | 4456464cda09054db963f71703f85b3489e4630af4c55cda8a644d2cf27812f3