
A module for audio feature extraction from Techmo


Techmo Sp. z o.o. module for audio feature extraction

How to use

:warning: Prefix the command with the ! character if you install the module from a Jupyter notebook

pip install techmo-wavelet 

# Import the feature extraction functions
from techmo.feature_extraction import calculate_wavelet_fft, calculate_fft_wavelet

# Install numpy first if it is not already available in your environment
import numpy as np 

# The signal must be a 1-D array read from a WAV file, e.g. by using SoundFile. Here we generate a random signal instead
signal = np.random.uniform(-1.0, 1.0, 16000)

# Here's an example of how to use `calculate_wavelet_fft` function
features = calculate_wavelet_fft(signal)

# Here's an example of how to use `calculate_fft_wavelet` function
features = calculate_fft_wavelet(signal)
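The example above uses a random signal. As a rough sketch of how a real recording could be turned into the required 1-D array, the helper below reads a WAV file with Python's standard-library `wave` module rather than SoundFile, so it needs no extra dependency. The function name `read_wav_mono`, the 16-bit PCM restriction, and the averaging of channels to mono are assumptions made for illustration; they are not part of techmo-wavelet.

```python
import wave

import numpy as np


def read_wav_mono(path):
    """Read a 16-bit PCM WAV file into a 1-D float array scaled to [-1, 1).

    Hypothetical helper, not part of techmo-wavelet.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM files")
        n_channels = wav.getnchannels()
        sample_rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())

    # Little-endian signed 16-bit integers -> floats in [-1, 1)
    samples = np.frombuffer(raw, dtype="<i2").astype(np.float64) / 32768.0
    # Channels are interleaved: reshape to (frames, channels), then average to mono
    samples = samples.reshape(-1, n_channels).mean(axis=1)
    return samples, sample_rate
```

The returned array can then be passed to `calculate_wavelet_fft` or `calculate_fft_wavelet` in place of the random signal.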

The module implements two feature extraction functions:

The calculate_wavelet_fft function implements an algorithm consisting of the following stages:

  1. If the number of samples N is greater than or equal to 4800, the signal is divided into int(N/2400) segments of int(N/int(N/2400)) samples each. 60 features are computed per segment, so the output holds 60*int(N/2400) values in total,
  2. Segments are processed by the Hann window,
  3. Segments are normalized separately,
  4. Each segment is processed by the Wavelet Transform (WT),
  5. Each WT subband is subjected to the Fast Fourier Transform (FFT),
  6. The FFT spectra are passed through triangular filters, giving 60 filter outputs per segment,
  7. The logarithms of filter outputs are computed to obtain the feature sub-vectors of length 60 for each segment.
  8. Sub-vectors are concatenated to obtain the final feature matrix, a numpy ndarray of shape (int(N/2400), 60).
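Stages 1 to 3 of the list above can be sketched in NumPy to make the segment arithmetic concrete. This is only an illustration under stated assumptions, not the package's implementation: the function name `split_window_normalise` is hypothetical, signals shorter than twice the base length are assumed to form a single segment, leftover samples at the end are assumed to be dropped, and "normalized" is assumed to mean scaling each segment to a peak amplitude of 1.

```python
import numpy as np


def split_window_normalise(signal, base_len=2400):
    """Split a 1-D signal into segments, apply a Hann window, normalise each.

    Hypothetical sketch of stages 1-3; not the techmo-wavelet code.
    """
    signal = np.asarray(signal, dtype=np.float64)
    n = len(signal)
    if n == 0:
        raise ValueError("signal must not be empty")

    # Stage 1: int(N / base_len) segments when N >= 2 * base_len.
    # Assumption: shorter signals are treated as one segment.
    n_segments = int(n / base_len) if n >= 2 * base_len else 1
    seg_len = int(n / n_segments)
    # Assumption: samples that do not fill a whole segment are dropped.
    segments = signal[: n_segments * seg_len].reshape(n_segments, seg_len)

    # Stage 2: Hann window applied to every segment.
    segments = segments * np.hanning(seg_len)

    # Stage 3 (assumption): peak normalisation, done per segment.
    peak = np.abs(segments).max(axis=1, keepdims=True)
    return segments / np.where(peak > 0, peak, 1.0)
```

For the 16000-sample signal from the usage example, this gives int(16000/2400) = 6 segments of int(16000/6) = 2666 samples, which matches the 6 rows of the (int(N/2400), 60) feature matrix described above. Calling it with `base_len=4800` gives the 3 segments of 5333 samples used by the second function.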

The calculate_fft_wavelet function implements an algorithm consisting of the following stages:

  1. If the number of samples N is greater than or equal to 9600, the signal is divided into int(N/4800) segments of int(N/int(N/4800)) samples each. 60 features are computed per segment, so the output holds 60*int(N/4800) values in total,
  2. Segments are processed by the Hann window,
  3. Segments are normalized separately,
  4. Speech segments are processed by the Fast Fourier Transform (FFT),
  5. The complex spectra are subjected to Wavelet Transform (WT),
  6. Absolute values of WT are calculated,
  7. The computed absolute values (moduli) are passed through triangular filters,
  8. The logarithms of filter outputs are computed to obtain the feature sub-vectors of length 60 for each segment.
  9. Sub-vectors are concatenated to obtain the final feature matrix, a numpy ndarray of shape (int(N/4800), 60).
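The last stages of both algorithms, triangular filtration followed by a logarithm, can be illustrated with a small NumPy sketch. The description above does not say how the filters are spaced or how they are distributed over the wavelet subbands, so everything here is an assumption for illustration only: the names `triangular_filterbank` and `log_filter_outputs` are hypothetical, the 60 filters are evenly spaced and half-overlapping, they are applied to a single vector of magnitudes, and a small `eps` is added before the logarithm to avoid log(0).

```python
import numpy as np


def triangular_filterbank(n_bins, n_filters=60):
    """Build evenly spaced, half-overlapping triangular filters.

    Hypothetical sketch; the real filter layout is not documented here.
    Returns an array of shape (n_filters, n_bins).
    """
    # n_filters + 2 edge points: filter i rises from edge i to edge i + 1
    # and falls back to zero at edge i + 2.
    edges = np.linspace(0, n_bins - 1, n_filters + 2)
    positions = np.arange(n_bins)
    bank = np.zeros((n_filters, n_bins))
    for i in range(n_filters):
        left, centre, right = edges[i], edges[i + 1], edges[i + 2]
        rising = (positions - left) / (centre - left)
        falling = (right - positions) / (right - centre)
        # The triangle is the lower of the two slopes, clipped at zero.
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank


def log_filter_outputs(magnitudes, n_filters=60, eps=1e-10):
    """Apply the filterbank to a magnitude vector and take the logarithm."""
    magnitudes = np.asarray(magnitudes, dtype=np.float64)
    bank = triangular_filterbank(len(magnitudes), n_filters)
    # eps (assumption) keeps the logarithm finite for all-zero inputs.
    return np.log(bank @ magnitudes + eps)
```

Each call to `log_filter_outputs` returns one sub-vector of length 60; stacking one such vector per segment gives a matrix with 60 columns, as in the shapes described above.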

The algorithm is described in detail in the paper: M.Ziołko, M.Kucharski, S.Pałka, B.Ziołko, K.Kaminski, I.Kowalska, A.Szpakowicz, J.Jamiołkowski, M.Chlabicz, M.Witkowski: Fourier-Wavelet Voice Analysis Applied to Medical Screening Tests. Proceedings of the INTERSPEECH 2021 (under review).
