PyTorch Speech Feature extraction

pytorch_speech_features

A simple PyTorch reimplementation of the python_speech_features library.

Uses

  • Great for interpretability experiments - All audio processing operations are differentiable, so gradients of the results can be backpropagated to the original signal tensor.
  • Supports hybrid model design - Parametric (learnable) operations can be placed at different stages of audio processing.
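Both points can be illustrated with plain PyTorch. The sketch below does not use this library's code: `log_frame_energy` is a hypothetical stand-in feature (pre-emphasis, framing, log energy), and treating the pre-emphasis coefficient as a learnable parameter is an assumption made only for illustration.

```python
import torch

def log_frame_energy(signal, preemph, frame_len=100, frame_step=50):
    """Hypothetical stand-in feature: pre-emphasis, framing, log energy."""
    # Pre-emphasis filter: y[n] = x[n] - preemph * x[n-1]
    emphasized = torch.cat([signal[:1], signal[1:] - preemph * signal[:-1]])
    # Slice into overlapping frames of shape (num_frames, frame_len)
    frames = emphasized.unfold(0, frame_len, frame_step)
    # Log energy per frame; the small constant avoids log(0)
    return torch.log(frames.pow(2).sum(dim=1) + 1e-8)

torch.manual_seed(0)
signal = torch.randn(400, requires_grad=True)         # raw audio samples
preemph = torch.nn.Parameter(torch.tensor(0.97))      # learnable coefficient

features = log_frame_energy(signal, preemph)
features.sum().backward()

# Gradients reach both the input signal and the parametric stage.
print(signal.grad.shape, preemph.grad)
```

The gradient with respect to `signal` is what an interpretability method such as a saliency map would inspect, while the gradient with respect to `preemph` is what an optimizer would use in a hybrid model.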

Example use

Installation

Install from PyPI

pip install pytorch-speech-features

Install from GitHub

git clone https://github.com/Debjoy10/pytorch_speech_features
cd pytorch_speech_features
python setup.py develop

Usage

The functions are the same as in python_speech_features (refer to its documentation).

Instead of passing the input signal as a list or NumPy array, pass a tensor. Tensors on both 'cpu' and 'cuda' devices are supported.

See example use given above.
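A minimal sketch of preparing the input is given below. The import path `pytorch_speech_features` and the call `mfcc(signal, samplerate)` are assumptions based on the python_speech_features interface, so the import is guarded; the placeholder samples stand in for audio that would normally be read from a file.

```python
import torch

# Use the GPU when one is available, otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

samplerate = 16000
# Placeholder one-second signal; real audio would be loaded from a wav file.
samples = [0.0, 0.5, -0.5, 0.25] * 4000
signal = torch.tensor(samples, dtype=torch.float32, device=device)

try:
    # Assumed import path, mirroring python_speech_features.
    from pytorch_speech_features import mfcc
except ImportError:
    mfcc = None

if mfcc is not None:
    # Assumed to follow python_speech_features: mfcc(signal, samplerate)
    features = mfcc(signal, samplerate)
    print(features.shape)
```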

Supported features:

  • Mel Frequency Cepstral Coefficients
  • Filterbank Energies
  • Log Filterbank Energies
  • Spectral Subband Centroids

Testing

The pytorch_speech_features operations are tested for two things:

  1. Similarity to python_speech_features outputs.
  2. Gradient correctness via Autograd Gradcheck.

The testing Python notebook is available through the project's "Open In Colab" link.
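The two checks can be sketched as follows. This is not the project's test notebook: `frame_energy_torch` and `frame_energy_numpy` are hypothetical stand-ins for a library feature and its NumPy reference, used only to show the shape of each test.

```python
import numpy as np
import torch

def frame_energy_torch(signal, frame_len=4, frame_step=2):
    # Differentiable PyTorch version (stand-in for a library feature)
    frames = signal.unfold(0, frame_len, frame_step)
    return frames.pow(2).sum(dim=1)

def frame_energy_numpy(signal, frame_len=4, frame_step=2):
    # NumPy reference (stand-in for python_speech_features)
    starts = range(0, len(signal) - frame_len + 1, frame_step)
    return np.array([np.sum(signal[s:s + frame_len] ** 2) for s in starts])

torch.manual_seed(0)
# gradcheck compares against finite differences, so use double precision.
x = torch.randn(16, dtype=torch.float64, requires_grad=True)

# 1. Similarity: the PyTorch output should match the NumPy reference.
outputs_match = np.allclose(
    frame_energy_torch(x).detach().numpy(),
    frame_energy_numpy(x.detach().numpy()),
)

# 2. Gradient correctness: analytical vs. numerical gradients.
gradients_ok = torch.autograd.gradcheck(
    frame_energy_torch, (x,), eps=1e-6, atol=1e-5
)

print(outputs_match, gradients_ok)
```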

Citation

@misc{https://doi.org/10.5281/zenodo.8021586,
  doi = {10.5281/ZENODO.8021586},
  url = {https://zenodo.org/record/8021586},
  author = {{Debjoy Saha}},
  title = {Debjoy10/pytorch_speech_features: Release v0.0.1},
  publisher = {Zenodo},
  year = {2023},
  copyright = {Open Access}
}

References

  • python_speech_features library
  • Sample english.wav
