PyTorch Speech Feature extraction
pytorch_speech_features
A simple PyTorch reimplementation of the python_speech_features library.
Uses
- Great for interpretability experiments: all audio processing operations are differentiable, so results can be backpropagated to the original signal tensor.
- Supports hybrid model design: parametric operations can be inserted at different stages of audio processing.
Installation
Install from PyPI
pip install pytorch-speech-features
Install from GitHub
git clone https://github.com/Debjoy10/pytorch_speech_features
cd pytorch_speech_features
python setup.py develop
Usage
The functions are the same as in python_speech_features; refer to its documentation for parameters and return values.
Pass the input signal as a tensor instead of a list or NumPy array. Both 'cpu' and 'cuda' tensors are supported.
Supported features:
- Mel Frequency Cepstral Coefficients
- Filterbank Energies
- Log Filterbank Energies
- Spectral Subband Centroids
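A minimal usage sketch follows. It assumes that `mfcc` can be imported from the top level of `pytorch_speech_features` and accepts `(signal, samplerate)` like `python_speech_features.mfcc`, and that it returns a tensor. Neither assumption is confirmed here, so check the repository before relying on it. `make_tone` and `mfcc_input_gradient` are hypothetical helper names used only for illustration.

```python
import math

import torch


def make_tone(samplerate=16000, seconds=1.0, freq=440.0):
    """Build a synthetic sine wave to stand in for a real recording."""
    t = torch.arange(int(samplerate * seconds), dtype=torch.float32) / samplerate
    return torch.sin(2 * math.pi * freq * t)


def mfcc_input_gradient(signal, samplerate=16000):
    """Compute MFCCs and the gradient of their sum w.r.t. the raw signal."""
    # Assumption: top-level import path and python_speech_features-style signature.
    from pytorch_speech_features import mfcc

    signal = signal.clone().requires_grad_(True)
    feats = mfcc(signal, samplerate)
    # Any scalar function of the features can be backpropagated to the signal.
    feats.sum().backward()
    return feats, signal.grad
```

Calling `mfcc_input_gradient(make_tone())` should give the features together with a gradient tensor of the same shape as the input signal, which is the starting point for saliency-style interpretability experiments. To run on a GPU, move the signal with `.to('cuda')` before the call.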
Testing
Two properties are tested for pytorch_speech_features operations:
- Similarity to python_speech_features outputs.
- Gradient correctness via Autograd Gradcheck.
The tests are provided as a Python notebook.
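The two checks can be sketched as follows. This is not the project's notebook: to stay self-contained it uses a stand-in feature (`log_power_spectrum`, a hypothetical function written only for this illustration) rather than the library's own operations, and compares it against a NumPy reference instead of python_speech_features. The same pattern should apply to the real functions.

```python
import numpy as np
import torch


def log_power_spectrum(signal, frame_len=64, hop=32):
    """Stand-in differentiable feature: log power spectrum of windowed frames."""
    frames = signal.unfold(0, frame_len, hop)  # (num_frames, frame_len)
    window = torch.hamming_window(
        frame_len, periodic=False, dtype=signal.dtype, device=signal.device
    )
    spec = torch.fft.rfft(frames * window, n=frame_len)
    power = (spec.real ** 2 + spec.imag ** 2) / frame_len
    return torch.log(power + 1e-6)


def log_power_spectrum_numpy(signal, frame_len=64, hop=32):
    """NumPy reference for the same computation."""
    starts = range(0, len(signal) - frame_len + 1, hop)
    frames = np.stack([signal[s:s + frame_len] for s in starts])
    spec = np.fft.rfft(frames * np.hamming(frame_len), n=frame_len)
    power = (spec.real ** 2 + spec.imag ** 2) / frame_len
    return np.log(power + 1e-6)


torch.manual_seed(0)
# gradcheck compares against finite differences, so it needs double precision.
sig = torch.randn(256, dtype=torch.double, requires_grad=True)

# Check 1: the PyTorch output agrees with the reference implementation.
matches_reference = np.allclose(
    log_power_spectrum(sig).detach().numpy(),
    log_power_spectrum_numpy(sig.detach().numpy()),
    atol=1e-8,
)

# Check 2: analytical gradients agree with numerical ones.
gradients_ok = torch.autograd.gradcheck(
    log_power_spectrum, (sig,), eps=1e-6, atol=1e-4
)
print(matches_reference, gradients_ok)
```

`torch.autograd.gradcheck` raises an exception by default when the gradients disagree, so reaching the final line already indicates that the gradient check passed.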
Citation
@misc{https://doi.org/10.5281/zenodo.8021586,
doi = {10.5281/ZENODO.8021586},
url = {https://zenodo.org/record/8021586},
author = {{Debjoy Saha}},
title = {Debjoy10/pytorch_speech_features: Release v0.0.1},
publisher = {Zenodo},
year = {2023},
copyright = {Open Access}
}