PyTorch Speech Feature extraction
pytorch_speech_features
A simple PyTorch reimplementation of the python_speech_features library.
Uses
- Great for interpretability experiments - All audio processing operations are differentiable, so results can be backpropagated to the original signal tensor.
- Supports hybrid model design - Parametric (learnable) operations can be placed at different stages of audio processing.
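Both uses can be illustrated with a minimal sketch in plain PyTorch. This is not the library's implementation: the function `log_frame_energies`, its parameters, and the learnable pre-emphasis coefficient are hypothetical, chosen only to show gradients flowing back to the raw signal and to a parameter inside the feature pipeline.

```python
import torch

def log_frame_energies(signal, preemph, frame_len=400, frame_step=160, nfft=512):
    """Hypothetical toy feature: log energy of each frame of a 1-D signal."""
    # Pre-emphasis: y[t] = x[t] - preemph * x[t-1]
    emphasized = torch.cat([signal[:1], signal[1:] - preemph * signal[:-1]])
    # Slice into overlapping frames of shape (num_frames, frame_len)
    frames = emphasized.unfold(0, frame_len, frame_step)
    # Power spectrum of each frame (frames are zero-padded to nfft)
    spectrum = torch.fft.rfft(frames, n=nfft)
    power = (spectrum.real ** 2 + spectrum.imag ** 2) / nfft
    # Log of the total energy per frame; the epsilon avoids log(0)
    return torch.log(power.sum(dim=1) + 1e-10)

torch.manual_seed(0)
signal = torch.randn(16000, requires_grad=True)      # one second at 16 kHz
preemph = torch.nn.Parameter(torch.tensor(0.97))     # learnable coefficient

features = log_frame_energies(signal, preemph)
features.sum().backward()

print(features.shape)      # one value per frame
print(signal.grad.shape)   # gradient with respect to the raw signal
print(preemph.grad)        # gradient with respect to the parameter
```

The gradient on `signal` is what an interpretability experiment would inspect, while the gradient on `preemph` is what an optimizer would use in a hybrid model.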
Installation
Install from PyPI
pip install pytorch-speech-features
Install from GitHub
git clone https://github.com/Debjoy10/pytorch_speech_features
cd pytorch_speech_features
python setup.py develop
Usage
The functions are the same as in python_speech_features (refer to its documentation here).
Instead of passing the input signal as a list or NumPy array, pass a tensor (both 'cpu' and 'cuda' devices are supported).
See example use given above.
Supported features:
- Mel Frequency Cepstral Coefficients
- Filterbank Energies
- Log Filterbank Energies
- Spectral Subband Centroids
Testing
Two properties are tested for pytorch_speech_features operations:
- Similarity to python_speech_features outputs.
- Gradient correctness via Autograd Gradcheck.
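The two checks can be sketched on a toy operation as follows. This is an illustration under stated assumptions, not the project's test notebook: `power_spectrum` is a hypothetical stand-in for a library operation, and a naive DFT written with the standard library plays the role of the python_speech_features reference.

```python
import cmath
import torch

def power_spectrum(frames, nfft):
    """Hypothetical stand-in for a differentiable feature operation."""
    spectrum = torch.fft.rfft(frames, n=nfft)
    return (spectrum.real ** 2 + spectrum.imag ** 2) / nfft

def reference_power_spectrum(frame, nfft):
    """Naive DFT reference in pure Python (stand-in for the NumPy library)."""
    padded = list(frame) + [0.0] * (nfft - len(frame))
    out = []
    for k in range(nfft // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * n / nfft)
                    for n, x in enumerate(padded))
        out.append(abs(coeff) ** 2 / nfft)
    return out

torch.manual_seed(0)
# gradcheck compares against finite differences, so use double precision
frames = torch.randn(3, 8, dtype=torch.float64, requires_grad=True)

# Check 1: outputs agree with the reference implementation
expected = torch.tensor(
    [reference_power_spectrum(f.tolist(), 16) for f in frames.detach()],
    dtype=torch.float64,
)
outputs_match = torch.allclose(power_spectrum(frames, 16), expected, atol=1e-10)

# Check 2: analytical gradients agree with numerical gradients
gradients_match = torch.autograd.gradcheck(
    lambda x: power_spectrum(x, 16), (frames,)
)

print(outputs_match, gradients_match)
```

Small double-precision inputs keep `gradcheck` fast and numerically reliable; single-precision inputs often fail the check because of finite-difference error rather than an incorrect gradient.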
Find the testing python notebook here -
Citation
@misc{https://doi.org/10.5281/zenodo.8021586,
doi = {10.5281/ZENODO.8021586},
url = {https://zenodo.org/record/8021586},
author = {{Debjoy Saha}},
title = {Debjoy10/pytorch_speech_features: Release v0.0.1},
publisher = {Zenodo},
year = {2023},
copyright = {Open Access}
}