Audio ML Spec Tools

Convenience functions for generating ML features from audio data. Removes the dependency on torchaudio from audio ML pipelines. Unlike PyTorch features, these functions can be exported to ExecuTorch and ONNX with no issues.

Motivation

Except in specific circumstances like wav2vec, raw audio has proven to be a much worse input for ML models than spectrogram-based features across a wide variety of problem domains, including environmental sound classification (Guzhov et al. (2021)), singing technique classification (Yamamoto et al. (2021)), and ship classification (Xie, Ren, and Xu (2024)).

There is no scientific consensus on the relative benefits of mel-scale spectrograms, linear spectrograms, and MFCCs. Different researchers have shown good results with each type of feature; see respectively Raponi, Oligeri, and Ali (2021), Jung et al. (2021), and Razani et al. (2017).

With this library, you can easily try as many feature extraction methods as you want to see what works for your use case.
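As a rough illustration of why mel-scale and linear spectrograms can behave differently, the sketch below compares how the two scales space their frequency bands. It uses only the standard library and the common HTK-style mel formula; the function names are hypothetical and are not part of this library's API.

```python
import math


def hz_to_mel(freq_hz: float) -> float:
    """Convert Hz to mel using the HTK-style formula."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_centers(n_bands: int, f_min: float, f_max: float) -> list[float]:
    """Center frequencies (Hz) of bands spaced evenly on the mel scale."""
    mel_min, mel_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (mel_max - mel_min) / (n_bands + 1)
    return [mel_to_hz(mel_min + step * (i + 1)) for i in range(n_bands)]


def linear_band_centers(n_bands: int, f_min: float, f_max: float) -> list[float]:
    """Center frequencies (Hz) of bands spaced evenly in Hz."""
    step = (f_max - f_min) / (n_bands + 1)
    return [f_min + step * (i + 1) for i in range(n_bands)]


if __name__ == "__main__":
    # Hypothetical settings: 8 bands between 0 Hz and 8 kHz.
    for name, centers in (
        ("mel", mel_band_centers(8, 0.0, 8000.0)),
        ("linear", linear_band_centers(8, 0.0, 8000.0)),
    ):
        print(name, [round(c) for c in centers])
```

Mel-spaced bands cluster at low frequencies and spread out at high ones, while linear bands keep a constant spacing; which trade-off helps depends on the task.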

Prerequisites

  • Python 3.12 runtime
  • pip for package installation
  • A system installation of FFmpeg (required by torchcodec)
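The prerequisites above can be checked before installing with a short standard-library script. This is a minimal sketch: the helper name `check_prerequisites` is hypothetical, and it assumes that "Python 3.12" means 3.12 or newer and that FFmpeg is discoverable as an `ffmpeg` executable on `PATH`.

```python
import shutil
import sys


def check_prerequisites(min_python: tuple[int, int] = (3, 12)) -> dict[str, bool]:
    """Report whether the interpreter and FFmpeg look usable.

    Assumption: any Python >= min_python is acceptable, and FFmpeg
    is installed as an `ffmpeg` binary reachable through PATH.
    """
    return {
        "python": sys.version_info[:2] >= min_python,
        "ffmpeg": shutil.which("ffmpeg") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

A missing `ffmpeg` entry only means the binary was not found on `PATH`; a library-only FFmpeg installation would not be detected by this check.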

Installation

  • Install using pip:
pip install AudioMlSpecTools

Local Installation

To install the library and its dependencies from a local checkout, run pip from the repository root:

pip install .

Usage

See examples/features.py.

Testing

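# If needed, install test dependencies (quoted so shells such as zsh do not expand the brackets)
# pip install ".[test]"

# Run the tests under coverage, then print a report that omits fully covered files
python -m coverage run -m unittest discover -s test -p "*_test.py" && python -m coverage report --skip-covered
# Optionally generate an HTML coverage report
python -m coverage html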

Versioning

We use SemVer for versioning. For the versions available, see the tags on this repository.

Authors

  • Ryan Quinn - Initial work

License

MIT.
