Audio augmentations library for PyTorch, for audio in the time-domain.
Audio Augmentations
Audio augmentations library for PyTorch that operates on audio in the time-domain, with support for the stochastic data augmentations often used in self-supervised / contrastive learning.
Usage
We can define several audio augmentations, which will be applied sequentially to a raw audio waveform:
import torchaudio
from audio_augmentations import *

# Load the waveform and its sample rate.
audio, sr = torchaudio.load("tests/classical.00002.wav")

# Crop length in samples: 5 seconds of audio.
num_samples = sr * 5

# Each RandomApply wrapper applies its transforms with probability p.
transforms = [
    RandomResizedCrop(n_samples=num_samples),
    RandomApply([PolarityInversion()], p=0.8),
    RandomApply([Noise(min_snr=0.3, max_snr=0.5)], p=0.3),
    RandomApply([Gain()], p=0.2),
    RandomApply([HighLowPass(sample_rate=sr)], p=0.8),
    RandomApply([Delay(sample_rate=sr)], p=0.5),
    RandomApply([PitchShift(
        n_samples=num_samples,
        sample_rate=sr
    )], p=0.4),
    RandomApply([Reverb(sample_rate=sr)], p=0.3)
]
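To illustrate what "applied with probability p" means for the wrappers above, here is a minimal, dependency-free sketch. It is not the library's implementation: `RandomApplySketch` and `invert_polarity` are hypothetical names, and plain Python lists stand in for waveform tensors.

```python
import random


class RandomApplySketch:
    """Hypothetical stand-in for RandomApply: run `transforms` with probability p."""

    def __init__(self, transforms, p=0.5, rng=None):
        self.transforms = transforms
        self.p = p
        # An injectable random source keeps the sketch easy to test.
        self.rng = rng or random.Random()

    def __call__(self, audio):
        # Skip the whole group of transforms with probability 1 - p.
        if self.rng.random() >= self.p:
            return audio
        for transform in self.transforms:
            audio = transform(audio)
        return audio


def invert_polarity(samples):
    """Flip the sign of every sample (illustrative polarity inversion)."""
    return [-s for s in samples]


waveform = [0.1, -0.2, 0.3]
always = RandomApplySketch([invert_polarity], p=1.0)
never = RandomApplySketch([invert_polarity], p=0.0)
print(always(waveform))  # [-0.1, 0.2, -0.3]
print(never(waveform))   # [0.1, -0.2, 0.3]
```

With intermediate values of p, each call makes an independent random decision, so the same input can yield different outputs across calls.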
We can return either one or many versions of the same audio example:
# Compose returns a single augmented version of the input.
transform = Compose(transforms=transforms)
transformed_audio = transform(audio)
# transformed_audio.shape[0] == 1

# ComposeMany returns several augmented versions of the same input.
audio, sr = torchaudio.load("tests/classical.00002.wav")
transform = ComposeMany(transforms=transforms, num_augmented_samples=4)
transformed_audio = transform(audio)
# transformed_audio.shape[0] == 4
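The difference between one and many augmented versions can be sketched without any dependencies. This is an assumption-laden illustration, not the library's code: `ComposeSketch`, `ComposeManySketch`, and `random_gain` are hypothetical names, waveforms are plain lists, and the sketch returns a list of views, whereas the example above suggests the library returns a single tensor whose first dimension is `num_augmented_samples`.

```python
import random


class ComposeSketch:
    """Hypothetical stand-in for Compose: apply transforms one after another."""

    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, audio):
        for transform in self.transforms:
            audio = transform(audio)
        return audio


class ComposeManySketch:
    """Hypothetical stand-in for ComposeMany: run the pipeline several times."""

    def __init__(self, transforms, num_augmented_samples):
        self.pipeline = ComposeSketch(transforms)
        self.num_augmented_samples = num_augmented_samples

    def __call__(self, audio):
        # Each pass re-draws the random choices made inside the transforms.
        return [self.pipeline(audio) for _ in range(self.num_augmented_samples)]


rng = random.Random(0)  # seeded so repeated runs behave the same


def random_gain(samples):
    """Scale the waveform by a random factor (illustrative stochastic transform)."""
    gain = rng.uniform(0.5, 1.0)
    return [gain * s for s in samples]


waveform = [0.5, -0.25, 0.125]
views = ComposeManySketch([random_gain], num_augmented_samples=4)(waveform)
print(len(views))  # 4
```

Producing several independently augmented views of one example is the pattern contrastive objectives rely on, since the views form positive pairs.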
Similar to the torchvision.datasets interface, an instance of the Compose or ComposeMany class can be supplied to torchaudio dataloaders that accept a transform= argument.
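As a rough illustration of the transform= convention, the sketch below shows a map-style dataset that applies an optional callable to each item on access. `WaveformDatasetSketch` and `halve` are hypothetical names, and a plain Python class with lists stands in for a `torch.utils.data.Dataset` holding tensors. In practice the callable would be a Compose or ComposeMany instance.

```python
class WaveformDatasetSketch:
    """Hypothetical map-style dataset that applies `transform` to each waveform."""

    def __init__(self, waveforms, labels, transform=None):
        assert len(waveforms) == len(labels)
        self.waveforms = waveforms
        self.labels = labels
        self.transform = transform

    def __len__(self):
        return len(self.waveforms)

    def __getitem__(self, index):
        audio = self.waveforms[index]
        # Augmentation happens lazily, every time an item is fetched.
        if self.transform is not None:
            audio = self.transform(audio)
        return audio, self.labels[index]


def halve(samples):
    """Illustrative deterministic transform: scale every sample by 0.5."""
    return [0.5 * s for s in samples]


dataset = WaveformDatasetSketch(
    waveforms=[[0.5, -0.25], [1.0, 0.0]],
    labels=["classical", "jazz"],
    transform=halve,
)
print(dataset[0])  # ([0.25, -0.125], 'classical')
```

Because the transform runs inside `__getitem__`, a stochastic pipeline yields a fresh augmentation each time the same index is read.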
Optional
Install WavAugment for reverberation / pitch shifting:
pip install git+https://github.com/facebookresearch/WavAugment
Cite
You can cite this work with the following BibTeX:
@misc{spijkervet_torchaudio_augmentations,
doi = {10.5281/ZENODO.4748582},
url = {https://zenodo.org/record/4748582},
author = {Spijkervet, Janne},
title = {Spijkervet/torchaudio-augmentations},
publisher = {Zenodo},
year = {2021},
copyright = {MIT License}
}