Audio augmentations library, for audio in the time-domain.
Audio Augmentations
Audio augmentations library for PyTorch for audio in the time-domain, with support for stochastic data augmentations as used often in self-supervised / contrastive learning.
Usage
We can define several audio augmentations, which will be applied sequentially to a raw audio waveform:
import torchaudio
from audio_augmentations import *

# Load the waveform together with its sample rate.
audio, sr = torchaudio.load("tests/classical.00002.wav")
num_samples = sr * 5  # crop length: 5 seconds of audio
transforms = [
    RandomResizedCrop(n_samples=num_samples),
    RandomApply([PolarityInversion()], p=0.8),
    RandomApply([Noise(min_snr=0.3, max_snr=0.5)], p=0.3),
    RandomApply([Gain()], p=0.2),
    RandomApply([HighLowPass(sample_rate=sr)], p=0.8),
    RandomApply([Delay(sample_rate=sr)], p=0.5),
    RandomApply([PitchShift(
        n_samples=num_samples,
        sample_rate=sr
    )], p=0.4),
    RandomApply([Reverb(sample_rate=sr)], p=0.3)
]
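To make the role of RandomApply in the list above more concrete, here is a minimal, library-independent sketch of stochastic application. It is not the package's implementation: `random_apply` and `polarity_inversion` are hypothetical helpers, the "apply everything with probability p, otherwise pass the input through" behaviour is an assumption, and the waveform is a plain list of floats rather than a tensor.

```python
import random


def polarity_inversion(waveform):
    """Flip the sign of every sample (hypothetical stand-in for PolarityInversion)."""
    return [-x for x in waveform]


def random_apply(transforms, p, rng=random):
    """Build a callable that runs `transforms` in order with probability `p`.

    With probability 1 - p the input is returned unchanged (assumed semantics).
    """
    def apply(waveform):
        # rng.random() lies in [0, 1), so p=1.0 always applies and p=0.0 never does.
        if rng.random() >= p:
            return waveform
        for transform in transforms:
            waveform = transform(waveform)
        return waveform
    return apply


waveform = [0.1, -0.2, 0.3]
always = random_apply([polarity_inversion], p=1.0)
never = random_apply([polarity_inversion], p=0.0)
print(always(waveform))  # [-0.1, 0.2, -0.3]
print(never(waveform))   # [0.1, -0.2, 0.3]
```

Under this reading, a value such as p=0.8 would mean that roughly four out of five calls invert the polarity and the rest leave the audio untouched.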
We can return either one or many versions of the same audio example:
# One augmented version of the example.
transform = Compose(transforms=transforms)
transformed_audio = transform(audio)
# transformed_audio.shape[0] == 1

# Four augmented versions of the same example.
audio, sr = torchaudio.load("tests/classical.00002.wav")
transform = ComposeMany(transforms=transforms, num_augmented_samples=4)
transformed_audio = transform(audio)
# transformed_audio.shape[0] == 4
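The difference between returning one version and many versions can be sketched without the library. The helpers below (`compose`, `compose_many`, `random_gain`) are hypothetical and operate on plain lists of floats; they only illustrate the idea of running the same stochastic chain once versus several times, and make no claim about how Compose and ComposeMany are actually implemented.

```python
import random


def compose(transforms):
    """Chain transforms so that each one receives the previous one's output."""
    def apply(waveform):
        for transform in transforms:
            waveform = transform(waveform)
        return waveform
    return apply


def compose_many(transforms, num_augmented_samples):
    """Run the same chain several times and collect one result per run."""
    chain = compose(transforms)

    def apply(waveform):
        return [chain(waveform) for _ in range(num_augmented_samples)]
    return apply


# A single seeded generator is shared across calls so that every call
# draws a fresh gain value while the example stays reproducible.
_rng = random.Random(0)


def random_gain(waveform):
    """Scale the waveform by a random factor (illustrative transform)."""
    gain = _rng.uniform(0.5, 1.0)
    return [gain * x for x in waveform]


waveform = [0.1, -0.2, 0.3]
views = compose_many([random_gain], num_augmented_samples=4)(waveform)
print(len(views))  # 4
```

Because each run draws its own random parameters, the collected versions generally differ from one another, which is what makes them usable as multiple views of one example in contrastive learning.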
Similar to the torchvision.datasets interface, an instance of the Compose or ComposeMany class can be supplied to any torchaudio dataloader that accepts a transform= argument.
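As a rough illustration of the transform= convention, the sketch below defines a toy dataset that applies an optional callable to each waveform on access. `ToyAudioDataset` and `invert` are hypothetical names introduced only for this example; it uses plain Python lists and does not depend on torchaudio or on this package.

```python
class ToyAudioDataset:
    """Hypothetical dataset holding (waveform, label) pairs and an optional transform."""

    def __init__(self, items, transform=None):
        self.items = items
        self.transform = transform

    def __len__(self):
        return len(self.items)

    def __getitem__(self, index):
        waveform, label = self.items[index]
        # The transform is applied lazily, each time an item is read.
        if self.transform is not None:
            waveform = self.transform(waveform)
        return waveform, label


def invert(waveform):
    """Illustrative transform: flip the sign of every sample."""
    return [-x for x in waveform]


dataset = ToyAudioDataset([([0.5, -0.5], "classical")], transform=invert)
print(dataset[0])  # ([-0.5, 0.5], 'classical')
```

In the same spirit, a Compose or ComposeMany instance would take the place of `invert`, assuming it is callable on a waveform as in the usage shown earlier.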
Optional
Install WavAugment for reverberation / pitch shifting:
pip install git+https://github.com/facebookresearch/WavAugment
Hashes for audio-augmentations-0.1.3.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 6f134f32d109c38570b47fb7f0e10646178efbb8df5fa6ad267d66eb99376b70
MD5 | 841f2e6491fa35ed40e83dc235f92363
BLAKE2b-256 | ef1309c5ffdf2c7eae92ad5718fb4d1c4e7fbcbae12dd1ce5b28a9d84474b7ff
Hashes for audio_augmentations-0.1.3-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 2c048029c2d565a0b54d0a1bd33df39aab29cee22eb6634353a94a1e2f734b0e
MD5 | 93d7032cbaa6d6c43561cd918ecb61da
BLAKE2b-256 | b89926d6faea82b1465faa3a630ba6de5fcf9639a122810b800c6816fe5090c5