Audio augmentations library, for audio in the time-domain.

Audio Augmentations

An audio augmentations library for PyTorch that operates on audio in the time domain, with support for the stochastic data augmentations often used in self-supervised / contrastive learning.

Usage

We can define several audio augmentations, which will be applied sequentially to a raw audio waveform:

import torchaudio
from audio_augmentations import *

audio, sr = torchaudio.load("tests/classical.00002.wav")

num_samples = sr * 5
transforms = [
    RandomResizedCrop(n_samples=num_samples),
    RandomApply([PolarityInversion()], p=0.8),
    RandomApply([Noise(min_snr=0.3, max_snr=0.5)], p=0.3),
    RandomApply([Gain()], p=0.2),
    RandomApply([HighLowPass(sample_rate=sr)], p=0.8),
    RandomApply([Delay(sample_rate=sr)], p=0.5),
    RandomApply([PitchShift(
        n_samples=num_samples,
        sample_rate=sr
    )], p=0.4),
    RandomApply([Reverb(sample_rate=sr)], p=0.3)
]
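The role of the p argument above can be illustrated with a minimal, dependency-free sketch. RandomApplySketch and apply_sequentially are hypothetical stand-ins written for this illustration, not the library's classes, and plain Python lists stand in for waveform tensors. The sketch assumes the wrapped transforms run as a group with probability p and are skipped otherwise.

```python
import random


class RandomApplySketch:
    """Hypothetical stand-in: run `transforms` in order with probability p."""

    def __init__(self, transforms, p=0.5, rng=None):
        self.transforms = transforms
        self.p = p
        self.rng = rng or random.Random()

    def __call__(self, x):
        # rng.random() lies in [0, 1), so p=1.0 always applies and p=0.0 never does.
        if self.rng.random() >= self.p:
            return x
        for t in self.transforms:
            x = t(x)
        return x


def invert_polarity(samples):
    """Toy augmentation: flip the sign of every sample."""
    return [-s for s in samples]


def apply_sequentially(transforms, x):
    """Feed the output of each transform into the next one."""
    for t in transforms:
        x = t(x)
    return x


waveform = [0.1, -0.2, 0.3]
always = RandomApplySketch([invert_polarity], p=1.0)
never = RandomApplySketch([invert_polarity], p=0.0)
out_always = apply_sequentially([always], waveform)
out_never = apply_sequentially([never], waveform)
```

With p between 0 and 1, each call makes a fresh random decision, so the same clip can come out differently on every pass.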

We can return either one or many versions of the same audio example:

transform = Compose(transforms=transforms)
transformed_audio = transform(audio)
# transformed_audio.shape[0] == 1

# torchaudio.load returns a (waveform, sample_rate) tuple
audio, sr = torchaudio.load("tests/classical.00002.wav")
transform = ComposeMany(transforms=transforms, num_augmented_samples=4)
transformed_audio = transform(audio)
# transformed_audio.shape[0] == 4
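The idea of returning several versions of one example can be sketched without any dependencies. This is only an illustration under the assumption that each version is an independent pass of the same input through a stochastic pipeline; make_views and random_gain are hypothetical names, and lists stand in for tensors.

```python
import random

rng = random.Random(0)  # seeded so the sketch is reproducible


def random_gain(samples):
    """Toy stochastic augmentation: scale the clip by a random factor."""
    g = rng.uniform(0.5, 1.0)
    return [g * s for s in samples]


def make_views(transform, x, num_views):
    """Call a stochastic transform num_views times on the same input."""
    return [transform(x) for _ in range(num_views)]


waveform = [0.1, -0.2, 0.3]
views = make_views(random_gain, waveform, num_views=4)
```

Because every call draws new random parameters, the resulting views differ from one another, which is what contrastive objectives rely on.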

Similar to the torchvision.datasets interface, an instance of the Compose or ComposeMany class can be supplied to any torchaudio-style dataset that accepts a transform= argument.
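A minimal sketch of the transform= pattern is shown here. WaveformDataset is a hypothetical map-style dataset (it only defines __len__ and __getitem__) and is not part of torchaudio or this library; any callable, including a composed augmentation pipeline, could be passed where the toy invert_polarity function is used.

```python
class WaveformDataset:
    """Hypothetical map-style dataset that accepts an optional transform=."""

    def __init__(self, waveforms, labels, transform=None):
        self.waveforms = waveforms
        self.labels = labels
        self.transform = transform

    def __len__(self):
        return len(self.waveforms)

    def __getitem__(self, idx):
        x = self.waveforms[idx]
        # Augment lazily, each time an item is fetched.
        if self.transform is not None:
            x = self.transform(x)
        return x, self.labels[idx]


def invert_polarity(samples):
    """Toy augmentation standing in for a composed pipeline."""
    return [-s for s in samples]


dataset = WaveformDataset(
    waveforms=[[0.1, -0.2], [0.3, 0.4]],
    labels=[0, 1],
    transform=invert_polarity,
)
first_clip, first_label = dataset[0]
```

Applying the transform inside __getitem__ means a stochastic pipeline yields a different augmentation of the same clip on every epoch.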

Optional

Install WavAugment for reverberation / pitch shifting:

pip install git+https://github.com/facebookresearch/WavAugment

Download files

Source Distribution

audio-augmentations-0.1.3.tar.gz (8.2 kB)

Uploaded Source

Built Distribution

audio_augmentations-0.1.3-py3-none-any.whl (9.2 kB)

Uploaded Python 3

File details

Details for the file audio-augmentations-0.1.3.tar.gz.

File metadata

  • Download URL: audio-augmentations-0.1.3.tar.gz
  • Upload date:
  • Size: 8.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.7.0 requests/2.24.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.50.0 CPython/3.8.6

File hashes

Hashes for audio-augmentations-0.1.3.tar.gz
Algorithm Hash digest
SHA256 6f134f32d109c38570b47fb7f0e10646178efbb8df5fa6ad267d66eb99376b70
MD5 841f2e6491fa35ed40e83dc235f92363
BLAKE2b-256 ef1309c5ffdf2c7eae92ad5718fb4d1c4e7fbcbae12dd1ce5b28a9d84474b7ff

File details

Details for the file audio_augmentations-0.1.3-py3-none-any.whl.

File metadata

  • Download URL: audio_augmentations-0.1.3-py3-none-any.whl
  • Upload date:
  • Size: 9.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.7.0 requests/2.24.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.50.0 CPython/3.8.6

File hashes

Hashes for audio_augmentations-0.1.3-py3-none-any.whl
Algorithm Hash digest
SHA256 2c048029c2d565a0b54d0a1bd33df39aab29cee22eb6634353a94a1e2f734b0e
MD5 93d7032cbaa6d6c43561cd918ecb61da
BLAKE2b-256 b89926d6faea82b1465faa3a630ba6de5fcf9639a122810b800c6816fe5090c5
