Audiomentations

A Python library for audio data augmentation. Inspired by albumentations. Useful for machine learning.

Setup

pip install audiomentations

Usage example

from audiomentations import Compose, AddGaussianNoise, TimeStretch, PitchShift, Shift
import numpy as np

SAMPLE_RATE = 16000

# Chain several transforms; each one is applied with its own probability p
augmenter = Compose([
    AddGaussianNoise(min_amplitude=0.001, max_amplitude=0.015, p=0.5),
    TimeStretch(min_rate=0.8, max_rate=1.25, p=0.5),
    PitchShift(min_semitones=-4, max_semitones=4, p=0.5),
    Shift(min_fraction=-0.5, max_fraction=0.5, p=0.5),
])

# Placeholder input: 20 samples of silence stored as 32-bit floats
samples = np.zeros((20,), dtype=np.float32)
samples = augmenter(samples=samples, sample_rate=SAMPLE_RATE)

Go to audiomentations/augmentations/transforms.py to see which transforms you can apply.

Version history

v0.10.1 (2020-07-27)

  • Improve the performance of AddBackgroundNoise and AddShortNoises by optimizing the implementation of calculate_rms.
  • Improve compatibility of output files written by the demo script. Thanks to xwJohn.
  • Fix division by zero bug in Normalize. Thanks to ZFTurbo.
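
For context, the root mean square (RMS) of a signal is the square root of its mean squared sample value. The sketch below is a minimal NumPy illustration of that definition only. It is not the library's calculate_rms implementation, and the name rms_sketch is hypothetical.

```python
import numpy as np


def rms_sketch(samples: np.ndarray) -> float:
    """Return the root mean square of a 1-D array of audio samples.

    Hypothetical helper for illustration; not the library's calculate_rms.
    """
    # Work in float64 so squaring small float32 values loses less precision
    as_float = samples.astype(np.float64)
    return float(np.sqrt(np.mean(np.square(as_float))))
```

A noise-mixing transform could use such a value to scale a noise clip relative to the input signal, although how the library does this is not described here.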

v0.10.0 (2020-05-05)

  • Breaking change: AddImpulseResponse, AddBackgroundNoise and AddShortNoises now include subfolders when searching for files. This is useful when your sound files are organized in subfolders.
  • AddImpulseResponse, AddBackgroundNoise and AddShortNoises now support aiff files in addition to flac, mp3, ogg and wav.
  • Fix filter instability bug in FrequencyMask. Thanks to kvilouras.

v0.9.0 (2020-02-20)

  • Disregard non-audio files when looking for impulse response files
  • Remember randomized/chosen effect parameters. This allows for freezing the parameters and applying the same effect to multiple sounds. Use transform.freeze_parameters() and transform.unfreeze_parameters() for this.
  • Fix a bug in ClippingDistortion where the min_percentile_threshold was not respected as expected.
  • Implement transform.serialize_parameters(). Useful for when you want to store metadata on how a sound was perturbed.
  • Switch to a faster convolve implementation. This makes AddImpulseResponse significantly faster.
  • Add a rollover parameter to Shift. This allows for introducing silence instead of a wrapped part of the sound.
  • Expand supported range of librosa versions
  • Add support for flac in AddImpulseResponse
  • Implement AddBackgroundNoise transform. Useful for when you want to add background noise to all of your sound. You need to give it a folder of background noises to choose from.
  • Implement AddShortNoises. Useful for when you want to add (bursts of) short noise sounds to your input audio.
  • Improve handling of empty input
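
The idea behind freeze_parameters(), unfreeze_parameters() and serialize_parameters() can be illustrated with a toy transform. The sketch below is a stand-in written for this README, not code from audiomentations: the class ToyGain, its attributes and its seed argument are all hypothetical, and only the three method names come from the release notes above.

```python
import numpy as np


class ToyGain:
    """Toy stand-in for a transform; NOT part of audiomentations."""

    def __init__(self, min_gain=0.5, max_gain=2.0, seed=None):
        self.min_gain = min_gain
        self.max_gain = max_gain
        self.rng = np.random.default_rng(seed)
        self.parameters = {"gain": None}
        self.are_parameters_frozen = False

    def freeze_parameters(self):
        # Keep the parameters chosen on the most recent call
        self.are_parameters_frozen = True

    def unfreeze_parameters(self):
        # Go back to drawing fresh random parameters on every call
        self.are_parameters_frozen = False

    def serialize_parameters(self):
        # Return a plain dict that can be stored as metadata
        return dict(self.parameters)

    def __call__(self, samples, sample_rate):
        # Draw a new gain unless a previously drawn one has been frozen
        if not self.are_parameters_frozen or self.parameters["gain"] is None:
            gain = self.rng.uniform(self.min_gain, self.max_gain)
            self.parameters["gain"] = float(gain)
        return samples * self.parameters["gain"]


transform = ToyGain(seed=0)
left = np.ones(4, dtype=np.float32)
right = np.full(4, 2.0, dtype=np.float32)

out_left = transform(left, sample_rate=16000)    # draws a random gain
transform.freeze_parameters()
out_right = transform(right, sample_rate=16000)  # reuses the same gain
metadata = transform.serialize_parameters()
transform.unfreeze_parameters()
```

Freezing after the first call is what makes both sounds receive the same perturbation, which is useful for paired data such as the two channels of a stereo recording.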

v0.8.0 (2020-01-28)

  • Add shuffle parameter in Compose
  • Add Resample transformation
  • Add ClippingDistortion transformation
  • Add SmoothFadeTimeMask as alternative to TimeMask

Thanks to askskro

v0.7.0 (2020-01-14)

Add new transforms:

  • AddImpulseResponse
  • FrequencyMask
  • TimeMask
  • AddGaussianSNR

Thanks to karpnv

v0.6.0 (2019-05-27)

  • Implement peak normalization
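
Peak normalization scales a signal so that its largest absolute sample value becomes 1. The sketch below is one minimal way to write it in NumPy, including a guard for silent input of the kind the division by zero fix in v0.10.1 refers to. It is an illustration under those assumptions, not the library's Normalize transform, and the name peak_normalize_sketch is hypothetical.

```python
import numpy as np


def peak_normalize_sketch(samples: np.ndarray) -> np.ndarray:
    """Scale samples so that the largest absolute value becomes 1.0.

    Hypothetical helper for illustration; not the library's Normalize.
    """
    peak = np.max(np.abs(samples)) if samples.size else 0.0
    if peak == 0.0:
        # Silent or empty input: return it unchanged to avoid dividing by zero
        return samples
    return samples / peak
```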

v0.5.0 (2019-02-23)

  • Implement Shift transform
  • Ensure p is within bounds
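
A time shift moves the samples forwards or backwards by a fraction of the total length. With rollover (the parameter added in v0.9.0) the part pushed past one end wraps around to the other; without it, that part is replaced by silence. The sketch below shows this behaviour with np.roll. It is an assumption about how such a shift can be written, not the library's Shift code, and the name shift_sketch is hypothetical.

```python
import numpy as np


def shift_sketch(samples: np.ndarray, fraction: float, rollover: bool = True) -> np.ndarray:
    """Shift samples forwards (fraction > 0) or backwards (fraction < 0).

    Hypothetical helper for illustration; not the library's Shift.
    """
    num_places = int(round(fraction * len(samples)))
    # np.roll returns a new array, so the input is left untouched
    shifted = np.roll(samples, num_places)
    if not rollover:
        if num_places > 0:
            # Silence the samples that wrapped around to the start
            shifted[:num_places] = 0.0
        elif num_places < 0:
            # Silence the samples that wrapped around to the end
            shifted[num_places:] = 0.0
    return shifted
```

For a four-sample input [1, 2, 3, 4] and fraction=0.5, the wrapped result is [3, 4, 1, 2], while rollover=False gives [0, 0, 1, 2].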

v0.4.0 (2019-02-19)

  • Implement PitchShift transform
  • Fix output dtype of AddGaussianNoise

v0.3.0 (2019-02-19)

  • Implement leave_length_unchanged in TimeStretch

v0.2.0 (2019-02-18)

  • Add TimeStretch transform
  • Parametrize AddGaussianNoise

v0.1.0 (2019-02-15)

Initial release. Includes only one transform: AddGaussianNoise

Development

Install the dependencies specified in requirements.txt

Code style

Format the code with black

Run tests and measure code coverage

pytest

Generate demo sounds for empirical evaluation

python -m demo.demo

Alternatives
