An extensible data augmentation package for creating complex transformation pipelines for audio signals.

What is data augmentation?

Data augmentation is the creation of artificial data from original data, typically by applying one or more transformations to it. It is a common method for improving the versatility of machine learning models, and for providing more training examples when a dataset is of limited size.

With image data, for example, it is common to use horizontal and vertical flipping, random cropping, zooming and additive noise for augmentation. With audio, we can use other transformations such as pitch shifting, time stretching, or fading the signal in or out. Some image augmentation methods, such as additive noise, also transfer to audio data.
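
To make the idea concrete, here is a minimal NumPy sketch of two of the audio transformations mentioned above: additive Gaussian white noise and a linear fade-in. The function names and parameters are illustrative assumptions and are not Sigment's implementation.

```python
import numpy as np

def add_gaussian_white_noise(x, scale, rng):
    """Return x plus zero-mean Gaussian noise with standard deviation `scale`."""
    return x + rng.normal(loc=0.0, scale=scale, size=x.shape)

def linear_fade_in(x, fade_fraction):
    """Ramp the first `fade_fraction` of the samples linearly from silence to full volume."""
    n_samples = x.shape[-1]
    n_fade = int(fade_fraction * n_samples)
    envelope = np.ones(n_samples)
    envelope[:n_fade] = np.linspace(0.0, 1.0, n_fade)
    # Broadcasting applies the same envelope to every channel
    return x * envelope

# A synthetic one-second stereo signal with shape (channels, samples)
rng = np.random.default_rng(0)
sr = 22050
t = np.arange(sr) / sr
signal = np.stack([np.sin(2 * np.pi * 440 * t), np.sin(2 * np.pi * 660 * t)])

# One augmented copy: add noise, then fade in over the first 10% of the signal
augmented = linear_fade_in(add_gaussian_white_noise(signal, 0.005, rng), 0.1)
```

Each call with a different random generator state gives a different augmented copy of the same original signal.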

Supported augmentation methods

Sigment currently provides the following augmentation methods for both mono and stereo signals. More information about each can be found in the documentation:

  • Additive Gaussian White Noise
  • Time Stretching and Pitch Shifting
  • Edge Cropping and Random Cropping
  • Linear Fading In/Out
  • Normalization, Pre-Emphasis and Loudest Section Extraction
  • Median Filtering
  • Clipping Distortion and Reverb

It is also possible to design custom augmentation methods using a simple Transform base class provided by Sigment.
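
As a rough illustration of what a custom augmentation method could look like, the sketch below defines a hypothetical stand-in base class and a subclass that inverts the polarity of a signal. The class layout, the `_transform` hook, and the `p` and `random_state` parameters are assumptions made for this example; consult the Sigment documentation for the actual interface of its Transform base class.

```python
import numpy as np

class Transform:
    """Hypothetical stand-in for a transform base class (not Sigment's actual class)."""

    def __init__(self, p=1.0, random_state=None):
        if not 0.0 <= p <= 1.0:
            raise ValueError("p must be a probability in [0, 1]")
        self.p = p
        self.rng = np.random.default_rng(random_state)

    def __call__(self, X, sr=None):
        # Apply the transformation with probability p, otherwise pass X through
        if self.rng.random() < self.p:
            return self._transform(X, sr)
        return X

    def _transform(self, X, sr):
        raise NotImplementedError("Subclasses implement the actual transformation")

class InvertPolarity(Transform):
    """Illustrative custom transform: flip the sign of every sample."""

    def _transform(self, X, sr):
        return -X

X = np.array([[0.1, -0.2, 0.3], [0.0, 0.5, -0.5]])
always = InvertPolarity(p=1.0, random_state=0)(X)  # always applied
never = InvertPolarity(p=0.0, random_state=0)(X)   # never applied
```

Keeping the probability check in the base class means a subclass only has to describe the transformation itself.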


Suppose we have the following stereo signal from audio.wav:


We can apply a pipeline of transformations to the signal to produce multiple augmented copies of it:


The following code builds the augmentation pipeline that produces these signals:

from librosa import load
from sigment import *

# Load the stereo WAV audio file
X, sr = load('audio.wav', mono=False)

# Create a complex augmentation pipeline
transform = Pipeline([
    GaussianWhiteNoise(scale=(0.001, 0.0075), p=0.65),
    ExtractLoudestSection(duration=(0.85, 0.95)),
    # Either crop random sections, or crop one or both edges
    OneOf([
        RandomCrop(crop_size=(0.01, 0.04), n_crops=(2, 5)),
        SomeOf([
            EdgeCrop('start', crop_size=(0.05, 0.1)),
            EdgeCrop('end', crop_size=(0.05, 0.1))
        ], n=(1, 2))
    ]),
    # Half of the time, fade the signal in, out, or both
    Sometimes([
        SomeOf([
            LinearFade('in', fade_size=(0.1, 0.2)),
            LinearFade('out', fade_size=(0.1, 0.2))
        ], n=(1, 2))
    ], p=0.5),
    TimeStretch(rate=(0.8, 1.2)),
    PitchShift(n_steps=(-0.25, 0.25)),
    MedianFilter(window_size=(5, 10), p=0.5)
])

# Generate 25 augmentations of the signal X
transform.generate(X, n=25, sr=sr)

Note: The full code for this example can be found in the notebook here.


To install Sigment from PyPI, you can use pip:

pip install sigment


Sigment provides two main components that can be used to construct augmentation pipelines:

  • Transforms (sigment.transforms): Used to apply a specific type of transformation to the audio data.

  • Quantifiers (sigment.quantifiers): Used to specify rules for how a sequence of transformations or nested quantifiers should be applied to augment the audio data.

Read the documentation and example notebooks for more information about the usage of both.
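
To illustrate the idea behind quantifiers, the sketch below implements three hypothetical rules in plain Python: apply a sequence of steps with some probability, apply exactly one step chosen at random, and apply a random subset of steps. The class names, the `seed` parameter, and the choice to keep selected steps in their listed order are assumptions for this example and do not reproduce Sigment's implementation.

```python
import random

class Sometimes:
    """Apply all steps in order with probability p, otherwise do nothing."""
    def __init__(self, steps, p, seed=None):
        self.steps, self.p = steps, p
        self.rng = random.Random(seed)

    def __call__(self, x):
        if self.rng.random() < self.p:
            for step in self.steps:
                x = step(x)
        return x

class OneOf:
    """Apply exactly one step, chosen uniformly at random."""
    def __init__(self, steps, seed=None):
        self.steps = steps
        self.rng = random.Random(seed)

    def __call__(self, x):
        return self.rng.choice(self.steps)(x)

class SomeOf:
    """Apply between n[0] and n[1] of the steps, keeping their listed order."""
    def __init__(self, steps, n, seed=None):
        self.steps, self.n = steps, n
        self.rng = random.Random(seed)

    def __call__(self, x):
        k = self.rng.randint(*self.n)
        chosen = sorted(self.rng.sample(range(len(self.steps)), k))
        for i in chosen:
            x = self.steps[i](x)
        return x

# Toy "transforms" acting on a list of samples
def double(x):
    return [2 * v for v in x]

def negate(x):
    return [-v for v in x]

# Quantifiers are callables too, so they can be nested like transforms
rule = Sometimes([OneOf([double, negate], seed=1)], p=1.0, seed=2)
result = rule([1.0, 2.0])
```

Because each rule accepts and returns a signal, rules and transformations can be combined freely into larger pipelines.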


Sigment offers a familiar interface for transformations, taking inspiration from some other well-written augmentation libraries. Without the following libraries, the capabilities of Sigment would be very limited:


All contributions to this repository are greatly appreciated. Contribution guidelines can be found here.

Edwin Onuonga

Sigment © 2019-2021, Edwin Onuonga - Released under the MIT License.
Authored and maintained by Edwin Onuonga.
