Deep learning with PyTorch and audio


If you are interested in PyTorch and audio, you should also check out the efforts to integrate more audio directly into PyTorch.


audtorch supports Python 3.5 or higher. To install it, run (preferably in a virtual environment):

pip install audtorch


audtorch automates the data iteration process for deep neural network training using PyTorch. It provides a set of feature extraction transforms that can be applied on the fly on the CPU.
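To illustrate what applying transforms on the fly means, here is a minimal, library-independent sketch in plain Python. The names `random_crop`, `halve`, `compose`, and `TinyDataset` are hypothetical and are not part of audtorch. The point is only that the transform runs each time an item is accessed, so nothing is precomputed or stored.

```python
import random


def random_crop(size, rng):
    """Return a transform that cuts a random window of `size` samples."""
    def transform(signal):
        if len(signal) <= size:
            return list(signal)
        start = rng.randrange(len(signal) - size + 1)
        return signal[start:start + size]
    return transform


def halve(signal):
    """Scale every sample by 0.5 (stand-in for a feature transform)."""
    return [0.5 * x for x in signal]


def compose(*transforms):
    """Chain transforms so that the output of one feeds the next."""
    def transform(signal):
        for t in transforms:
            signal = t(signal)
        return signal
    return transform


class TinyDataset:
    """Hypothetical data set that applies `transform` on every access."""

    def __init__(self, signals, labels, transform=None):
        self.signals = signals
        self.labels = labels
        self.transform = transform

    def __len__(self):
        return len(self.signals)

    def __getitem__(self, index):
        signal = self.signals[index]
        if self.transform is not None:
            signal = self.transform(signal)  # computed at access time
        return signal, self.labels[index]


rng = random.Random(0)
data = TinyDataset(
    signals=[[float(i) for i in range(100)]],
    labels=['ramp'],
    transform=compose(random_crop(16, rng), halve),
)
signal, label = data[0]  # a 16-sample excerpt of the ramp, scaled by 0.5
```

Because the crop position is drawn at access time, repeated reads of the same index can return different excerpts, which is what makes this pattern useful for augmentation.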

The following example creates a data set of speech samples that are cut to a fixed length of 10240 samples. In addition, they are augmented on the fly during data loading by a transform that adds samples from another data set:

>>> import sounddevice as sd
>>> from audtorch import datasets, transforms
>>> noise = datasets.WhiteNoise(duration=10240, sampling_rate=16000)
>>> augment = transforms.Compose([transforms.RandomCrop(10240),
...                               transforms.RandomAdditiveMix(noise)])
>>> data = datasets.LibriSpeech(root='~/LibriSpeech', sets='dev-clean',
...                             download=True, transform=augment)
>>> signal, label = data[8]
>>> sd.play(signal.transpose(), data.sampling_rate)
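The idea of adding samples from another data set can be sketched without any library. The function `additive_mix` below is a hypothetical illustration that adds a randomly positioned noise excerpt at a fixed gain. It is an assumption for explanatory purposes, and audtorch's own transform may select and scale the noise differently.

```python
import random


def additive_mix(signal, noise, gain=0.1, rng=random):
    """Add a randomly positioned excerpt of `noise` to `signal`.

    Both inputs are sequences of float samples; `gain` scales the noise.
    """
    if len(noise) < len(signal):
        raise ValueError('noise must be at least as long as signal')
    # Pick a random window of the noise with the same length as the signal
    start = rng.randrange(len(noise) - len(signal) + 1)
    excerpt = noise[start:start + len(signal)]
    return [s + gain * n for s, n in zip(signal, excerpt)]


rng = random.Random(1)
speech = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]
noise = [rng.uniform(-1.0, 1.0) for _ in range(32)]
mixed = additive_mix(speech, noise, gain=0.1, rng=rng)
```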

Besides data sets and transforms, the package provides standard evaluation metrics, samplers, and the collate functions needed to train deep neural networks on audio tasks.
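Collate functions matter for audio because signals in a batch often differ in length. As a rough sketch of the idea, the hypothetical `pad_collate` below (not an audtorch function) pads every signal to the longest one in the batch and keeps the original lengths so that padding can be ignored later.

```python
def pad_collate(batch, pad_value=0.0):
    """Pad signals to the length of the longest one in the batch.

    `batch` is a list of (signal, label) pairs. Returns the padded
    signals, the original lengths, and the labels.
    """
    signals, labels = zip(*batch)
    lengths = [len(s) for s in signals]
    longest = max(lengths)
    padded = [list(s) + [pad_value] * (longest - len(s)) for s in signals]
    return padded, lengths, list(labels)


batch = [([0.1, 0.2, 0.3], 'a'), ([0.4], 'b'), ([0.5, 0.6], 'c')]
padded, lengths, labels = pad_collate(batch)
```

In a PyTorch training loop, a function of this shape would typically be passed to a data loader as its `collate_fn` argument and would return tensors rather than lists.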


All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

Version 0.4.2 (2019-11-04)

  • Fixed: critical bug of missing files in wheel package (#60)

Version 0.4.1 (2019-10-25)

  • Fixed: default axis values for Masking transforms (#59)

Version 0.4.0 (2019-10-21)

  • Added: masking transforms in time and frequency domain

Version 0.3.2 (2019-10-04)

  • Fixed: long description in setup.cfg

Version 0.3.1 (2019-10-04)

  • Changed: define package in setup.cfg

Version 0.3.0 (2019-09-13)

  • Added: datasets.SpeechCommands (#49)
  • Removed: LogSpectrogram (#52)

Version 0.2.1 (2019-08-01)

  • Changed: Remove os.system call for moving files (#43)
  • Fixed: Remove broken logos from issue templates (#31)
  • Fixed: Wrong Spectrogram output shape in documentation (#40)
  • Fixed: Broken data set loading for relative paths (#33)

Version 0.2.0 (2019-06-28)

  • Added: Standardize, Log (#29)
  • Changed: Switch to Keep a Changelog format (#34)
  • Deprecated: LogSpectrogram (#29)
  • Fixed: normalize axis (#28)

Version 0.1.1 (2019-05-23)

  • Fixed: Broken API documentation on RTD

Version 0.1.0 (2019-05-22)

  • Added: Public release
