
PyTorch-based audio source separation toolkit

Project description

Asteroid: Audio Source Separation on steroids


🚧 ⚠️ Under development ⚠️ 🚧

Asteroid is a PyTorch-based source separation and speech enhancement API that enables fast experimentation on common datasets. It comes with source code that supports a wide range of architectures and a set of recipes to reproduce results from some papers.
Asteroid is intended to be a community-based project, so hop on and help us!

Guiding principles

  • User friendliness. Asteroid's API offers simple solutions for most common use cases.
  • Modularity. Building blocks are designed to be plugged together seamlessly. Filterbanks, encoders, maskers, decoders and losses are all common building blocks that can be combined flexibly to create new systems.
  • Extensibility. Extending Asteroid with new features is simple. Add a new filterbank, separator, architecture, dataset or even recipe very easily.
  • Reproducibility. Recipes provide an easy way to reproduce results, with data preparation, training and evaluation in a single script.

Highlights

Installation

To install Asteroid, clone the repo and install it using pip or python:

git clone https://github.com/mpariente/AsSteroid
cd AsSteroid
# Install with pip (in editable mode)
pip install -e .
# Install with python
python setup.py install
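
After installing, a quick sanity check is to confirm that the package can be imported. The sketch below assumes that `python` is the same interpreter used for the installation and that the package is importable as `asteroid` (the package name shown in the codebase structure). The helper name `check_import` is hypothetical and is not part of Asteroid.

```shell
# Hypothetical helper: report whether a Python module can be imported.
# Assumes `python` is the interpreter Asteroid was installed into.
check_import() {
    if python -c "import $1" 2>/dev/null; then
        echo "$1: ok"
    else
        echo "$1: missing"
    fi
}

# Usage after installation:
#   check_import asteroid
```

If the check reports the package as missing, the installation most likely went into a different Python environment than the one `python` points to.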

Running a recipe

cd egs/wham/ConvTasNet
./run.sh

More information in egs/README.md.

Recipes

Writing your own recipe

Contributing

See our contributing guidelines.

Codebase structure

├── asteroid                 # Python package / Source code
│   ├── data                 # Data classes, DataLoaders maker.
│   ├── engine               # Training classes: losses, optimizers and trainer.
│   ├── filterbanks          # Common filterbanks and related classes.
│   ├── masknn               # Separation building blocks and architectures.
│   └── utils.py
├── examples                 # Simple asteroid examples
└── egs                      # Recipes for all datasets and systems.
    ├── wham                 # Recipes for one dataset (WHAM)
    │   ├── ConvTasNet       # ConvTasNet system on the WHAM dataset.
    │   │   └── ...          # Recipe's structure. See egs/README.md for more info
    │   ├── Your recipe      # More recipes on the same dataset (including yours)
    │   ├── ...
    │   └── DualPathRNN
    └── Your dataset         # More datasets (including yours)

Building the docs

To build the docs, you'll need Sphinx, a theme and some other packages:

# Start by installing the required packages
cd docs/
pip install -r requirements.txt
# Build the docs
make html
# View it! (Replace firefox with your favorite browser)
firefox build/html/index.html

If you rebuild the docs, don't forget to run make clean first.

You can add this alias to your .bashrc, source it, and then run run_docs from the docs/ folder:

alias run_docs='make clean; make html; firefox build/html/index.html'
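
Because the alias above chains its commands with `;`, the browser opens even when the build fails. A possible variant, sketched here as a shell function, stops at the first failing step. The `BROWSER_CMD` variable is a hypothetical addition for choosing a browser other than firefox; it is not something the project defines.

```shell
# Sketch of a stricter alternative to the run_docs alias (run it from docs/).
# BROWSER_CMD is a hypothetical override; it defaults to firefox.
run_docs() {
    # && stops the chain as soon as one step fails
    make clean && make html && "${BROWSER_CMD:-firefox}" build/html/index.html
}
```

A function and an alias cannot share the same name comfortably in one shell, so use this in place of the alias rather than alongside it.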

Why Asteroid?

Audio source separation and speech enhancement are fast-evolving fields with a growing number of papers submitted to conferences each year. While datasets such as wsj0-{2, 3}mix, WHAM or MS-SNSD are being shared, there has been little effort to create common codebases for developing and evaluating source separation and speech enhancement algorithms. Asteroid is one such codebase.

Remote TensorBoard visualization

# Launch tensorboard remotely (default port is 6006)
tensorboard --logdir exp/tmp/lightning_logs/ --port tf_port

# Open port-forwarding connection. Add the -Nf options to avoid opening a remote shell.
ssh -L local_port:localhost:tf_port user@ip

Then open http://localhost:local_port/.
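
As a concrete illustration of the placeholders above, the sketch below builds the port-forwarding command from explicit values. The port numbers (6006 on the remote machine, 16006 locally) and the helper name `tb_tunnel_cmd` are assumptions chosen for the example. The function only prints the ssh command, so it can be inspected before being run.

```shell
# Hypothetical helper: print the ssh port-forwarding command.
# Usage: tb_tunnel_cmd LOCAL_PORT TF_PORT USER@HOST
tb_tunnel_cmd() {
    # -N: do not execute a remote command; -f: go to the background
    echo "ssh -Nf -L $1:localhost:$2 $3"
}

# Example values (assumptions): TensorBoard listens on remote port 6006
# and is forwarded to local port 16006.
tb_tunnel_cmd 16006 6006 user@ip
```

With these example values, TensorBoard would be reachable at http://localhost:16006/ once the printed command has been run.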

Project details



This version

0.0.1

