Synthesis of MIDI with DDSP

MIDI-DDSP: Detailed Control of Musical Performance via Hierarchical Modeling

Demos | Blog Post | Colab Notebook | Paper | Hugging Face Spaces

MIDI-DDSP is a hierarchical audio generation model for synthesizing MIDI, built as an extension of DDSP.

Install MIDI-DDSP

You can install MIDI-DDSP via pip, which also gives you the command-line MIDI synthesis tool for synthesizing your own MIDI files.

To install MIDI-DDSP via pip, simply run:

pip install midi-ddsp

Train MIDI-DDSP

To train MIDI-DDSP, please first install midi-ddsp and clone the MIDI-DDSP repository:

git clone https://github.com/magenta/midi-ddsp.git

For the dataset, please download the tfrecord files for the URMP dataset into the data folder of your cloned repository using the following commands:

cd midi-ddsp # enter the project directory
mkdir ./data # create a data folder
gsutil cp gs://magentadata/datasets/urmp/urmp_20210324/* ./data # download tfrecords to directory

Please check the gsutil documentation for how to install and use gsutil.
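
After the download finishes, it can help to confirm that the files actually landed in the data folder before starting a long training run. The sketch below is a minimal check using only the Python standard library. The helper name `list_tfrecords` is hypothetical, and the assumption that the downloaded file names contain the substring `tfrecord` is ours, not something the repository guarantees.

```python
from pathlib import Path


def list_tfrecords(data_dir="./data"):
    """Return sorted paths of files in data_dir whose names contain 'tfrecord'.

    Assumption: the downloaded URMP files carry 'tfrecord' in their names.
    """
    root = Path(data_dir)
    if not root.is_dir():
        return []
    return sorted(p for p in root.iterdir()
                  if p.is_file() and "tfrecord" in p.name)


if __name__ == "__main__":
    files = list_tfrecords()
    print(f"Found {len(files)} tfrecord file(s) in ./data")
```

An empty result suggests the gsutil copy did not complete or went to a different directory.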

Finally, you can run the script train_midi_ddsp.sh to train the exact same model we used in the paper:

sh ./train_midi_ddsp.sh

The current codebase does not support training with arbitrary datasets, but we hope to add that in the near future.

Side note:

If you download the dataset to a different location, please change the data_dir parameter in train_midi_ddsp.sh.

Training MIDI-DDSP takes approximately 18 hours on a single RTX 8000. The training code does not currently support multi-GPU training. We recommend a GPU with more than 24 GB of memory when training the Synthesis Generator with a batch size of 16. For a GPU with less memory, please consider a smaller batch size, which you can set in train_midi_ddsp.sh.

Try to play with MIDI-DDSP yourself!

Please try out MIDI-DDSP in Colab notebooks!

In this notebook, you will use MIDI-DDSP to synthesize a monophonic MIDI file, adjust note expressions, create a pitch bend by adjusting synthesis parameters, and synthesize a quartet from Bach chorales.

We have trained MIDI-DDSP on the URMP dataset, which supports synthesizing 13 instruments: violin, viola, cello, double bass, flute, oboe, clarinet, saxophone, bassoon, trumpet, horn, trombone, tuba. You can find how to download and use our pre-trained model below:

Command-line MIDI synthesis

One can use MIDI-DDSP as a command-line MIDI synthesizer, just like FluidSynth.

To use command-line synthesis to synthesize a MIDI file, please first download the model weights by running:

midi_ddsp_download_model_weights

To synthesize a MIDI file, simply run the following command:

midi_ddsp_synthesize --midi_path <path-to-midi>

As a starting point, you can try to synthesize the example MIDI file in this repository:

midi_ddsp_synthesize --midi_path ./midi_example/ode_to_joy.mid

The command line can also synthesize a folder of MIDI files. For more advanced use (synthesizing a folder, using FluidSynth for unsupported instruments, etc.), please see synthesize_midi.py --help.

If you have trouble downloading the model weights, please download them manually from here, and specify the synthesis_generator_weight_path and expression_generator_weight_path yourself when using the command line. You can also point these at other model weights if you want to use a model you trained yourself.
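
When scripting the synthesizer from Python, it can be convenient to assemble the command as an argument list instead of a shell string. The sketch below is one way to do that with the standard library. The function names `build_synthesize_command` and `commands_for_folder` are hypothetical, and we assume the two weight-path options are passed as `--name value` flags in the same style as `--midi_path`; check the `--help` output before relying on that. The tool reportedly handles folders on its own, so the folder loop here is only an alternative for cases where you want one process per file.

```python
import shlex
from pathlib import Path


def build_synthesize_command(midi_path,
                             synthesis_generator_weight_path=None,
                             expression_generator_weight_path=None):
    """Build the argument list for one midi_ddsp_synthesize call."""
    cmd = ["midi_ddsp_synthesize", "--midi_path", str(midi_path)]
    # Assumption: the weight paths are accepted as "--name value" flags.
    if synthesis_generator_weight_path is not None:
        cmd += ["--synthesis_generator_weight_path",
                str(synthesis_generator_weight_path)]
    if expression_generator_weight_path is not None:
        cmd += ["--expression_generator_weight_path",
                str(expression_generator_weight_path)]
    return cmd


def commands_for_folder(folder, **weight_paths):
    """Build one command per .mid/.midi file found directly in folder."""
    midi_files = sorted(p for p in Path(folder).iterdir()
                        if p.suffix.lower() in (".mid", ".midi"))
    return [build_synthesize_command(p, **weight_paths) for p in midi_files]


if __name__ == "__main__":
    cmd = build_synthesize_command("./midi_example/ode_to_joy.mid")
    print(shlex.join(cmd))
    # To actually run it: subprocess.run(cmd, check=True)
```

Passing a list to `subprocess.run` avoids shell quoting problems with file names that contain spaces.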

Python Usage

After installing midi-ddsp, you can import midi_ddsp in Python and synthesize MIDI in your own code.

Minimal Example

Here is a simple example that uses MIDI-DDSP to synthesize a MIDI file:

from midi_ddsp import synthesize_midi, load_pretrained_model

midi_file = 'ode_to_joy.mid'
# Load pre-trained model
synthesis_generator, expression_generator = load_pretrained_model()
# Synthesize MIDI
output = synthesize_midi(synthesis_generator, expression_generator, midi_file)
# The synthesized audio
synthesized_audio = output['mix_audio']
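
To listen to the result, the audio needs to be written to a file. The sketch below writes a 16-bit mono WAV using only the standard library. The helper name `write_wav` is hypothetical, and two things are assumptions rather than facts from this README: that the audio is a flat sequence of floats in [-1, 1], and that the sample rate is 16 kHz. Adjust both to match your model output.

```python
import struct
import wave


def write_wav(path, samples, sample_rate=16000):
    """Write mono float samples in [-1, 1] to a 16-bit PCM WAV file.

    Assumptions: `samples` is a flat sequence of floats and the model
    output uses a 16 kHz sample rate; change sample_rate if yours differs.
    """
    pcm = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, float(s)))  # avoid integer overflow
        pcm += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(str(path), "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 16-bit samples
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(bytes(pcm))
```

With the example above this would be called as `write_wav('ode_to_joy.wav', synthesized_audio)`. If the array carries a leading batch dimension, flatten it first.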

Advanced Usage

Here is a more advanced example that synthesizes ode_to_joy.mid, changes the note expression controls, and adjusts the synthesis parameters:

import numpy as np
import tensorflow as tf
from midi_ddsp.utils.midi_synthesis_utils import synthesize_mono_midi, conditioning_df_to_audio
from midi_ddsp.utils.inference_utils import get_process_group
from midi_ddsp.midi_ddsp_synthesize import load_pretrained_model
from midi_ddsp.data_handling.instrument_name_utils import INST_NAME_TO_ID_DICT

# -----MIDI Synthesis-----
midi_file = 'ode_to_joy.mid'
# Load pre-trained model
synthesis_generator, expression_generator = load_pretrained_model()
# Synthesize with violin:
instrument_name = 'violin'
instrument_id = INST_NAME_TO_ID_DICT[instrument_name]
# Run model prediction
midi_audio, midi_control_params, midi_synth_params, conditioning_df = synthesize_mono_midi(
  synthesis_generator, expression_generator, midi_file, instrument_id, output_dir=None)

synthesized_audio = midi_audio  # The synthesized audio

# -----Adjust note expression controls and re-synthesize-----

# Give every note a weak vibrato:
conditioning_df_changed = conditioning_df.copy()
# Original per-note vibrato values, kept for reference
note_vibrato = conditioning_df_changed['vibrato_extend'].values
conditioning_df_changed['vibrato_extend'] = np.ones_like(conditioning_df['vibrato_extend'].values) * 0.1
# Re-synthesize
midi_audio_changed, midi_control_params_changed, midi_synth_params_changed = conditioning_df_to_audio(
  synthesis_generator, conditioning_df_changed, tf.constant([instrument_id]))

synthesized_audio_changed = midi_audio_changed  # The synthesized audio

# There are 6 note expression controls in conditioning_df that you could change:
# 'amplitude_mean', 'amplitude_std', 'vibrato_extend', 'brightness', 'attack_level', 'amplitudes_max_pos'.
# Please refer to https://colab.research.google.com/github/magenta/midi-ddsp/blob/main/midi_ddsp/colab/MIDI_DDSP_Demo.ipynb#scrollTo=XfPPrdPu5sSy for the effect of each control. 

# -----Adjust synthesis parameters and re-synthesize-----

# The original synthesis parameters:
f0_ori = midi_synth_params['f0_hz']
amps_ori = midi_synth_params['amplitudes']
noise_ori = midi_synth_params['noise_magnitudes']
hd_ori = midi_synth_params['harmonic_distribution']

# TODO: make your change of the synthesis parameters here:
f0_changed = f0_ori
amps_changed = amps_ori
noise_changed = noise_ori
hd_changed = hd_ori

# Resynthesize the audio using DDSP
processor_group = get_process_group(midi_synth_params['amplitudes'].shape[1], use_angular_cumsum=True)
midi_audio_changed = processor_group({'amplitudes': amps_changed,
                                      'harmonic_distribution': hd_changed,
                                      'noise_magnitudes': noise_changed,
                                      'f0_hz': f0_changed, },
                                     verbose=False)
midi_audio_changed = synthesis_generator.reverb_module(midi_audio_changed, reverb_number=instrument_id, training=False)

synthesized_audio_changed = midi_audio_changed  # The synthesized audio
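
The TODO in the example above leaves the actual parameter change open. One common change is a pitch shift or pitch bend applied to the fundamental frequency: in equal temperament, shifting by s semitones multiplies the frequency by 2^(s/12). The sketch below computes such ratios in plain Python. The helper names `semitone_ratio` and `bend_ratios` are hypothetical, and how the per-frame ratios line up with the layout of `f0_hz` is an assumption you should check against your own arrays.

```python
import math


def semitone_ratio(semitones):
    """Frequency ratio for a shift of `semitones` in equal temperament."""
    return 2.0 ** (semitones / 12.0)


def bend_ratios(n_frames, start, end, semitones):
    """Per-frame ratios for a half-sine pitch bend between two frames.

    Frames outside [start, end) keep ratio 1.0; inside, the shift rises
    from 0 to `semitones` at the midpoint and falls back toward 0.
    """
    ratios = [1.0] * n_frames
    length = end - start
    for i in range(max(start, 0), min(end, n_frames)):
        phase = (i - start) / length  # 0 at start, approaching 1 at end
        ratios[i] = semitone_ratio(semitones * math.sin(math.pi * phase))
    return ratios
```

For example, `f0_changed = f0_ori * semitone_ratio(2)` would raise the whole performance by two semitones. For a bend, convert the ratio list to an array shaped so that it broadcasts against `f0_ori` along the frame axis before multiplying.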

Acknowledgment

This is not an officially supported Google product.

We would like to thank @akhaliq for creating the Hugging Face Spaces.
