The On-the-fly MIDI Data Augmentation Library!

MIDIOgre

MIDIOgre is a powerful Python library for performing data augmentations on MIDI inputs, primarily designed for machine learning models operating on symbolic music data. With MIDIOgre, you can easily generate variations of MIDI sequences to enrich your training data and improve the robustness and generalization of your models.

While inspired by existing libraries such as mdtk and miditok, MIDIOgre performs augmentation on the fly, in the style of albumentations and audiomentations. It generates randomly modified MIDI data directly in RAM, which enables extensive augmentation with minimal memory overhead.

A plot of implemented MIDIOgre augmentations.

Features

  • Comprehensive MIDI Augmentations: A wide range of transformations including pitch shifting, onset time modification, duration changes, and more
  • Easy Integration: API design follows PyTorch augmentation scheme to integrate seamlessly with machine learning workflows
  • Customizable: Flexible parameters for fine-tuning augmentations to your needs
  • Efficient: Optimized for handling large MIDI datasets
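
The PyTorch-style augmentation scheme mentioned in the feature list can be illustrated with a small, library-independent sketch. The names below (`RandomApply`, `ComposeSketch`, `transpose`) and the list-of-pitches data are hypothetical stand-ins, not MIDIOgre's actual implementation. They only show how a per-transform probability yields a fresh random variation on every call, entirely in memory.

```python
import random
from typing import Callable, List, Sequence

Pitches = List[int]  # toy stand-in for a MIDI object: a list of MIDI pitch numbers


class RandomApply:
    """Hypothetical wrapper: apply `fn` with probability `p`, else pass the input through."""

    def __init__(self, fn: Callable[[Pitches, random.Random], Pitches], p: float = 0.5):
        if not 0.0 <= p <= 1.0:
            raise ValueError("p must be in [0, 1]")
        self.fn = fn
        self.p = p

    def __call__(self, pitches: Pitches, rng: random.Random) -> Pitches:
        return self.fn(pitches, rng) if rng.random() < self.p else list(pitches)


class ComposeSketch:
    """Hypothetical pipeline: run each transform in order on an in-memory copy."""

    def __init__(self, transforms: Sequence[RandomApply], seed=None):
        self.transforms = list(transforms)
        self.rng = random.Random(seed)

    def __call__(self, pitches: Pitches) -> Pitches:
        out = list(pitches)  # never mutate the caller's data
        for t in self.transforms:
            out = t(out, self.rng)
        return out


def transpose(pitches: Pitches, rng: random.Random, max_shift: int = 3) -> Pitches:
    """Shift every pitch by one random offset in [-max_shift, max_shift], clamped to 0..127."""
    shift = rng.randint(-max_shift, max_shift)
    return [min(127, max(0, p + shift)) for p in pitches]


pipeline = ComposeSketch([RandomApply(transpose, p=0.8)], seed=0)
original = [60, 64, 67]
variants = [pipeline(original) for _ in range(5)]  # five variations, all generated in RAM
```

Because each call draws new random numbers, a training loop that calls the pipeline once per sample sees a different variation every epoch without any augmented files being written to disk.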

Installation

Prerequisites

  • Python 3.8 or higher
  • pip package manager

Install from PyPI

pip install midiogre

Using TempoShift followed by other MIDIOgre augmentations in a Compose pipeline requires the development version of pretty-midi from GitHub. TempoShift returns a mido.MidiFile object, which must be converted back to a pretty_midi.PrettyMIDI object before the next augmentation can run.

If you need this functionality, install the development version of pretty-midi:

pip install "pretty-midi @ git+https://github.com/craffel/pretty-midi"
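
To see which versions of the relevant packages are present in the current environment, a short check with the standard library's `importlib.metadata` may help. This is a minimal sketch: the helper name `installed_version` is hypothetical, and the version string alone may not reveal whether pretty-midi came from PyPI or from GitHub.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version string of a distribution, or None if it is not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Report the packages involved in the TempoShift scenario described above.
for name in ("midiogre", "pretty_midi", "mido"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```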

If you encounter any installation issues, try upgrading pip first:

pip install --upgrade pip

Install from source (for development)

# Clone the repository
git clone https://github.com/a-pillay/MIDIOgre.git
cd MIDIOgre

# Create and activate a virtual environment (optional but recommended)
python -m venv .venv
source .venv/bin/activate  # On Windows, use `.venv\Scripts\activate`

# Install in editable mode with development dependencies
pip install -e ".[dev]"

Documentation

The complete documentation for MIDIOgre is available online at https://a-pillay.github.io/MIDIOgre/. The documentation includes:

  • Detailed API reference
  • Usage examples
  • Tutorials
  • Best practices
  • Development guidelines

Quick Start

import pretty_midi
import torch

from midiogre.augmentations import PitchShift, OnsetTimeShift, NoteDelete
from midiogre.core import Compose

# Basic usage - single file augmentation
midi_data = pretty_midi.PrettyMIDI('input.mid')
transform = Compose([
    PitchShift(max_shift=3, mode='both', p=0.8),
    OnsetTimeShift(max_shift=0.1, mode='both', p=0.5)
])
augmented = transform(midi_data)
augmented.write('output.mid')

# Integration with ML pipelines
class MIDIDataset(torch.utils.data.Dataset):
    def __init__(self, midi_paths, transform=None):
        self.midi_paths = midi_paths  # List of paths to MIDI files
        self.transform = transform    # Optional augmentation pipeline

    def __len__(self):
        return len(self.midi_paths)

    def __getitem__(self, idx):
        # Replace with your own MIDI loading logic if needed
        midi_data = pretty_midi.PrettyMIDI(self.midi_paths[idx])
        return self.transform(midi_data) if self.transform else midi_data

# Define augmentation pipeline for training
transform = Compose([
    PitchShift(max_shift=3, mode='both', p=0.8),       # Randomly transpose by up to ±3 semitones
    OnsetTimeShift(max_shift=0.1, mode='both', p=0.5), # Shift note timings by up to 100ms
    NoteDelete(p=0.3)                                  # Randomly remove up to 30% of notes
])

# Use in your training pipeline (the file paths below are placeholders)
train_dataset = MIDIDataset(['train_0.mid', 'train_1.mid'], transform=transform)
val_dataset = MIDIDataset(['val_0.mid'], transform=None)  # No augmentation for validation

Available Augmentations

Currently Implemented

  • PitchShift: Transpose MIDI note values of selected instruments
  • OnsetTimeShift: Modify note onset times while preserving durations
  • DurationShift: Alter note durations while maintaining onset times
  • NoteDelete: Remove notes from instrument tracks
  • NoteAdd: Add new notes to instrument tracks
  • TempoShift: Modify the global tempo of MIDI files
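
To make the descriptions above concrete, here is a toy sketch of what several of these operations mean for individual notes. It uses a hypothetical `Note` dataclass and helper functions rather than MIDIOgre's classes or pretty_midi objects, so the names, parameters, and clamping choices are assumptions for illustration only.

```python
import random
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Note:
    """Toy note: pitch is a MIDI number (0-127), start/end are in seconds."""
    pitch: int
    start: float
    end: float


def pitch_shift(notes: List[Note], semitones: int) -> List[Note]:
    """Transpose every note, clamping to the valid MIDI pitch range."""
    return [replace(n, pitch=min(127, max(0, n.pitch + semitones))) for n in notes]


def onset_time_shift(notes: List[Note], max_shift: float, rng: random.Random) -> List[Note]:
    """Move each note in time by a random offset while keeping its duration."""
    out = []
    for n in notes:
        start = max(0.0, n.start + rng.uniform(-max_shift, max_shift))  # stay at or after time zero
        out.append(replace(n, start=start, end=start + (n.end - n.start)))
    return out


def duration_shift(notes: List[Note], max_shift: float, rng: random.Random,
                   min_len: float = 0.01) -> List[Note]:
    """Lengthen or shorten each note while keeping its onset."""
    out = []
    for n in notes:
        length = max(min_len, (n.end - n.start) + rng.uniform(-max_shift, max_shift))
        out.append(replace(n, end=n.start + length))
    return out


def note_delete(notes: List[Note], p: float, rng: random.Random) -> List[Note]:
    """Drop each note independently with probability p."""
    return [n for n in notes if rng.random() >= p]


rng = random.Random(42)
melody = [Note(60, 0.0, 0.5), Note(64, 0.5, 1.0), Note(67, 1.0, 2.0)]
transposed = pitch_shift(melody, 2)
shifted = onset_time_shift(melody, 0.1, rng)
stretched = duration_shift(melody, 0.1, rng)
thinned = note_delete(melody, 0.3, rng)
```

The key distinction is which field stays fixed: onset shifting preserves each note's length, while duration shifting preserves each note's start time.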

Planned Features

  • NoteSplit: Split notes into multiple segments
  • VelocityShift: Modify MIDI note velocities
  • Swing-based augmentations
  • MIDI CC based augmentations
  • Semantically-meaningful augmentations (respecting rhythms & beats)

Development

Please note that this project is developed with the assistance of Cursor, mostly for runtime optimizations, unit-testing, documentation and build pipelines.

Setting up for development

# Install development dependencies
pip install -r requirements-dev.txt

# Install documentation dependencies (if working on docs)
pip install -r requirements-docs.txt

Running Tests

pytest tests/
# For coverage report
pytest --cov=midiogre tests/
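
When writing tests for randomized augmentations, seeding the random number generator keeps results reproducible. The sketch below shows one possible pattern in pytest style. The function `jitter_onsets` is a hypothetical toy augmentation defined here for the example, not part of MIDIOgre or its test suite.

```python
import random
from typing import List


def jitter_onsets(onsets: List[float], max_shift: float, rng: random.Random) -> List[float]:
    """Toy augmentation under test: move each onset by up to +/- max_shift seconds."""
    return [max(0.0, t + rng.uniform(-max_shift, max_shift)) for t in onsets]


def test_same_seed_gives_same_output():
    onsets = [0.0, 0.5, 1.0]
    first = jitter_onsets(onsets, 0.1, random.Random(123))
    second = jitter_onsets(onsets, 0.1, random.Random(123))
    assert first == second


def test_shift_stays_within_bounds():
    onsets = [0.0, 0.5, 1.0]
    for seed in range(50):
        shifted = jitter_onsets(onsets, 0.1, random.Random(seed))
        assert all(abs(a - b) <= 0.1 + 1e-12 for a, b in zip(onsets, shifted))
        assert all(t >= 0.0 for t in shifted)


def test_input_is_not_mutated():
    onsets = [0.0, 0.5, 1.0]
    jitter_onsets(onsets, 0.1, random.Random(0))
    assert onsets == [0.0, 0.5, 1.0]
```

Checking properties (bounds, non-mutation) over many seeds tends to be more robust than asserting one exact random output.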

Contributing

Contributions are welcome! Here's how you can help:

  1. Fork the repository
  2. Create a new branch (git checkout -b feature/amazing-feature)
  3. Make your changes
  4. Run the tests to ensure everything works
  5. Commit your changes (git commit -m 'Add amazing feature')
  6. Push to the branch (git push origin feature/amazing-feature)
  7. Open a Pull Request

Areas where we particularly welcome contributions:

  • Comprehensive unit tests
  • Documentation improvements
  • New augmentation techniques
  • Performance optimizations
  • Bug fixes

License

This project is licensed under the MIT License - see the LICENSE file for details.

Citation

If you use MIDIOgre in your research, please cite:

@software{midiogre2024,
  author = {Pillay, A},
  title = {MIDIOgre: MIDI Data Augmentation Library},
  year = {2024},
  publisher = {GitHub},
  url = {https://github.com/a-pillay/MIDIOgre}
}

Acknowledgments

Contact

For questions, suggestions, or collaboration opportunities, please reach out via GitHub Issues.

