
Unified, inference-only toolkit for MT3 model family (Magenta MT3, MR-MT3, MT3-PyTorch, YourMT3)


MT3-Infer

Production-ready, unified inference toolkit for the MT3 music transcription model family

MT3-Infer provides a clean, framework-neutral API for running music transcription inference across multiple MT3 implementations with a single consistent interface.

Python 3.9+ · PyTorch · License: MIT · PyPI


🎉 What's New

  • v0.1.1: Fixed inclusion of YAML config files in the package distribution
  • v0.1.0: Initial release with 3 production-ready models (MR-MT3, MT3-PyTorch, YourMT3)

Features

  • ✅ Unified API: One interface for all MT3 variants
  • ✅ Production Ready: Clean, tested, ~8 MB package size
  • ✅ Auto-Download: Automatic checkpoint downloads on first use
  • ✅ 4 Download Methods: Auto, Python API, CLI, standalone script
  • ✅ 3 Models: MR-MT3, MT3-PyTorch, YourMT3
  • ✅ Framework Isolated: Clean PyTorch/TensorFlow/JAX separation
  • ✅ CLI Tool: mt3-infer command-line interface
  • ✅ Reproducible: Pinned dependencies, verified checkpoints

Quick Start

Installation

MT3-Infer is available on PyPI.

# Using pip
pip install mt3-infer

# Using UV (recommended for development)
uv pip install mt3-infer

Simple Transcription (One Line!)

from mt3_infer import transcribe

# Transcribe audio to MIDI (auto-downloads checkpoint on first use)
midi = transcribe(audio, sr=16000)
midi.save("output.mid")
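The `audio` argument above is a mono float waveform at the given sample rate. As an illustration of the expected input shape, here is a hypothetical pre-processing helper (not part of mt3-infer) that downmixes interleaved 16-bit stereo PCM to mono floats in [-1, 1]:

```python
def pcm16_stereo_to_mono_float(samples):
    """Downmix interleaved 16-bit stereo PCM to mono floats in [-1, 1].

    Hypothetical helper, shown only to illustrate the mono float
    waveform that transcribe() consumes; not part of mt3-infer.
    """
    mono = []
    for i in range(0, len(samples), 2):
        left, right = samples[i], samples[i + 1]
        # Average the two channels, then scale from int16 range to [-1, 1].
        mono.append((left + right) / 2.0 / 32768.0)
    return mono

# Two stereo frames collapse to two mono samples.
audio = pcm16_stereo_to_mono_float([16384, 16384, -32768, -32768])
print(audio)  # [0.5, -1.0]
```

In practice you would load the file with an audio library of your choice and resample to 16 kHz before calling `transcribe`.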

Model Selection

# Use MR-MT3 model (57x real-time)
midi = transcribe(audio, model="mr_mt3")

# Use MT3-PyTorch model (147 notes detected)
midi = transcribe(audio, model="mt3_pytorch")

# Use YourMT3 model (multi-stem separation)
midi = transcribe(audio, model="yourmt3")

Download Checkpoints

# Download all models at once (874MB total)
mt3-infer download --all

# Download specific models
mt3-infer download mr_mt3 mt3_pytorch

# List available models
mt3-infer list

# Transcribe audio via CLI
mt3-infer transcribe input.wav -o output.mid -m mr_mt3

Heads up: The downloader now pulls MR-MT3 weights directly from gudgud1014/MR-MT3, so you no longer need Git LFS for that model. Checkpoints are stored under .mt3_checkpoints/<model> and will be re-created automatically if you delete the directory.

Set MT3_CHECKPOINT_DIR to store checkpoints somewhere else (e.g., shared storage) before running downloads or inference:

export MT3_CHECKPOINT_DIR=/data/models/mt3

Or use .env files (requires python-dotenv):

MT3_CHECKPOINT_DIR=/data/models/mt3

When the variable is set, both the Python API and CLI (including mt3-infer download) will read/write checkpoints inside that directory, preserving the same per-model layout as .mt3_checkpoints/.
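The resolution rule can be sketched as follows; this is an illustrative re-statement of the documented behavior under an assumed function name, not the package's actual code:

```python
import os
from pathlib import Path

def resolve_checkpoint_dir(model_name: str) -> Path:
    # Sketch (function name assumed): MT3_CHECKPOINT_DIR overrides the
    # default .mt3_checkpoints/ base; the per-model layout is kept either way.
    base = Path(os.environ.get("MT3_CHECKPOINT_DIR", ".mt3_checkpoints"))
    return base / model_name

os.environ["MT3_CHECKPOINT_DIR"] = "/data/models/mt3"
print(resolve_checkpoint_dir("mr_mt3"))  # /data/models/mt3/mr_mt3
```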


Supported Models

| Model | Framework | Speed | Notes Detected | Size | Features |
|-------|-----------|-------|----------------|------|----------|
| MR-MT3 | PyTorch | 57x real-time | 116 | 176 MB | Optimized for speed |
| MT3-PyTorch | PyTorch | 12x real-time | 147 | 176 MB | Official architecture with auto-filtering* |
| YourMT3 | PyTorch + Lightning | ~15x real-time | 118 | 536 MB | 8-stem separation, Perceiver-TF + MoE |

*MT3-PyTorch includes automatic instrument leakage filtering (configurable via auto_filter parameter)

Performance benchmarks were measured on an NVIDIA RTX 4090 with PyTorch 2.7.1 and CUDA 12.6.

The default yourmt3 model downloads the YPTF.MoE+Multi (noPS) checkpoint, matching the output of the original YourMT3 Space.
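As a worked example of what the real-time factors mean: a 3-minute clip through MR-MT3 at 57x real-time needs roughly three seconds of compute.

```python
audio_seconds = 180          # a 3-minute clip
speed_x_realtime = 57        # MR-MT3's reported real-time factor
processing_seconds = audio_seconds / speed_x_realtime
print(round(processing_seconds, 1))  # 3.2
```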


Advanced Usage

Explicit Model Loading

from mt3_infer import load_model

# Load model explicitly (cached for reuse)
model = load_model("mt3_pytorch", device="cuda")
midi = model.transcribe(audio, sr=16000)

Explore Available Models

from mt3_infer import list_models, get_model_info

# List all models
models = list_models()
for name, info in models.items():
    print(f"{name}: {info['description']}")

# Get model details
info = get_model_info("mr_mt3")
print(f"Speed: {info['metadata']['performance']['speed_x_realtime']}x real-time")

Disable Auto-Download

from mt3_infer import load_model

# Raise error if checkpoint not found (don't auto-download)
model = load_model("mr_mt3", auto_download=False)

Control MT3-PyTorch Instrument Filtering

MT3-PyTorch applies automatic filtering to suppress instrument leakage into drum tracks:

# Default: filtering enabled (recommended)
model = load_model("mt3_pytorch")

# Disable filtering to see raw model output
model = load_model("mt3_pytorch", auto_filter=False)

Override Checkpoint Directory

Use a shared storage location (e.g., NAS, cache volume) without changing your code:

export MT3_CHECKPOINT_DIR=/mnt/shared/mt3
uv run python -c "from mt3_infer import download_model; download_model('yourmt3')"
uv run mt3-infer download --all

To confirm the resolved location programmatically:

from mt3_infer import download_model
path = download_model('mt3_pytorch')
print(path)

Download Programmatically

from mt3_infer import download_model

# Pre-download checkpoints before inference
download_model("mr_mt3")
download_model("mt3_pytorch")
download_model("yourmt3")

Diagnostics & Troubleshooting

Extra smoke tests and tooling live in examples/diagnostics/:

  • download_mt3_pytorch.py – walks through manual vs. automatic checkpoint download
  • test_all_models.py – loads all registered models and runs a short transcription
  • test_checkpoint_download.py – verifies checkpoints land in MT3_CHECKPOINT_DIR
  • test_yourmt3.py – full audio-to-MIDI flow for the YourMT3 MoE model

Run them via uv run python examples/diagnostics/<script>.py after setting any needed environment variables.


Installation Options

Basic Installation

pip install mt3-infer

Development Installation

# Clone repository
git clone https://github.com/openmirlab/mt3-infer.git
cd mt3-infer

# Install with UV (recommended)
uv sync --extra torch --extra dev

# Or with pip
pip install -e ".[torch,dev]"

Optional Dependencies

# PyTorch backend (default)
pip install mt3-infer[torch]

# TensorFlow backend
pip install mt3-infer[tensorflow]

# All backends
pip install mt3-infer[all]

# Development tools
pip install mt3-infer[dev]

# MIDI synthesis (optional)
pip install mt3-infer[synthesis]

CLI Tool

The mt3-infer CLI provides convenient access to all functionality:

# Download checkpoints
mt3-infer download --all                    # Download all models
mt3-infer download mr_mt3 mt3_pytorch       # Download specific models

# List available models
mt3-infer list

# Transcribe audio
mt3-infer transcribe input.wav -o output.mid
mt3-infer transcribe input.wav -m mr_mt3    # Use MR-MT3 model
mt3-infer transcribe input.wav --device cuda # Use GPU

# Show help
mt3-infer --help
mt3-infer download --help

Download Methods

MT3-Infer supports 4 flexible download methods:

1. Automatic Download (Default)

Checkpoints download automatically on first use:

midi = transcribe(audio)  # Auto-downloads if needed

2. Python API

Pre-download programmatically:

from mt3_infer import download_model
download_model("mr_mt3")

3. CLI

Download via command line:

mt3-infer download --all

4. Standalone Script

Batch download without installing package:

python tools/download_all_checkpoints.py

See the CLI section above for detailed download instructions.
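The automatic method boils down to a download-if-missing check. An illustrative sketch of that pattern (names assumed; this is not mt3-infer's actual code):

```python
import tempfile
from pathlib import Path

def ensure_checkpoint(path: Path, fetch) -> Path:
    # Lazy-download pattern: fetch() is called only when the
    # checkpoint file does not exist yet; later calls reuse the file.
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(fetch())
    return path

tmp = Path(tempfile.mkdtemp())
target = ensure_checkpoint(tmp / "mr_mt3" / "model.pt", lambda: b"weights")
print(target.read_bytes())  # b'weights'
```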


Project Status

Current Version: 0.1.3 (Production Ready!)

✅ Completed Features

  • ✅ Core infrastructure (MT3Base interface, utilities)
  • ✅ 3 production adapters (MR-MT3, MT3-PyTorch, YourMT3)
  • ✅ Public API (transcribe(), load_model())
  • ✅ Model registry with aliases
  • ✅ Checkpoint download system (4 methods)
  • ✅ CLI tool (mt3-infer)
  • ✅ Production cleanup (~8 MB package)
  • ✅ Comprehensive documentation

📦 Package Statistics

  • Source code: ~5 MB
  • Vendor dependencies: ~3 MB
  • Documentation: 284 KB
  • Total (source only): ~8 MB
  • With downloaded models: ~882 MB

🚧 Roadmap

  • v0.2.0 (Planned): Batch processing, additional optimizations
  • v0.3.0 (Planned): ONNX export, streaming inference
  • v1.0.0 (Planned): Full test coverage, additional features

Note: Magenta MT3 (JAX/Flax) has been excluded due to dependency conflicts with the PyTorch ecosystem. The current 3 models (MR-MT3, MT3-PyTorch, YourMT3) provide comprehensive coverage for various transcription scenarios.


Architecture

mt3_infer/
├── __init__.py          # Public API
├── api.py               # High-level functions (transcribe, load_model)
├── base.py              # MT3Base abstract interface
├── cli.py               # CLI tool
├── exceptions.py        # Custom exceptions
├── adapters/            # Model-specific implementations
│   ├── mr_mt3.py        # MR-MT3 adapter
│   ├── mt3_pytorch.py   # MT3-PyTorch adapter
│   ├── yourmt3.py       # YourMT3 adapter
│   └── vocab_utils.py   # Shared MIDI decoding
├── config/
│   └── checkpoints.yaml # Model registry & download config
├── utils/
│   ├── audio.py         # Audio preprocessing
│   ├── midi.py          # MIDI postprocessing
│   ├── download.py      # Checkpoint download system
│   └── framework.py     # Version checks
└── models/              # Model implementations
    ├── mr_mt3/          # MR-MT3 model code
    ├── mt3_pytorch/     # MT3-PyTorch model code
    └── yourmt3/         # YourMT3 model code
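All three adapters implement one abstract interface. A minimal sketch of that contract, with the method signature assumed from the usage examples above (the real definition lives in base.py):

```python
from abc import ABC, abstractmethod

class MT3Base(ABC):
    """Sketch of the shared adapter contract (details assumed; not the
    package's actual base.py definition)."""

    @abstractmethod
    def transcribe(self, audio, sr: int = 16000):
        """Return a MIDI object for the given waveform."""

class DummyAdapter(MT3Base):
    # Stand-in adapter: real ones (MR-MT3, MT3-PyTorch, YourMT3) run a model.
    def transcribe(self, audio, sr: int = 16000):
        return []

print(DummyAdapter().transcribe([0.0] * 16000))  # []
```

Because `transcribe` is abstract, the base class itself cannot be instantiated; every registered model must provide a concrete implementation.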

Documentation

For Users

For Developers


Development

Setup

# Install dependencies
uv sync --extra torch --extra dev

# Run tests
uv run pytest

# Run with coverage
uv run pytest --cov=mt3_infer --cov-report=html

# Linting
uv run ruff check .
uv run ruff check --fix .

# Type checking
uv run mypy mt3_infer/

Using UV

This project uses UV for dependency management. Always use uv run:

# Correct
uv run python script.py
uv run pytest

# Incorrect
python script.py
pytest

See docs/dev/PRINCIPLES.md for development guidelines.


Integration with worzpro-demo

To use mt3-infer in the worzpro-demo project:

# In worzpro-demo/pyproject.toml
[tool.uv.sources]
mt3-infer = { git = "https://github.com/openmirlab/mt3-infer", extras = ["torch"] }

Then in Python:

from mt3_infer import transcribe
midi = transcribe(audio, sr=16000)

Examples

See the examples/ directory for complete examples.


License

MIT License - see LICENSE for details.

This project includes code adapted from:

  • Magenta MT3 (Apache-2.0) - Google Research
  • MR-MT3 (MIT) - Hao Hao Tan et al.
  • MT3-PyTorch - Kunato's PyTorch port
  • YourMT3 (Apache-2.0) - Minz Won et al.

See mt3_infer/config/checkpoints.yaml for full provenance.


Contributing

We welcome contributions! Please:

  1. Read docs/dev/SPEC.md for API specifications
  2. Follow docs/dev/PRINCIPLES.md for development guidelines
  3. Submit PRs with tests and documentation

Citation

If you use MT3-Infer in your research, please cite the original MT3 papers:

@inproceedings{gardner2022mt3,
  title={MT3: Multi-Task Multitrack Music Transcription},
  author={Gardner, Josh and Simon, Ian and Manilow, Ethan and Hawthorne, Curtis and Engel, Jesse},
  booktitle={ICLR},
  year={2022}
}

Support

For issues and questions, open an issue on the GitHub repository (openmirlab/mt3-infer).




Download files


Source Distribution

mt3_infer-0.1.3.tar.gz (213.3 kB)

Uploaded Source

Built Distribution


mt3_infer-0.1.3-py3-none-any.whl (281.4 kB)

Uploaded Python 3

File details

Details for the file mt3_infer-0.1.3.tar.gz.

File metadata

  • Download URL: mt3_infer-0.1.3.tar.gz
  • Upload date:
  • Size: 213.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for mt3_infer-0.1.3.tar.gz:

| Algorithm | Hash digest |
|-----------|-------------|
| SHA256 | 9677f91a8714444ec3f87a629008f33fcfa64fabccaae49322b4b53935dab036 |
| MD5 | 4aaa2ef75d9831e54e9a51ba6892b4b9 |
| BLAKE2b-256 | f6f38f9bc047941ca2aa5736208b874ce278fd1d88e23c7ad9cf476b207a0c09 |

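To verify a download against the digests above, hash the file contents and compare hex digests. A generic sketch using Python's standard hashlib (the helper name is ours, not part of any tool here):

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    # Generic integrity check: compare this digest against the
    # SHA256 value published on the release page.
    return hashlib.sha256(data).hexdigest()

print(sha256_hex(b"hello"))
# 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824
```

For large checkpoint files, feed the hash object in chunks with `hashlib.sha256().update(chunk)` instead of reading the whole file into memory.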

Provenance

The following attestation bundles were made for mt3_infer-0.1.3.tar.gz:

Publisher: publish.yml on openmirlab/mt3-infer

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file mt3_infer-0.1.3-py3-none-any.whl.

File metadata

  • Download URL: mt3_infer-0.1.3-py3-none-any.whl
  • Upload date:
  • Size: 281.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for mt3_infer-0.1.3-py3-none-any.whl:

| Algorithm | Hash digest |
|-----------|-------------|
| SHA256 | 974e1d5a4809c63079fa7e2c699add3671bb2970ba1ff3ea138923e2a8922b78 |
| MD5 | df523f6007f37972af9f2187026b23e5 |
| BLAKE2b-256 | ff18b4c84a00f2a18969f2c42c9c0e30f5918ad799194f0c2fda3c31ca68a9f5 |


Provenance

The following attestation bundles were made for mt3_infer-0.1.3-py3-none-any.whl:

Publisher: publish.yml on openmirlab/mt3-infer

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
