Unified, inference-only toolkit for the MT3 model family (Magenta MT3, MR-MT3, MT3-PyTorch, YourMT3)
MT3-Infer
Production-ready, unified inference toolkit for the MT3 music transcription model family
MT3-Infer provides a clean, framework-neutral API for running music transcription inference across multiple MT3 implementations with a single consistent interface.
What's New
- v0.1.1: Fixed YAML config files inclusion in package distribution
- v0.1.0: Initial release with 3 production-ready models (MR-MT3, MT3-PyTorch, YourMT3)
Features
- Unified API: One interface for all MT3 variants
- Production Ready: Clean, tested, ~8 MB package size
- Auto-Download: Automatic checkpoint downloads on first use
- 4 Download Methods: Auto, Python API, CLI, standalone script
- 3 Models: MR-MT3, MT3-PyTorch, YourMT3
- Framework Isolated: Clean PyTorch/TensorFlow/JAX separation
- CLI Tool: mt3-infer command-line interface
- Reproducible: Pinned dependencies, verified checkpoints
Quick Start
Installation
MT3-Infer is available on PyPI.
# Using pip
pip install mt3-infer
# Using UV (recommended for development)
uv pip install mt3-infer
Simple Transcription (One Line!)
from mt3_infer import transcribe
# Transcribe audio to MIDI (auto-downloads checkpoint on first use)
midi = transcribe(audio, sr=16000)
midi.save("output.mid")
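transcribe() takes a mono float waveform plus the sample rate you pass as sr. As a sketch of how to prepare that input from a WAV file using only the standard library and NumPy (load_audio_16k and its linear-interpolation resampling are illustrative helpers, not functions shipped by mt3_infer):

```python
import wave

import numpy as np


def load_audio_16k(path: str) -> np.ndarray:
    """Read a 16-bit PCM WAV file as mono float32 resampled to 16 kHz.

    Illustrative helper only; mt3_infer does not ship this function.
    """
    with wave.open(path, "rb") as wf:
        sr = wf.getframerate()
        n_channels = wf.getnchannels()
        pcm = np.frombuffer(wf.readframes(wf.getnframes()), dtype=np.int16)

    # Scale 16-bit PCM into [-1.0, 1.0) and average channels down to mono.
    audio = pcm.astype(np.float32) / 32768.0
    if n_channels > 1:
        audio = audio.reshape(-1, n_channels).mean(axis=1)

    # Naive linear-interpolation resample to 16 kHz (fine for a sketch;
    # a production pipeline would use a proper band-limited resampler).
    if sr != 16000:
        duration = audio.shape[0] / sr
        target = np.linspace(0.0, duration, int(duration * 16000), endpoint=False)
        source = np.arange(audio.shape[0]) / sr
        audio = np.interp(target, source, audio).astype(np.float32)
    return audio
```

With that in hand, `midi = transcribe(load_audio_16k("input.wav"), sr=16000)` covers the common case of arbitrary-rate stereo WAV input.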
Model Selection
# Use MR-MT3 model (57x real-time)
midi = transcribe(audio, model="mr_mt3")
# Use MT3-PyTorch model (147 notes detected)
midi = transcribe(audio, model="mt3_pytorch")
# Use YourMT3 model (multi-stem separation)
midi = transcribe(audio, model="yourmt3")
Download Checkpoints
# Download all models at once (874MB total)
mt3-infer download --all
# Download specific models
mt3-infer download mr_mt3 mt3_pytorch
# List available models
mt3-infer list
# Transcribe audio via CLI
mt3-infer transcribe input.wav -o output.mid -m mr_mt3
Heads up: The downloader now pulls MR-MT3 weights directly from
gudgud1014/MR-MT3, so you no longer need Git LFS for that model. Checkpoints are stored under .mt3_checkpoints/<model> and will be re-created automatically if you delete the directory.
Set MT3_CHECKPOINT_DIR to store checkpoints somewhere else (e.g., shared storage) before running downloads or inference:
export MT3_CHECKPOINT_DIR=/data/models/mt3
Or use .env files (requires python-dotenv):
MT3_CHECKPOINT_DIR=/data/models/mt3
When the variable is set, both the Python API and CLI (including mt3-infer download) will read/write checkpoints inside that directory, preserving the same per-model layout as .mt3_checkpoints/.
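The resolution logic described above can be sketched as follows. This is a hypothetical re-implementation of the documented behavior; the package's actual code lives in mt3_infer/utils/download.py and may differ:

```python
import os
from pathlib import Path

# Package default per the docs above (assumption: relative to the working dir).
DEFAULT_DIR = ".mt3_checkpoints"


def resolve_checkpoint_dir(model: str) -> Path:
    """Return the per-model checkpoint directory, honoring MT3_CHECKPOINT_DIR.

    Sketch of the documented behavior; not the package's actual function.
    """
    root = os.environ.get("MT3_CHECKPOINT_DIR", DEFAULT_DIR)
    return Path(root) / model
```

With the variable unset, resolve_checkpoint_dir("mr_mt3") yields .mt3_checkpoints/mr_mt3; with MT3_CHECKPOINT_DIR=/data/models/mt3 it yields /data/models/mt3/mr_mt3, preserving the same per-model layout.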
Supported Models
| Model | Framework | Speed | Notes Detected | Size | Features |
|---|---|---|---|---|---|
| MR-MT3 | PyTorch | 57x real-time | 116 notes | 176 MB | Optimized for speed |
| MT3-PyTorch | PyTorch | 12x real-time | 147 notes | 176 MB | Official architecture with auto-filtering* |
| YourMT3 | PyTorch + Lightning | ~15x real-time | 118 notes | 536 MB | 8-stem separation, Perceiver-TF + MoE |
*MT3-PyTorch includes automatic instrument leakage filtering (configurable via auto_filter parameter)
Performance benchmarks from NVIDIA RTX 4090 with PyTorch 2.7.1 + CUDA 12.6
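To translate the speed column into wall-clock time: an N-times real-time model transcribes S seconds of audio in roughly S / N seconds. A quick sanity check using the table's figures (on comparable hardware):

```python
def transcription_seconds(audio_seconds: float, realtime_factor: float) -> float:
    """Approximate wall-clock transcription time for a given real-time factor."""
    return audio_seconds / realtime_factor


# A 3-minute (180 s) track with the factors from the table above:
for name, rtf in [("mr_mt3", 57), ("mt3_pytorch", 12), ("yourmt3", 15)]:
    print(f"{name}: ~{transcription_seconds(180, rtf):.1f} s")
```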
Default: yourmt3 downloads the YPTF.MoE+Multi (noPS) checkpoint, matching the original YourMT3 Space output.
Advanced Usage
Explicit Model Loading
from mt3_infer import load_model
# Load model explicitly (cached for reuse)
model = load_model("mt3_pytorch", device="cuda")
midi = model.transcribe(audio, sr=16000)
Explore Available Models
from mt3_infer import list_models, get_model_info
# List all models
models = list_models()
for name, info in models.items():
print(f"{name}: {info['description']}")
# Get model details
info = get_model_info("mr_mt3")
print(f"Speed: {info['metadata']['performance']['speed_x_realtime']}x real-time")
Disable Auto-Download
from mt3_infer import load_model
# Raise error if checkpoint not found (don't auto-download)
model = load_model("mr_mt3", auto_download=False)
Control MT3-PyTorch Instrument Filtering
MT3-PyTorch has automatic filtering to fix instrument leakage in drum tracks:
# Default: filtering enabled (recommended)
model = load_model("mt3_pytorch")
# Disable filtering to see raw model output
model = load_model("mt3_pytorch", auto_filter=False)
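How such a filter might work, as a hypothetical sketch (the package's actual auto_filter logic is internal and may differ): drop decoded note events that land on the drum track but fall outside the General MIDI percussion key range, since those are likely leaked pitched-instrument notes.

```python
from dataclasses import dataclass


@dataclass
class NoteEvent:
    pitch: int     # MIDI note number
    is_drum: bool  # event was decoded onto the drum track


def filter_drum_leakage(events: list[NoteEvent]) -> list[NoteEvent]:
    """Drop drum-track events outside the GM percussion key range (35-81).

    Hypothetical sketch of leakage filtering; not mt3_infer's implementation.
    """
    return [e for e in events if not e.is_drum or 35 <= e.pitch <= 81]
```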
Override Checkpoint Directory
Use a shared storage location (e.g., NAS, cache volume) without changing your code:
export MT3_CHECKPOINT_DIR=/mnt/shared/mt3
uv run python -c "from mt3_infer import download_model; download_model('yourmt3')"
uv run mt3-infer download --all
To confirm the resolved location programmatically:
from mt3_infer import download_model
path = download_model('mt3_pytorch')
print(path)
Download Programmatically
from mt3_infer import download_model
# Pre-download checkpoints before inference
download_model("mr_mt3")
download_model("mt3_pytorch")
download_model("yourmt3")
Diagnostics & Troubleshooting
Extra smoke tests and tooling live in examples/diagnostics/:
- download_mt3_pytorch.py: manual vs. automatic checkpoint download walkthrough
- test_all_models.py: loads all registered models and runs a short transcription
- test_checkpoint_download.py: verifies checkpoints land in MT3_CHECKPOINT_DIR
- test_yourmt3.py: full audio-to-MIDI flow for the YourMT3 MoE model
Run them via uv run python examples/diagnostics/<script>.py after setting any needed environment variables.
Installation Options
Basic Installation
pip install mt3-infer
Development Installation
# Clone repository
git clone https://github.com/openmirlab/mt3-infer.git
cd mt3-infer
# Install with UV (recommended)
uv sync --extra torch --extra dev
# Or with pip
pip install -e ".[torch,dev]"
Optional Dependencies
# PyTorch backend (default)
pip install mt3-infer[torch]
# TensorFlow backend
pip install mt3-infer[tensorflow]
# All backends
pip install mt3-infer[all]
# Development tools
pip install mt3-infer[dev]
# MIDI synthesis (optional)
pip install mt3-infer[synthesis]
CLI Tool
The mt3-infer CLI provides convenient access to all functionality:
# Download checkpoints
mt3-infer download --all # Download all models
mt3-infer download mr_mt3 mt3_pytorch # Download specific models
# List available models
mt3-infer list
# Transcribe audio
mt3-infer transcribe input.wav -o output.mid
mt3-infer transcribe input.wav -m mr_mt3 # Use MR-MT3 model
mt3-infer transcribe input.wav --device cuda # Use GPU
# Show help
mt3-infer --help
mt3-infer download --help
Download Methods
MT3-Infer supports 4 flexible download methods:
1. Automatic Download (Default)
Checkpoints download automatically on first use:
midi = transcribe(audio) # Auto-downloads if needed
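The download-on-first-use pattern behind this behavior can be sketched as follows. Both ensure_checkpoint and the fetch callback are illustrative names, not package API:

```python
from pathlib import Path
from typing import Callable


def ensure_checkpoint(
    model: str,
    root: Path,
    fetch: Callable[[str, Path], None],
    auto_download: bool = True,
) -> Path:
    """Return the checkpoint directory for `model`, fetching it if missing.

    Sketch of the documented auto-download behavior, not the real code.
    """
    target = root / model
    if not target.exists():
        if not auto_download:
            raise FileNotFoundError(f"Checkpoint for {model!r} not found at {target}")
        target.mkdir(parents=True, exist_ok=True)
        fetch(model, target)  # download weights into the directory
    return target
```

Subsequent calls find the directory already present and skip the fetch, which is why only the first transcription pays the download cost.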
2. Python API
Pre-download programmatically:
from mt3_infer import download_model
download_model("mr_mt3")
3. CLI
Download via command line:
mt3-infer download --all
4. Standalone Script
Batch download without installing package:
python tools/download_all_checkpoints.py
See the CLI section above for detailed download instructions.
Project Status
Current Version: 0.1.2 (Production Ready!)
Completed Features
- Core infrastructure (MT3Base interface, utilities)
- 3 production adapters (MR-MT3, MT3-PyTorch, YourMT3)
- Public API (transcribe(), load_model())
- Model registry with aliases
- Checkpoint download system (4 methods)
- CLI tool (mt3-infer)
- Production cleanup (~8 MB package)
- Comprehensive documentation
Package Statistics
- Source code: ~5 MB
- Vendor dependencies: ~3 MB
- Documentation: 284 KB
- Total (source only): ~8 MB
- With downloaded models: ~882 MB
Roadmap
- v0.2.0 (Planned): Batch processing, additional optimizations
- v0.3.0 (Planned): ONNX export, streaming inference
- v1.0.0 (Planned): Full test coverage, additional features
Note: Magenta MT3 (JAX/Flax) has been excluded due to dependency conflicts with the PyTorch ecosystem. The current 3 models (MR-MT3, MT3-PyTorch, YourMT3) provide comprehensive coverage for various transcription scenarios.
Architecture
mt3_infer/
├── __init__.py          # Public API
├── api.py               # High-level functions (transcribe, load_model)
├── base.py              # MT3Base abstract interface
├── cli.py               # CLI tool
├── exceptions.py        # Custom exceptions
├── adapters/            # Model-specific implementations
│   ├── mr_mt3.py        # MR-MT3 adapter
│   ├── mt3_pytorch.py   # MT3-PyTorch adapter
│   ├── yourmt3.py       # YourMT3 adapter
│   └── vocab_utils.py   # Shared MIDI decoding
├── config/
│   └── checkpoints.yaml # Model registry & download config
├── utils/
│   ├── audio.py         # Audio preprocessing
│   ├── midi.py          # MIDI postprocessing
│   ├── download.py      # Checkpoint download system
│   └── framework.py     # Version checks
└── models/              # Model implementations
    ├── mr_mt3/          # MR-MT3 model code
    ├── mt3_pytorch/     # MT3-PyTorch model code
    └── yourmt3/         # YourMT3 model code
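The adapter layout above centers on the MT3Base interface: each model directory plugs in through one adapter class. A minimal sketch of what such an abstract base might look like (illustrative only; the real class in mt3_infer/base.py defines the authoritative contract, and EchoAdapter is a toy stand-in):

```python
from abc import ABC, abstractmethod
from typing import Any

import numpy as np


class MT3Base(ABC):
    """Abstract interface every model adapter implements (sketch).

    Illustrative only; mt3_infer/base.py defines the real contract.
    """

    @abstractmethod
    def transcribe(self, audio: np.ndarray, sr: int = 16000) -> Any:
        """Transcribe a mono waveform to a MIDI object."""


class EchoAdapter(MT3Base):
    """Toy adapter used here just to show the subclassing pattern."""

    def transcribe(self, audio: np.ndarray, sr: int = 16000) -> Any:
        return {"n_samples": int(audio.shape[0]), "sr": sr}
```

Keeping the interface this narrow is what lets transcribe() and load_model() treat MR-MT3, MT3-PyTorch, and YourMT3 interchangeably.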
Documentation
For Users
- Main README - This file
- Examples - Usage examples
- Troubleshooting - Common issues and solutions
- Benchmarks - Performance benchmarks
For Developers
- Documentation Index - Complete docs navigation
- API Specification - Formal API spec
- Design Principles - Development guidelines
- Download Guide - Internal download documentation
Development
Setup
# Install dependencies
uv sync --extra torch --extra dev
# Run tests
uv run pytest
# Run with coverage
uv run pytest --cov=mt3_infer --cov-report=html
# Linting
uv run ruff check .
uv run ruff check --fix .
# Type checking
uv run mypy mt3_infer/
Using UV
This project uses UV for dependency management. Always use uv run:
# Correct
uv run python script.py
uv run pytest
# Incorrect
python script.py
pytest
See docs/dev/PRINCIPLES.md for development guidelines.
Integration with worzpro-demo
To use mt3-infer in the worzpro-demo project:
# In worzpro-demo/pyproject.toml
[tool.uv.sources]
mt3-infer = { git = "https://github.com/openmirlab/mt3-infer", extras = ["torch"] }
Then in Python:
from mt3_infer import transcribe
midi = transcribe(audio, sr=16000)
Examples
See the examples/ directory for complete examples:
- public_api_demo.py - Main usage example
- synthesize_all_models.py - Compare all models
- demo_midi_synthesis.py - MIDI synthesis demo
- test_download.py - Download validation
- compare_models.py - Model comparison
License
MIT License - see LICENSE for details.
This project includes code adapted from:
- Magenta MT3 (Apache-2.0) - Google Research
- MR-MT3 (MIT) - Hao Hao Tan et al.
- MT3-PyTorch - Kunato's PyTorch port
- YourMT3 (Apache-2.0) - Minz Won et al.
See mt3_infer/config/checkpoints.yaml for full provenance.
Contributing
We welcome contributions! Please:
- Read docs/dev/SPEC.md for API specifications
- Follow docs/dev/PRINCIPLES.md for development guidelines
- Submit PRs with tests and documentation
Citation
If you use MT3-Infer in your research, please cite the original MT3 paper:
@inproceedings{gardner2022mt3,
  title={{MT3}: Multi-Task Multitrack Music Transcription},
  author={Gardner, Josh and Simon, Ian and Manilow, Ethan and Hawthorne, Curtis and Engel, Jesse},
  booktitle={International Conference on Learning Representations (ICLR)},
  year={2022}
}
Support
For issues and questions:
- GitHub Issues: github.com/openmirlab/mt3-infer/issues
- Documentation: docs/
- Examples: examples/