Inference-only Mel-Band Roformer vocal separation toolkit
MelBand-RoFormer-Infer
Production-ready, inference-only toolkit for Mel-Band RoFormer audio source separation
MelBand-RoFormer-Infer provides a clean, lightweight API for running music source separation inference using Mel-Band RoFormer models with automatic checkpoint management.
Features
- Inference Only: Lightweight package focused on production inference
- Auto-Download: Automatic checkpoint downloads with integrity verification
- 70+ Pre-trained Models: Vocals, instrumentals, karaoke, denoise, dereverb, and more
- CLI Tools: melband-roformer-infer and melband-roformer-download commands
- Python API: Clean programmatic interface
- Model Registry: Easy model discovery with search and category filtering
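The integrity verification mentioned in the feature list usually comes down to comparing a file's checksum against a known value. The following is a minimal sketch of that idea, assuming SHA-256 digests; the helper names sha256_of and verify_checkpoint are hypothetical and are not part of this package's API.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Read in 1 MiB chunks so large checkpoints never sit fully in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checkpoint(path: Path, expected_sha256: str) -> bool:
    """Compare a file's digest with an expected hex digest (case-insensitive)."""
    return sha256_of(path) == expected_sha256.strip().lower()
```

Hashing in chunks keeps memory use flat regardless of checkpoint size, which matters for model files that can run to hundreds of megabytes.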
Quick Start
Installation
# Using pip
pip install melband-roformer-infer
# Using UV (recommended)
uv pip install melband-roformer-infer
Download Models
# List available models
melband-roformer-download --list-models
# Download the recommended model (MelBand Roformer Kim)
melband-roformer-download --model melband-roformer-kim-vocals
# Download by category
melband-roformer-download --category karaoke --output-dir ./models
# Download all models
melband-roformer-download --all --output-dir ./models
CLI Inference
# Using the recommended MelBand Roformer Kim model
melband-roformer-infer \
--config_path models/melband-roformer-kim-vocals/config_vocals_mel_band_roformer.yaml \
--model_path models/melband-roformer-kim-vocals/MelBandRoformer.ckpt \
--input_folder ./songs \
--store_dir ./outputs
Every WAV inside input_folder produces *_vocals.wav and *_instrumental.wav stems.
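To check a batch run, it can help to compute which output files to expect. The sketch below follows the naming rule stated above; the function expected_stems is hypothetical, and it assumes stems are written directly into store_dir and that inputs use a lowercase .wav extension.

```python
from pathlib import Path


def expected_stems(input_folder: Path, store_dir: Path) -> dict[Path, tuple[Path, Path]]:
    """Map each WAV in input_folder to its assumed (vocals, instrumental) paths."""
    outputs = {}
    for wav in sorted(input_folder.glob("*.wav")):
        # Assumption: outputs are named <input stem>_vocals.wav / _instrumental.wav.
        outputs[wav] = (
            store_dir / f"{wav.stem}_vocals.wav",
            store_dir / f"{wav.stem}_instrumental.wav",
        )
    return outputs
```

After a run, comparing these paths against the actual contents of the output directory is a quick way to spot inputs that failed to process.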
Python API
from pathlib import Path
from ml_collections import ConfigDict
import torch
import yaml
from mel_band_roformer import MODEL_REGISTRY, DEFAULT_MODEL, get_model_from_config
# Use the default recommended model (MelBand Roformer Kim)
entry = MODEL_REGISTRY.get(DEFAULT_MODEL)
# Load config and model
with open(f"models/{entry.slug}/{entry.config}") as f:
    config = ConfigDict(yaml.safe_load(f))
model = get_model_from_config("mel_band_roformer", config)
model.load_state_dict(torch.load(f"models/{entry.slug}/{entry.checkpoint}", map_location="cpu"))
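The snippet above builds both paths from the registry entry and will fail with a low-level error if either file has not been downloaded yet. A small guard can make that failure clearer. This is a sketch only: resolve_model_files is a hypothetical helper, and the stand-in entry mirrors the slug, config, and checkpoint names used in the CLI example rather than a real MODEL_REGISTRY object.

```python
from pathlib import Path
from types import SimpleNamespace


def resolve_model_files(entry, models_dir: Path = Path("models")) -> tuple[Path, Path]:
    """Return (config_path, checkpoint_path) for an entry, or raise if missing.

    `entry` only needs `slug`, `config`, and `checkpoint` attributes.
    """
    model_dir = models_dir / entry.slug
    config_path = model_dir / entry.config
    checkpoint_path = model_dir / entry.checkpoint
    missing = [p for p in (config_path, checkpoint_path) if not p.is_file()]
    if missing:
        names = ", ".join(str(p) for p in missing)
        raise FileNotFoundError(f"Missing model files: {names}")
    return config_path, checkpoint_path


# Stand-in entry for illustration; a real one would come from MODEL_REGISTRY.
example_entry = SimpleNamespace(
    slug="melband-roformer-kim-vocals",
    config="config_vocals_mel_band_roformer.yaml",
    checkpoint="MelBandRoformer.ckpt",
)
```

The returned paths can then be passed to yaml.safe_load and torch.load exactly as in the example above.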
Recommended Model
MelBand Roformer Kim (melband-roformer-kim-vocals) by Kimberley Jensen is the recommended default model for vocal separation. It provides excellent quality and is the foundation for many fine-tuned variants.
from mel_band_roformer import DEFAULT_MODEL
print(DEFAULT_MODEL) # "melband-roformer-kim-vocals"
Available Models
| Model | Category | Description |
|---|---|---|
| melband-roformer-kim-vocals | vocals | Recommended - Original MelBand Roformer by Kimberley Jensen |
| melband-roformer-big-beta6 | vocals | Big Beta 6 by unwa |
| roformer-model-melband-roformer-vocals-by-gabox | vocals | Vocals by Gabox |
| roformer-model-melband-roformer-instrumental-by-gabox | instrumental | Instrumental by Gabox |
| roformer-model-mel-roformer-karaoke-aufr33-viperx | karaoke | Karaoke by aufr33/viperx |
| roformer-model-mel-roformer-denoise-aufr33 | denoise | Denoise by aufr33 |
| roformer-model-melband-roformer-de-reverb-by-anvuew | dereverb | De-Reverb by anvuew |
| ... | ... | See --list-models for 70+ models |
Categories: vocals, instrumental, karaoke, denoise, dereverb, crowd, general, aspiration
Registry Helpers
from mel_band_roformer import MODEL_REGISTRY
# List all categories
print(MODEL_REGISTRY.categories())
# List models by category
for model in MODEL_REGISTRY.list("vocals"):
    print(model.name, model.checkpoint)
# Search models
results = MODEL_REGISTRY.search("karaoke")
for m in results:
    print(m.slug)
# Pretty-print all models
print(MODEL_REGISTRY.as_table())
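For readers curious how category filtering and search of this kind can work, here is a toy stand-in. It is not the package's actual implementation: ToyEntry and ToyRegistry are hypothetical, the method names simply mirror the helpers shown above, and the matching rule (case-insensitive substring on slug or name) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyEntry:
    slug: str
    name: str
    category: str


class ToyRegistry:
    """Illustrative in-memory registry keyed by slug."""

    def __init__(self, entries):
        self._entries = {e.slug: e for e in entries}

    def get(self, slug):
        return self._entries[slug]

    def categories(self):
        # Unique category names in alphabetical order.
        return sorted({e.category for e in self._entries.values()})

    def list(self, category=None):
        # All entries, or only those in the given category.
        return [e for e in self._entries.values()
                if category is None or e.category == category]

    def search(self, query):
        # Case-insensitive substring match on slug or display name.
        q = query.lower()
        return [e for e in self._entries.values()
                if q in e.slug.lower() or q in e.name.lower()]


registry = ToyRegistry([
    ToyEntry("melband-roformer-kim-vocals", "MelBand Roformer Kim", "vocals"),
    ToyEntry("roformer-model-mel-roformer-karaoke-aufr33-viperx",
             "Karaoke by aufr33/viperx", "karaoke"),
])
```

Keying entries by slug keeps lookups constant-time, while filtering and search scan the whole collection, which is fine for a registry of a few dozen models.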
Development Installation
# Clone repository
git clone https://github.com/openmirlab/melband-roformer-infer.git
cd melband-roformer-infer
# Install with UV
uv sync
# Install with pip
pip install -e ".[dev]"
Acknowledgments
This project builds upon the excellent work of several open-source projects:
- Mel-Band-Roformer-Vocal-Model by Kimberley Jensen - Original model and training
- BS-RoFormer by Phil Wang (lucidrains) - PyTorch implementation of the RoFormer architecture
- python-audio-separator by Andrew Beveridge (nomadkaraoke) - Pre-trained checkpoints and model configurations
- Original Research - Wei-Tsung Lu, Ju-Chiang Wang, Qiuqiang Kong, and Yun-Ning Hung for the Band-Split RoPE Transformer paper
License
MIT License - see LICENSE for details.
This project includes code and configurations adapted from:
- BS-RoFormer (MIT) - Phil Wang
- python-audio-separator (MIT) - Andrew Beveridge
- Mel-Band-Roformer-Vocal-Model - Kimberley Jensen
Citation
If you use MelBand-RoFormer-Infer in your research, please cite the original paper:
@inproceedings{Lu2023MusicSS,
title = {Music Source Separation with Band-Split RoPE Transformer},
author = {Wei-Tsung Lu and Ju-Chiang Wang and Qiuqiang Kong and Yun-Ning Hung},
year = {2023},
url = {https://api.semanticscholar.org/CorpusID:261556702}
}
Support
For issues and questions:
- GitHub Issues: github.com/openmirlab/melband-roformer-infer/issues