Inference-only Mel-Band RoFormer vocal separation toolkit

MelBand-RoFormer-Infer

Production-ready, inference-only toolkit for Mel-Band RoFormer audio source separation

MelBand-RoFormer-Infer provides a clean, lightweight API for running music source separation inference using Mel-Band RoFormer models with automatic checkpoint management.

Requirements: Python 3.10+ and PyTorch. License: MIT. Distributed on PyPI.


Features

  • Inference Only: Lightweight package focused on production inference
  • Auto-Download: Automatic checkpoint downloads with integrity verification
  • 70+ Pre-trained Models: Vocals, instrumentals, karaoke, denoise, dereverb, and more
  • CLI Tools: melband-roformer-infer and melband-roformer-download commands
  • Python API: Clean programmatic interface
  • Model Registry: Easy model discovery with search and category filtering
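
The feature list mentions integrity verification for downloaded checkpoints but does not say how it is performed. As a minimal sketch, assuming a SHA-256 digest is compared against a known value, a manual check could look like the following. The function names `sha256_of_file` and `verify_checkpoint` are hypothetical and are not part of the package's API.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checkpoint(path: Path, expected_sha256: str) -> bool:
    """Compare the file's digest with an expected hex digest (case-insensitive)."""
    return sha256_of_file(path) == expected_sha256.strip().lower()
```

Reading in fixed-size chunks matters here because checkpoint files can be large; the whole file never has to fit in memory.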

Try it in Colab

No installation is needed to try the demo: it can be opened directly in Google Colab.


Quick Start

Installation

# Using pip
pip install melband-roformer-infer

# Using UV (recommended)
uv pip install melband-roformer-infer

Download Models

# List available models
melband-roformer-download --list-models

# Download the recommended model (MelBand Roformer Kim)
melband-roformer-download --model melband-roformer-kim-vocals

# Download by category
melband-roformer-download --category karaoke --output-dir ./models

# Download all models
melband-roformer-download --all --output-dir ./models

CLI Inference

# Using the recommended MelBand Roformer Kim model
melband-roformer-infer \
  --config_path models/melband-roformer-kim-vocals/config_vocals_mel_band_roformer.yaml \
  --model_path models/melband-roformer-kim-vocals/MelBandRoformer.ckpt \
  --input_folder ./songs \
  --store_dir ./outputs

Each WAV file in input_folder yields two stems in store_dir: *_vocals.wav and *_instrumental.wav.
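
For scripted batch runs, the same command can be assembled from Python. The sketch below only reuses the four flags shown above. The helper names are hypothetical, and the output naming (`<input stem>_vocals.wav` and `<input stem>_instrumental.wav` inside the output folder) is an assumption inferred from the description, not confirmed behavior.

```python
from pathlib import Path


def build_infer_command(config_path, model_path, input_folder, store_dir):
    """Assemble the argument list for the melband-roformer-infer CLI."""
    return [
        "melband-roformer-infer",
        "--config_path", str(config_path),
        "--model_path", str(model_path),
        "--input_folder", str(input_folder),
        "--store_dir", str(store_dir),
    ]


def expected_stems(input_folder, store_dir):
    """Map each WAV in input_folder to the stem paths assumed to appear in store_dir."""
    out = Path(store_dir)
    stems = {}
    for wav in sorted(Path(input_folder).glob("*.wav")):
        # Assumed naming: the input file's stem plus a fixed suffix.
        stems[wav.name] = (
            out / f"{wav.stem}_vocals.wav",
            out / f"{wav.stem}_instrumental.wav",
        )
    return stems
```

The command list can be passed to `subprocess.run(cmd, check=True)`, and `expected_stems` can then be used to check which outputs were actually written.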

Python API

from pathlib import Path
from ml_collections import ConfigDict
import torch
import yaml
from mel_band_roformer import MODEL_REGISTRY, DEFAULT_MODEL, get_model_from_config

# Use the default recommended model (MelBand Roformer Kim)
entry = MODEL_REGISTRY.get(DEFAULT_MODEL)
model_dir = Path("models") / entry.slug

# Load the YAML config, closing the file handle afterwards
with open(model_dir / entry.config) as f:
    config = ConfigDict(yaml.safe_load(f))

# Build the model and load the checkpoint weights onto the CPU
model = get_model_from_config("mel_band_roformer", config)
state_dict = torch.load(model_dir / entry.checkpoint, map_location="cpu")
model.load_state_dict(state_dict)
model.eval()  # switch to evaluation mode before running inference
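
Loading fails with a fairly opaque error if the config or checkpoint has not been downloaded yet. A small pre-flight check, sketched here under the assumption that files live at `models/<slug>/<file>` as in the paths above, can report what is missing first. `missing_model_files` is a hypothetical helper, not a function exported by the package.

```python
from pathlib import Path


def missing_model_files(models_dir, slug, config_name, checkpoint_name):
    """Return the expected files for one model that are not present on disk."""
    model_dir = Path(models_dir) / slug
    expected = [model_dir / config_name, model_dir / checkpoint_name]
    return [p for p in expected if not p.is_file()]
```

With a registry entry in hand, this could be called as `missing_model_files("models", entry.slug, entry.config, entry.checkpoint)`; an empty list means both files are in place.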

Recommended Model

MelBand Roformer Kim (melband-roformer-kim-vocals) by Kimberley Jensen is the recommended default model for vocal separation. It provides excellent quality and is the foundation for many fine-tuned variants.

from mel_band_roformer import DEFAULT_MODEL
print(DEFAULT_MODEL)  # "melband-roformer-kim-vocals"

Available Models

Model | Category | Description
melband-roformer-kim-vocals | vocals | Recommended: original MelBand Roformer by Kimberley Jensen
melband-roformer-big-beta6 | vocals | Big Beta 6 by unwa
roformer-model-melband-roformer-vocals-by-gabox | vocals | Vocals by Gabox
roformer-model-melband-roformer-instrumental-by-gabox | instrumental | Instrumental by Gabox
roformer-model-mel-roformer-karaoke-aufr33-viperx | karaoke | Karaoke by aufr33/viperx
roformer-model-mel-roformer-denoise-aufr33 | denoise | Denoise by aufr33
roformer-model-melband-roformer-de-reverb-by-anvuew | dereverb | De-Reverb by anvuew

Run melband-roformer-download --list-models for the full list of 70+ models.

Categories: vocals, instrumental, karaoke, denoise, dereverb, crowd, general, aspiration


Registry Helpers

from mel_band_roformer import MODEL_REGISTRY

# List all categories
print(MODEL_REGISTRY.categories())

# List models by category
for model in MODEL_REGISTRY.list("vocals"):
    print(model.name, model.checkpoint)

# Search models
results = MODEL_REGISTRY.search("karaoke")
for m in results:
    print(m.slug)

# Pretty-print all models
print(MODEL_REGISTRY.as_table())
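
To make the helper semantics concrete, here is a toy registry with the same method names. It is an illustration only, not the package's implementation: the `ModelEntry` fields mirror the attributes used in the snippets above (`slug`, `name`, `config`, `checkpoint`), the substring-based search is an assumption, and the karaoke entry's file names are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelEntry:
    slug: str
    name: str
    category: str
    config: str
    checkpoint: str


class Registry:
    """Toy in-memory registry keyed by slug (hypothetical, for illustration)."""

    def __init__(self, entries):
        self._entries = {e.slug: e for e in entries}

    def get(self, slug):
        return self._entries[slug]

    def categories(self):
        return sorted({e.category for e in self._entries.values()})

    def list(self, category=None):
        return [e for e in self._entries.values()
                if category is None or e.category == category]

    def search(self, query):
        # Case-insensitive substring match on slug and display name.
        q = query.lower()
        return [e for e in self._entries.values()
                if q in e.slug.lower() or q in e.name.lower()]


demo = Registry([
    ModelEntry("melband-roformer-kim-vocals", "MelBand Roformer Kim", "vocals",
               "config_vocals_mel_band_roformer.yaml", "MelBandRoformer.ckpt"),
    # Placeholder file names: the real ones are not given in this document.
    ModelEntry("roformer-model-mel-roformer-karaoke-aufr33-viperx",
               "Karaoke by aufr33/viperx", "karaoke", "config.yaml", "model.ckpt"),
])
```

Keying entries by slug keeps `get` a constant-time lookup, while `list` and `search` are simple linear scans, which is adequate for a registry of a few dozen models.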

Development Installation

# Clone repository
git clone https://github.com/openmirlab/melband-roformer-infer.git
cd melband-roformer-infer

# Install with UV
uv sync

# Or install with pip (editable, with dev extras)
pip install -e ".[dev]"

Acknowledgments

This project builds upon the excellent work of several open-source projects:

  • Mel-Band-Roformer-Vocal-Model by Kimberley Jensen - Original model and training
  • BS-RoFormer by Phil Wang (lucidrains) - PyTorch implementation of the RoFormer architecture
  • python-audio-separator by Andrew Beveridge (nomadkaraoke) - Pre-trained checkpoints and model configurations
  • Original Research - Wei-Tsung Lu, Ju-Chiang Wang, Qiuqiang Kong, and Yun-Ning Hung for the Band-Split RoPE Transformer paper

License

MIT License - see LICENSE for details.

This project includes code and configurations adapted from:

  • BS-RoFormer (MIT) - Phil Wang
  • python-audio-separator (MIT) - Andrew Beveridge
  • Mel-Band-Roformer-Vocal-Model - Kimberley Jensen

Citation

If you use MelBand-RoFormer-Infer in your research, please cite the original paper:

@inproceedings{Lu2023MusicSS,
    title   = {Music Source Separation with Band-Split RoPE Transformer},
    author  = {Wei-Tsung Lu and Ju-Chiang Wang and Qiuqiang Kong and Yun-Ning Hung},
    year    = {2023},
    url     = {https://api.semanticscholar.org/CorpusID:261556702}
}
