Inference-only Mel-Band RoFormer vocal separation toolkit

MelBand-RoFormer-Infer

Production-ready, inference-only toolkit for Mel-Band RoFormer audio source separation

MelBand-RoFormer-Infer provides a clean, lightweight API for running music source separation inference using Mel-Band RoFormer models with automatic checkpoint management.

Requires Python 3.10+ and PyTorch. License: MIT. Available on PyPI.


Features

  • Inference Only: Lightweight package focused on production inference
  • Auto-Download: Automatic checkpoint downloads with integrity verification
  • 70+ Pre-trained Models: Vocals, instrumentals, karaoke, denoise, dereverb, and more
  • CLI Tools: melband-roformer-infer and melband-roformer-download commands
  • Python API: Clean programmatic interface
  • Model Registry: Easy model discovery with search and category filtering
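The integrity verification mentioned above is not described in detail here. As a rough illustration of what such a check usually involves, the sketch below hashes a downloaded file in chunks and compares the digest with an expected value. The function names `sha256_of_file` and `verify_checkpoint` are hypothetical and are not part of this package's API; the hash algorithm the package actually uses is also an assumption.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify_checkpoint(path, expected_sha256):
    """Hypothetical helper: True if the file's digest matches the expected one."""
    return sha256_of_file(Path(path)) == expected_sha256.strip().lower()
```

Reading in chunks keeps memory use flat even for checkpoints that are hundreds of megabytes in size.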

Quick Start

Installation

# Using pip
pip install melband-roformer-infer

# Using UV (recommended)
uv pip install melband-roformer-infer

Download Models

# List available models
melband-roformer-download --list-models

# Download the recommended model (MelBand Roformer Kim)
melband-roformer-download --model melband-roformer-kim-vocals

# Download by category
melband-roformer-download --category karaoke --output-dir ./models

# Download all models
melband-roformer-download --all --output-dir ./models

CLI Inference

# Using the recommended MelBand Roformer Kim model
melband-roformer-infer \
  --config_path models/melband-roformer-kim-vocals/config_vocals_mel_band_roformer.yaml \
  --model_path models/melband-roformer-kim-vocals/MelBandRoformer.ckpt \
  --input_folder ./songs \
  --store_dir ./outputs

For each WAV file in input_folder, the command writes two stems to store_dir: *_vocals.wav and *_instrumental.wav.
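When scripting around the CLI, it can help to know which songs still lack output stems. The sketch below predicts the output paths from the naming pattern described above. It assumes the stems are written directly under the output directory as `<input name>_vocals.wav` and `<input name>_instrumental.wav`; the actual layout may differ, and both helper names are hypothetical.

```python
from pathlib import Path


def expected_stems(input_folder, store_dir):
    """Map each input WAV to the (vocals, instrumental) paths we expect.

    Assumption: stems sit directly in store_dir and reuse the input's base name.
    """
    out = Path(store_dir)
    mapping = {}
    for wav in sorted(Path(input_folder).glob("*.wav")):
        mapping[wav] = (
            out / f"{wav.stem}_vocals.wav",
            out / f"{wav.stem}_instrumental.wav",
        )
    return mapping


def unprocessed(input_folder, store_dir):
    """Return the input files for which at least one expected stem is missing."""
    return [
        wav
        for wav, stems in expected_stems(input_folder, store_dir).items()
        if not all(stem.exists() for stem in stems)
    ]
```

Calling `unprocessed("./songs", "./outputs")` after a run would then list any inputs worth retrying, provided the naming assumption holds.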

Python API

from pathlib import Path
from ml_collections import ConfigDict
import torch
import yaml
from mel_band_roformer import MODEL_REGISTRY, DEFAULT_MODEL, get_model_from_config

# Use the default recommended model (MelBand Roformer Kim)
entry = MODEL_REGISTRY.get(DEFAULT_MODEL)

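# Load the YAML config, closing the file handle when done
model_dir = Path("models") / entry.slug
with open(model_dir / entry.config) as f:
    config = ConfigDict(yaml.safe_load(f))

# Build the model and load the checkpoint weights on the CPU
model = get_model_from_config("mel_band_roformer", config)
model.load_state_dict(torch.load(model_dir / entry.checkpoint, map_location="cpu"))
model.eval()  # switch to inference mode (disables dropout and similar layers)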

Recommended Model

MelBand Roformer Kim (melband-roformer-kim-vocals) by Kimberley Jensen is the recommended default model for vocal separation. It provides excellent quality and is the foundation for many fine-tuned variants.

from mel_band_roformer import DEFAULT_MODEL
print(DEFAULT_MODEL)  # "melband-roformer-kim-vocals"

Available Models

Model                                                  Category      Description
melband-roformer-kim-vocals                            vocals        Recommended: original MelBand Roformer by Kimberley Jensen
melband-roformer-big-beta6                             vocals        Big Beta 6 by unwa
roformer-model-melband-roformer-vocals-by-gabox        vocals        Vocals by Gabox
roformer-model-melband-roformer-instrumental-by-gabox  instrumental  Instrumental by Gabox
roformer-model-mel-roformer-karaoke-aufr33-viperx      karaoke       Karaoke by aufr33/viperx
roformer-model-mel-roformer-denoise-aufr33             denoise       Denoise by aufr33
roformer-model-melband-roformer-de-reverb-by-anvuew    dereverb      De-Reverb by anvuew
...                                                    ...           Run melband-roformer-download --list-models for all 70+ models

Categories: vocals, instrumental, karaoke, denoise, dereverb, crowd, general, aspiration


Registry Helpers

from mel_band_roformer import MODEL_REGISTRY

# List all categories
print(MODEL_REGISTRY.categories())

# List models by category
for model in MODEL_REGISTRY.list("vocals"):
    print(model.name, model.checkpoint)

# Search models
results = MODEL_REGISTRY.search("karaoke")
for m in results:
    print(m.slug)

# Pretty-print all models
print(MODEL_REGISTRY.as_table())
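To make the shape of these helpers concrete, here is a toy registry that mimics the calls shown above. It is a sketch for illustration only, not the package's implementation: `ModelEntry` and `ToyRegistry` are hypothetical names, the field names are inferred from the examples in this README, and the karaoke entry's file names are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelEntry:
    slug: str
    name: str
    category: str
    checkpoint: str
    config: str


class ToyRegistry:
    """Minimal in-memory registry keyed by slug (illustrative only)."""

    def __init__(self, entries):
        self._entries = {entry.slug: entry for entry in entries}

    def get(self, slug):
        return self._entries[slug]

    def categories(self):
        # Unique category names in alphabetical order
        return sorted({entry.category for entry in self._entries.values()})

    def list(self, category=None):
        # All entries, or only those in the requested category
        return [
            entry
            for entry in self._entries.values()
            if category is None or entry.category == category
        ]

    def search(self, query):
        # Case-insensitive substring match on slug and display name
        needle = query.lower()
        return [
            entry
            for entry in self._entries.values()
            if needle in entry.slug.lower() or needle in entry.name.lower()
        ]


registry = ToyRegistry([
    ModelEntry(
        slug="melband-roformer-kim-vocals",
        name="MelBand Roformer Kim",
        category="vocals",
        checkpoint="MelBandRoformer.ckpt",
        config="config_vocals_mel_band_roformer.yaml",
    ),
    ModelEntry(
        slug="roformer-model-mel-roformer-karaoke-aufr33-viperx",
        name="Karaoke by aufr33/viperx",
        category="karaoke",
        checkpoint="placeholder.ckpt",  # placeholder, not the real file name
        config="placeholder.yaml",      # placeholder, not the real file name
    ),
])

print(registry.categories())
print([entry.slug for entry in registry.search("karaoke")])
```

Keying the entries by slug keeps `get` a constant-time lookup, while `list` and `search` simply scan the entries, which is adequate for a registry of a few dozen models.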

Development Installation

# Clone repository
git clone https://github.com/openmirlab/melband-roformer-infer.git
cd melband-roformer-infer

# Install with UV
uv sync

# Install with pip
pip install -e ".[dev]"

Acknowledgments

This project builds upon the excellent work of several open-source projects:

  • Mel-Band-Roformer-Vocal-Model by Kimberley Jensen - Original model and training
  • BS-RoFormer by Phil Wang (lucidrains) - PyTorch implementation of the RoFormer architecture
  • python-audio-separator by Andrew Beveridge (nomadkaraoke) - Pre-trained checkpoints and model configurations
  • Original Research - Wei-Tsung Lu, Ju-Chiang Wang, Qiuqiang Kong, and Yun-Ning Hung for the Band-Split RoPE Transformer paper

License

MIT License - see LICENSE for details.

This project includes code and configurations adapted from:

  • BS-RoFormer (MIT) - Phil Wang
  • python-audio-separator (MIT) - Andrew Beveridge
  • Mel-Band-Roformer-Vocal-Model - Kimberley Jensen

Citation

If you use MelBand-RoFormer-Infer in your research, please cite the original paper:

@inproceedings{Lu2023MusicSS,
    title   = {Music Source Separation with Band-Split RoPE Transformer},
    author  = {Wei-Tsung Lu and Ju-Chiang Wang and Qiuqiang Kong and Yun-Ning Hung},
    year    = {2023},
    url     = {https://api.semanticscholar.org/CorpusID:261556702}
}

Support

For issues and questions, use the project repository: https://github.com/openmirlab/melband-roformer-infer
