MLX-native audio stem separation for Apple Silicon

Maintainers

ssmall256

Project description

mlx-audio-separator

MLX-native stem separation for Apple Silicon Macs.

This project ports the inference paths from audio-separator (upstream repo: nomadkaraoke/python-audio-separator) to MLX so separation runs on Apple Silicon without requiring PyTorch or ONNX Runtime at inference time. Core runtime components are powered by mlx-audio-io (audio I/O) and mlx-spectro (spectral transforms).

Requirements

macOS 13+ (Ventura or later)
Apple Silicon (M1/M2/M3/M4)
Python 3.10+

Installation

pip install mlx-audio-separator

If you need first-run conversion from upstream checkpoints (.ckpt/.onnx/Demucs weights), install conversion extras:

pip install "mlx-audio-separator[convert]"

Quick Start

CLI

# Separate with default model
mlx-audio-separator song.mp3

# Use a specific model
mlx-audio-separator song.mp3 -m htdemucs_ft.yaml

# List supported models
mlx-audio-separator --list_models

Python

from mlx_audio_separator import Separator

sep = Separator()
sep.load_model()
outputs = sep.separate("song.mp3")
print(outputs)
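For more than one input file, the calls above can be wrapped in a small loop. The sketch below relies only on what the snippet above shows: `separate()` takes one input path and returns a value. The helper name `separate_many` and its per-file error handling are illustrative and not part of the package.

```python
from pathlib import Path
from typing import Any, Callable, Dict, Iterable


def separate_many(
    separate: Callable[[str], Any],
    inputs: Iterable[str],
) -> Dict[str, Any]:
    """Run `separate` on each path and collect results keyed by input path.

    `separate` is expected to be a bound method such as `Separator().separate`.
    A failure is stored as the raised exception so that one bad file does not
    abort the whole batch.
    """
    results: Dict[str, Any] = {}
    for path in inputs:
        key = str(Path(path))
        try:
            results[key] = separate(key)
        except Exception as exc:  # illustrative choice: continue on per-file errors
            results[key] = exc
    return results
```

With the quick-start object, this would be called as `separate_many(sep.separate, ["a.mp3", "b.mp3"])` after `sep.load_model()`.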

Supported Architectures

Roformer (BS-Roformer and MelBand-Roformer families)
MDXC (including MDX23C-style checkpoints)
MDX
VR
Demucs

Validation Snapshot

Release validation snapshot (2026-02-24 to 2026-02-26):

Check	Result
Full-catalog benchmark gate	163/163 models `ok` (0 failures)
Unit tests	167 passed, 1 skipped
MLX vs `audio-separator` parity smoke	4/4 models passed (`rel L2 <= 5e-2`)

Scope: Apple Silicon (M4 mini), MUSDB18-HQ test subset, release gate + parity smoke model set.
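The parity threshold `rel L2 <= 5e-2` can be read as a relative L2 error between two renderings of the same stem. A minimal sketch, assuming the common definition ||candidate - reference||_2 / ||reference||_2 with the upstream output as the reference. The project's exact metric, and any alignment or normalization it applies first, may differ.

```python
import math
from typing import Sequence


def rel_l2(candidate: Sequence[float], reference: Sequence[float]) -> float:
    """Relative L2 error of `candidate` against `reference` (assumed definition)."""
    if len(candidate) != len(reference):
        raise ValueError("signals must have the same number of samples")
    diff_energy = sum((c - r) ** 2 for c, r in zip(candidate, reference))
    ref_energy = sum(r ** 2 for r in reference)
    if ref_energy == 0.0:
        raise ValueError("reference signal is all zeros")
    return math.sqrt(diff_energy / ref_energy)


# Toy samples, not real audio: two samples differ by 0.01.
reference = [0.0, 0.5, -0.5, 0.25]
candidate = [0.0, 0.51, -0.49, 0.25]
print(rel_l2(candidate, reference) <= 5e-2)
```

Real stems would be flattened sample arrays of equal length rather than short lists.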

Detailed evidence and provenance: docs/release-validation.md.

Performance Snapshot

MLX vs audio-separator (ABBA, 12-song MUSDB18-HQ test subset, M4 mini):

Model	MLX speedup vs audio-separator (PAS)
`htdemucs_ft.yaml`	1.40x
`model_bs_roformer_ep_317_sdr_12.9755.ckpt`	2.16x
`mel_band_roformer_instrumental_instv7n_gabox.ckpt`	2.50x
`UVR-MDX-NET-Inst_HQ_3.onnx`	1.53x

Median speedup across the 4-model overlap set: 1.847x.

These numbers are scoped to the benchmark settings above and are not universal guarantees for all machines, models, or audio inputs.
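The median can be sanity-checked from the table. With four models, the median is the mean of the two middle values. The rounded table entries give about 1.845x, slightly below the reported 1.847x. The gap is presumably because the reported figure was computed from unrounded measurements; that is an assumption, not something the table states.

```python
from statistics import median

# Speedups copied from the table above (rounded to two decimals).
speedups = {
    "htdemucs_ft.yaml": 1.40,
    "model_bs_roformer_ep_317_sdr_12.9755.ckpt": 2.16,
    "mel_band_roformer_instrumental_instv7n_gabox.ckpt": 2.50,
    "UVR-MDX-NET-Inst_HQ_3.onnx": 1.53,
}

# With an even number of values, median() averages the two middle ones.
median_speedup = median(speedups.values())
print(f"{median_speedup:.3f}x")
```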

Stable Runtime Tuning

Release-facing stable controls:

--speed_mode {default,latency_safe,latency_safe_v2,latency_safe_v3}
--cache_clear_policy {aggressive,deferred}
--write_workers <int>

Example:

mlx-audio-separator song.mp3 \
  --speed_mode latency_safe \
  --cache_clear_policy deferred \
  --write_workers 2
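When the CLI is driven from a script, the same flags can be assembled programmatically. The sketch below validates values against the choices listed above before building the argument list. The function name `build_command` is hypothetical, and leaving out a flag when its value is `None` is a design choice here, not documented CLI behavior.

```python
import shlex
from typing import List, Optional

# Choices copied from the stable controls listed above.
SPEED_MODES = ("default", "latency_safe", "latency_safe_v2", "latency_safe_v3")
CACHE_CLEAR_POLICIES = ("aggressive", "deferred")


def build_command(
    audio_path: str,
    speed_mode: Optional[str] = None,
    cache_clear_policy: Optional[str] = None,
    write_workers: Optional[int] = None,
) -> List[str]:
    """Build an argv list for mlx-audio-separator; flags left as None are omitted."""
    cmd = ["mlx-audio-separator", audio_path]
    if speed_mode is not None:
        if speed_mode not in SPEED_MODES:
            raise ValueError(f"unknown speed mode: {speed_mode!r}")
        cmd += ["--speed_mode", speed_mode]
    if cache_clear_policy is not None:
        if cache_clear_policy not in CACHE_CLEAR_POLICIES:
            raise ValueError(f"unknown cache clear policy: {cache_clear_policy!r}")
        cmd += ["--cache_clear_policy", cache_clear_policy]
    if write_workers is not None:
        cmd += ["--write_workers", str(int(write_workers))]
    return cmd


cmd = build_command("song.mp3", "latency_safe", "deferred", 2)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where the package is installed.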

Basic benchmark command:

mlx-audio-separator \
  --benchmark song.mp3 \
  --benchmark_warmup 1 \
  --benchmark_repeats 3 \
  --benchmark_profile
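The warmup/repeat pattern behind these flags can be illustrated with a generic timing harness. This is a sketch of the general technique, not the CLI's implementation: the function name `benchmark` and the reported statistics are assumptions, and what `--benchmark_profile` actually records is not shown here.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(fn: Callable[[], object], warmup: int = 1, repeats: int = 3) -> Dict[str, float]:
    """Time `fn` after discarding `warmup` untimed runs."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    # Warmup runs absorb one-time costs (model load, caches) and are not timed.
    for _ in range(warmup):
        fn()
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return {
        "min_s": min(timings),
        "median_s": statistics.median(timings),
        "max_s": max(timings),
    }
```

For a separation run, `fn` could be `lambda: sep.separate("song.mp3")` using the `Separator` object from the Quick Start.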

BS-Roformer-SW Performance (Opt-In)

For BS-Roformer-SW.ckpt, use the opt-in no-drift FLAC profile:

mlx-audio-separator song.mp3 \
  -m BS-Roformer-SW.ckpt \
  --output_format FLAC \
  --speed_mode latency_safe_v3

latency_safe_v3 keeps model inference behavior conservative and focuses on safe end-to-end latency wins (deferred cache clearing + async stem writes).

To avoid repeated checkpoint conversion overhead, pre-convert once to *.safetensors and exit:

mlx-audio-separator \
  -m BS-Roformer-SW.ckpt \
  --save_converted_safetensors \
  --preconvert_only

safetensors primarily improves model load/startup time. It is not expected to materially change per-file inference latency.

Validation command (latency + deterministic equivalence):

uv run --with torch python scripts/perf/compare_latency.py \
  --corpus-file /tmp/corpus_one.txt \
  --baseline-config scripts/perf/configs/bs_roformer_sw_default_baseline.json \
  --candidate-config scripts/perf/configs/bs_roformer_sw_latency_safe_v3_candidate.json \
  --model-file-dir /tmp/audio-separator-models \
  --allow-speed-mode-mismatch \
  --target-improvement-demucs-mdxc 10.0 \
  --equivalence-check \
  --equivalence-threshold-rel-l2 1e-6 \
  --output-json /tmp/bs_roformer_sw_latency_safe_v3_compare.json \
  --output-markdown /tmp/bs_roformer_sw_latency_safe_v3_compare.md

BS-Roformer-SW Optimization Program (Opt-In Tracks)

As of March 4, 2026, latency_safe_v3 remains the only promoted runtime win for BS-Roformer-SW.ckpt; all experimental tracks below are parked pending new evidence.

Candidate configs for staged exploration live under scripts/perf/configs/:

bs_roformer_sw_cand_grouped_bandmask.json
bs_roformer_sw_cand_fused_ola.json
bs_roformer_sw_cand_stream_pipeline.json
bs_roformer_sw_cand_compile_fullgraph.json
bs_roformer_sw_cand_flac_fastwrite.json

Corpus manifest templates:

Quick gate (3 files): scripts/perf/corpora/bs_roformer_sw_quick.txt
Full gate (12 files): scripts/perf/corpora/bs_roformer_sw_full.txt

Quick-gate example:

uv run --with torch python scripts/perf/compare_latency.py \
  --corpus-file scripts/perf/corpora/bs_roformer_sw_quick.txt \
  --baseline-config scripts/perf/configs/bs_roformer_sw_latency_safe_v3_baseline.json \
  --candidate-config scripts/perf/configs/bs_roformer_sw_cand_grouped_bandmask.json \
  --model-file-dir /tmp/audio-separator-models \
  --target-improvement-demucs-mdxc 3.0 \
  --equivalence-check \
  --equivalence-threshold-rel-l2 1e-6 \
  --equivalence-max-files 1 \
  --output-json /tmp/bs_roformer_sw_quick_gate.json \
  --output-markdown /tmp/bs_roformer_sw_quick_gate.md

Full-gate example:

uv run --with torch python scripts/perf/compare_latency.py \
  --corpus-file scripts/perf/corpora/bs_roformer_sw_full.txt \
  --baseline-config scripts/perf/configs/bs_roformer_sw_latency_safe_v3_baseline.json \
  --candidate-config scripts/perf/configs/bs_roformer_sw_cand_grouped_bandmask.json \
  --model-file-dir /tmp/audio-separator-models \
  --target-improvement-demucs-mdxc 5.0 \
  --equivalence-check \
  --equivalence-threshold-rel-l2 1e-6 \
  --equivalence-max-files 0 \
  --output-json /tmp/bs_roformer_sw_full_gate.json \
  --output-markdown /tmp/bs_roformer_sw_full_gate.md
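The gate commands combine a latency target with an equivalence threshold. The sketch below shows one plausible reading of that decision: a candidate passes only if it is faster than the baseline by at least the target and its outputs stay within the relative L2 threshold. Treating the `--target-improvement-demucs-mdxc` value as a percentage, and every name in the code, are assumptions; `scripts/perf/compare_latency.py` is the authority on the real logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateResult:
    improvement_pct: float
    latency_ok: bool
    equivalence_ok: bool

    @property
    def passed(self) -> bool:
        # Both conditions must hold for the candidate to pass.
        return self.latency_ok and self.equivalence_ok


def evaluate_gate(
    baseline_s: float,
    candidate_s: float,
    max_rel_l2: float,
    target_improvement_pct: float,
    rel_l2_threshold: float = 1e-6,
) -> GateResult:
    """Compare candidate latency and output drift against a baseline (hypothetical)."""
    if baseline_s <= 0:
        raise ValueError("baseline latency must be positive")
    # Positive values mean the candidate is faster than the baseline.
    improvement_pct = 100.0 * (baseline_s - candidate_s) / baseline_s
    return GateResult(
        improvement_pct=improvement_pct,
        latency_ok=improvement_pct >= target_improvement_pct,
        equivalence_ok=max_rel_l2 <= rel_l2_threshold,
    )
```

Under this reading, a candidate that is 6% faster with zero drift would pass a 5.0 target, while one that is 6% faster with a relative L2 of 1e-3 would not.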

Documentation

Document	Description
`docs/release-validation.md`	Release evidence snapshot
`docs/release-first.md`	Release execution playbook
`docs/reproducibility.md`	Reproducibility guide
`docs/wave4-opt-in.md`	Wave 4 opt-in/experimental roadmap
`docs/bs-roformer-sw-optimization-program.md`	BS-Roformer-SW candidate gating and promotion table
`CHANGELOG.md`	Changelog
`THIRD_PARTY_NOTICES.md`	Third-party attribution and license notices

License

This project is MIT licensed.

Acknowledgments

mlx-audio-separator is derived from audio-separator (upstream repo: nomadkaraoke/python-audio-separator, MIT) by beveradb and the nomadkaraoke community. Substantial portions of the architecture, model loading, and separation logic are adapted from that project. If you find this package useful, please also star and support the upstream project.

The models used by this project were trained by the Ultimate Vocal Remover community, primarily @Anjok07 and @aufr33. See THIRD_PARTY_NOTICES.md for full attribution and license details.

Release history

0.1.4 (Mar 6, 2026)
0.1.3 (Mar 6, 2026), this version
0.1.2 (Feb 27, 2026)
0.1.2rc1 (Feb 27, 2026), pre-release, yanked. Reason: mislabeled version string
0.1.1 (Feb 27, 2026)

Download files

Source Distribution

mlx_audio_separator-0.1.3.tar.gz (315.0 kB)

Uploaded Mar 6, 2026 Source

Built Distribution

mlx_audio_separator-0.1.3-py3-none-any.whl (336.2 kB)

Uploaded Mar 6, 2026 Python 3

File details

Details for the file mlx_audio_separator-0.1.3.tar.gz.

File metadata

Download URL: mlx_audio_separator-0.1.3.tar.gz
Upload date: Mar 6, 2026
Size: 315.0 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for mlx_audio_separator-0.1.3.tar.gz
Algorithm	Hash digest
SHA256	`ddd84193a610583750581bf2a6b54348a58a60b7a198bce7d9721ef7f7398813`
MD5	`afacf17612b9241f4b05b09f28caffba`
BLAKE2b-256	`9d42892cca791ce7ce9b502bf59aa2a86bb9a0f68366c038ce9844f86d7e6eec`
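A downloaded archive can be checked against the SHA256 digest above before installing. A minimal sketch using only the standard library; the helper names `sha256_of` and `verify` and the local file path are illustrative.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for mlx_audio_separator-0.1.3.tar.gz.
EXPECTED_SHA256 = "ddd84193a610583750581bf2a6b54348a58a60b7a198bce7d9721ef7f7398813"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: Path, expected_hex: str) -> bool:
    """True if the file's digest matches `expected_hex` (case-insensitive)."""
    return sha256_of(path) == expected_hex.lower()
```

For example, `verify(Path("mlx_audio_separator-0.1.3.tar.gz"), EXPECTED_SHA256)` should return `True` for an unmodified download. pip can enforce the same check through `--require-hashes` with hashes pinned in a requirements file.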

Provenance

The following attestation bundles were made for mlx_audio_separator-0.1.3.tar.gz:

Publisher: release-pypi.yml on ssmall256/mlx-audio-separator

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: mlx_audio_separator-0.1.3.tar.gz
- Subject digest: ddd84193a610583750581bf2a6b54348a58a60b7a198bce7d9721ef7f7398813
- Sigstore transparency entry: 1048804828
- Sigstore integration time: Mar 6, 2026
Source repository:
- Permalink: ssmall256/mlx-audio-separator@ddbf55f57f8ae4e3a0d89d9113d6a3fe17f02e1f
- Branch / Tag: refs/heads/main
- Owner: https://github.com/ssmall256
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release-pypi.yml@ddbf55f57f8ae4e3a0d89d9113d6a3fe17f02e1f
- Trigger Event: workflow_dispatch

File details

Details for the file mlx_audio_separator-0.1.3-py3-none-any.whl.

File metadata

Download URL: mlx_audio_separator-0.1.3-py3-none-any.whl
Upload date: Mar 6, 2026
Size: 336.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for mlx_audio_separator-0.1.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`9ccbb5660359c8c797f3e2e0fbcedff5366aed30b2120c5914b705f19ed547a6`
MD5	`64d406333f7204d8d8a88601bcffb71e`
BLAKE2b-256	`ac39953ee217964e4de210d6415044c8c395ac4bd80ebe004a7434a7bbaae33f`

Provenance

The following attestation bundles were made for mlx_audio_separator-0.1.3-py3-none-any.whl:

Publisher: release-pypi.yml on ssmall256/mlx-audio-separator

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: mlx_audio_separator-0.1.3-py3-none-any.whl
- Subject digest: 9ccbb5660359c8c797f3e2e0fbcedff5366aed30b2120c5914b705f19ed547a6
- Sigstore transparency entry: 1048804831
- Sigstore integration time: Mar 6, 2026
Source repository:
- Permalink: ssmall256/mlx-audio-separator@ddbf55f57f8ae4e3a0d89d9113d6a3fe17f02e1f
- Branch / Tag: refs/heads/main
- Owner: https://github.com/ssmall256
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release-pypi.yml@ddbf55f57f8ae4e3a0d89d9113d6a3fe17f02e1f
- Trigger Event: workflow_dispatch
