High-performance STFT/iSTFT for Apple MLX with fused Metal kernels
mlx-spectro
High-performance STFT/iSTFT for Apple MLX — 2–3x faster STFT and 5–8x faster iSTFT than torch.stft/torch.istft on MPS, via fused Metal kernels.
from mlx_spectro import SpectralTransform
transform = SpectralTransform(n_fft=2048, hop_length=512, window_fn="hann")
spec = transform.stft(audio) # [B, T] → complex spectrogram
reconstructed = transform.istft(spec, length=T) # complex spectrogram → [B, T]
mlx-audio-separator uses mlx-spectro for MLX-native stem separation (Roformer, MDX, Demucs) and runs 1.8–3.1x faster end-to-end than python-audio-separator on torch+MPS. See benchmarks below.
Install
pip install mlx-spectro
With optional torch fallback support:
pip install mlx-spectro[torch]
Features
- Fused overlap-add with autotuned Metal kernels
- PyTorch-compatible STFT/iSTFT semantics
- Cached transforms for zero-overhead repeated calls
- Differentiable transforms for training with mx.grad
- mx.compile-friendly for tight inference loops
- Optional torch fallback for strict numerical parity
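To make the "fused overlap-add" item concrete, here is a pure-Python reference sketch of the overlap-add step that an iSTFT performs: windowed frames are summed back into a signal and divided by the accumulated window-squared envelope. This is not the library's Metal kernel, and the function names are hypothetical. The FFT is skipped on purpose so that only the framing and overlap-add logic remains.

```python
import math

def hann_periodic(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]

def frame_signal(x, win, hop):
    """Split x into overlapping frames and apply the analysis window."""
    n = len(win)
    n_frames = 1 + (len(x) - n) // hop
    return [[x[f * hop + i] * win[i] for i in range(n)] for f in range(n_frames)]

def overlap_add(frames, win, hop, eps=1e-8):
    """Sum synthesis-windowed frames, then divide by the window^2 envelope."""
    n = len(win)
    length = (len(frames) - 1) * hop + n
    out = [0.0] * length
    env = [0.0] * length
    for f, frame in enumerate(frames):
        start = f * hop
        for i in range(n):
            out[start + i] += frame[i] * win[i]  # apply synthesis window
            env[start + i] += win[i] * win[i]    # accumulate window^2
    # Normalization is a separate pass here; a fused kernel could presumably
    # do the accumulation and the division together.
    return [o / e if e > eps else 0.0 for o, e in zip(out, env)]

win = hann_periodic(8)
x = [math.sin(0.3 * t) for t in range(32)]
y = overlap_add(frame_signal(x, win, hop=2), win, hop=2)
max_err = max(abs(a - b) for a, b in zip(x, y))  # roundtrip max abs error
```

With a 75% overlap every sample except the very first is covered by a non-zero window value, so the reconstruction matches the input to floating-point precision.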
Quick Start
import mlx.core as mx
from mlx_spectro import SpectralTransform
transform = SpectralTransform(
    n_fft=2048,
    hop_length=512,
    window_fn="hann",
)
audio = mx.random.normal((1, 44100))
spec = transform.stft(audio, output_layout="bnf")
reconstructed = transform.istft(spec, length=44100, input_layout="bnf")
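As a sanity check on the spectrogram size, the frame and bin counts can be computed by hand. The sketch below assumes mlx-spectro follows the torch.stft shape conventions for real input (one-sided spectrum, n_fft // 2 samples of padding on each side when center=True), as the PyTorch-compatibility claim suggests. The helper name stft_shape is hypothetical, and the order of the two axes in the returned array depends on the chosen output_layout.

```python
def stft_shape(length, n_fft, hop_length, center=True):
    """Return (n_frames, n_bins) under torch.stft conventions for real input."""
    if center:
        # center=True pads n_fft // 2 samples on each side of the signal
        padded = length + 2 * (n_fft // 2)
    else:
        padded = length
    n_frames = 1 + (padded - n_fft) // hop_length
    n_bins = n_fft // 2 + 1  # one-sided spectrum
    return n_frames, n_bins

# Quick Start settings: 44100 samples, n_fft=2048, hop_length=512
n_frames, n_bins = stft_shape(44100, n_fft=2048, hop_length=512)
print(n_frames, n_bins)  # 87 1025
```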
API
SpectralTransform
Main class for STFT/iSTFT operations.
SpectralTransform(
    n_fft: int,
    hop_length: int,
    win_length: int | None = None,
    window_fn: str = "hann",  # "hann", "hamming", "rect"
    window: mx.array | None = None,  # custom window array
    periodic: bool = True,
    center: bool = True,
    normalized: bool = False,
    istft_backend_policy: str | None = None,  # "auto", "mlx_fft", "metal", "torch_fallback"
)
Methods:
- stft(x, output_layout="bfn"): Forward STFT. Input: [T] or [B, T].
- istft(z, length=None, ...): Inverse STFT. Returns [B, T].
- compiled_pair(length, layout="bnf", warmup_batch=None): Return compiled (stft_fn, istft_fn) for steady-state loops (10–20% faster).
- warmup(batch=1, length=4096): Force kernel compilation.
get_transform_mlx(**kwargs)
Factory that returns cached SpectralTransform instances for repeated use.
make_window(window, window_fn, win_length, n_fft, periodic)
Create or validate a 1D analysis window.
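The periodic flag matters for reconstruction quality. The sketch below is a pure-Python illustration, not the library's make_window: it shows the difference between a periodic and a symmetric Hann window, and how a window shorter than n_fft can be zero-padded symmetrically, which is what torch.stft does. The helper names hann and pad_center are hypothetical.

```python
import math

def hann(win_length, periodic=True):
    """Hann window; periodic divides by N, symmetric divides by N - 1."""
    if win_length == 1:
        return [1.0]
    denom = win_length if periodic else win_length - 1
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / denom)
            for n in range(win_length)]

def pad_center(window, n_fft):
    """Zero-pad a shorter window symmetrically to n_fft samples."""
    left = (n_fft - len(window)) // 2
    right = n_fft - len(window) - left
    return [0.0] * left + list(window) + [0.0] * right

periodic = hann(8)                    # last sample is non-zero
symmetric = hann(8, periodic=False)   # first and last samples are both zero
padded = pad_center(hann(4), 8)       # 2 zeros, 4 window samples, 2 zeros
```

The periodic form is the usual choice for STFT work because shifted copies at a hop of win_length / 2 sum to exactly 1, which keeps overlap-add normalization well behaved.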
resolve_fft_params(n_fft, hop_length, win_length, pad)
Resolve effective FFT parameters with PyTorch-compatible defaults.
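For reference, the PyTorch defaults are win_length = n_fft and hop_length = n_fft // 4. The following is a minimal sketch of that rule, not the library's resolve_fft_params: the function name is hypothetical and the pad argument is left out because its semantics are not described here.

```python
def torch_style_fft_params(n_fft, hop_length=None, win_length=None):
    """Fill in missing STFT parameters the way torch.stft does."""
    if win_length is None:
        win_length = n_fft          # window spans the whole FFT frame
    if hop_length is None:
        hop_length = n_fft // 4     # 75% overlap by default
    if not 0 < win_length <= n_fft:
        raise ValueError("win_length must be in (0, n_fft]")
    if hop_length <= 0:
        raise ValueError("hop_length must be positive")
    return n_fft, hop_length, win_length

print(torch_style_fft_params(2048))  # (2048, 512, 2048)
```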
Benchmarks
Apple M4 Max, macOS 26.3, MLX 0.30.6, PyTorch 2.10.0, 20 iterations (5 warmup).
STFT Forward
| Config | mlx-spectro | torch MPS | mlx-stft | vs torch | vs mlx-stft |
|---|---|---|---|---|---|
| B=1 T=16k nfft=512 | 0.16 ms | 0.21 ms | 0.31 ms | 1.4x | 1.9x |
| B=4 T=160k nfft=1024 | 0.37 ms | 1.00 ms | 1.09 ms | 2.7x | 3.0x |
| B=8 T=160k nfft=1024 | 0.28 ms | 0.71 ms | 1.53 ms | 2.5x | 5.6x |
| B=4 T=1.3M nfft=1024 | 0.77 ms | 2.18 ms | 5.03 ms | 2.8x | 6.5x |
| B=8 T=480k nfft=1024 | 0.58 ms | 1.30 ms | 3.73 ms | 2.2x | 6.4x |
iSTFT Forward
| Config | mlx-spectro | torch MPS | mlx-stft | vs torch | vs mlx-stft |
|---|---|---|---|---|---|
| B=1 T=16k nfft=512 | 0.17 ms | 0.49 ms | 0.25 ms | 3.0x | 1.5x |
| B=4 T=160k nfft=1024 | 0.21 ms | 1.00 ms | 0.98 ms | 4.7x | 4.7x |
| B=8 T=160k nfft=1024 | 0.30 ms | 1.61 ms | 1.62 ms | 5.4x | 5.4x |
| B=4 T=1.3M nfft=1024 | 0.81 ms | 5.76 ms | 6.68 ms | 7.1x | 8.2x |
| B=8 T=480k nfft=1024 | 0.60 ms | 4.10 ms | 4.55 ms | 6.8x | 7.6x |
Roundtrip (STFT → iSTFT) Forward + Backward
| Config | mlx-spectro | torch MPS | vs torch |
|---|---|---|---|
| B=4 T=160k nfft=1024 | 0.62 ms | 2.25 ms | 3.6x |
| B=8 T=160k nfft=1024 | 1.04 ms | 4.38 ms | 4.2x |
| B=4 T=480k nfft=1024 | 1.59 ms | 6.59 ms | 4.1x |
| B=4 T=1.3M nfft=1024 | 4.33 ms | 17.63 ms | 4.1x |
| B=1 T=1.3M nfft=1024 | 1.21 ms | 4.20 ms | 3.5x |
Roundtrip Accuracy (STFT → iSTFT max abs error)
| Config | mlx-spectro | torch MPS |
|---|---|---|
| B=1 T=16k nfft=512 | 1.67e-06 | 2.38e-06 |
| B=4 T=160k nfft=2048 | 2.86e-06 | 5.25e-06 |
| B=8 T=480k nfft=1024 | 3.81e-06 | 4.77e-06 |
To reproduce: python scripts/benchmark.py
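If you want to time your own configurations, a harness along the following lines is one way to do it. This is a generic sketch, not the contents of scripts/benchmark.py. It assumes the warmup runs come in addition to the timed iterations, and the sync hook is a hypothetical parameter for forcing asynchronous or lazy backends to finish before the clock stops.

```python
import statistics
import time

def time_ms(fn, iters=20, warmup=5, sync=None):
    """Median wall-clock time of fn() in milliseconds."""
    for _ in range(warmup):          # untimed runs: compilation, cache fill
        out = fn()
        if sync is not None:
            sync(out)
    samples = []
    for _ in range(iters):
        t0 = time.perf_counter()
        out = fn()
        if sync is not None:
            sync(out)                # wait for the result before stopping the clock
        samples.append((time.perf_counter() - t0) * 1e3)
    return statistics.median(samples)

# Stand-in workload so the example runs anywhere
elapsed = time_ms(lambda: sum(range(10_000)))
```

With MLX, pass sync=mx.eval, because MLX evaluates lazily and an untimed graph would otherwise look instantaneous. With PyTorch on MPS, a sync such as lambda _: torch.mps.synchronize() serves the same purpose.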
Real-world: mlx-audio-separator
mlx-audio-separator is an MLX-native music stem separation library supporting Roformer, MDX, Demucs, and more. End-to-end separation speedup vs python-audio-separator (torch on MPS), measured on 30s stereo 44.1 kHz tracks. Apple M4 Max, PyTorch 2.10.0, MLX 0.30.6, ABBA ordering, 2 repeats.
| Model | Arch | torch+MPS (s) | MLX (s) | E2E speedup |
|---|---|---|---|---|
| UVR-MDX-NET-Inst_HQ_3 | MDX | 4.25 | 1.36 | 3.1x |
| htdemucs | Demucs | 3.35 | 1.29 | 2.6x |
| Mel-Roformer Karaoke | MDXC | 5.60 | 2.66 | 2.1x |
| BS-Roformer | MDXC | 6.48 | 3.56 | 1.8x |
STFT/iSTFT kernel speedups within these pipelines are even larger (2–3x STFT, 5–8x iSTFT vs torch).
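ABBA ordering interleaves the two backends so that slow drift, such as thermal throttling, affects both roughly equally. A minimal sketch of such a schedule follows, assuming each repeat is one full A, B, B, A pass; the function name and that reading of "2 repeats" are assumptions, not taken from the benchmark script.

```python
def abba_schedule(repeats, a="torch+MPS", b="MLX"):
    """Return the run order for `repeats` ABBA passes over two backends."""
    order = []
    for _ in range(repeats):
        order += [a, b, b, a]
    return order

schedule = abba_schedule(2)
# ['torch+MPS', 'MLX', 'MLX', 'torch+MPS', 'torch+MPS', 'MLX', 'MLX', 'torch+MPS']
```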
Compiled Mode
For tight inference loops with fixed input shapes, compiled_pair eliminates
per-call Python dispatch overhead (10–20% faster for small workloads):
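import mlx.core as mx
from mlx_spectro import SpectralTransform

t = SpectralTransform(n_fft=1024, hop_length=256, window_fn="hann")
# Build the compiled pair once for a fixed chunk length.
stft, istft = t.compiled_pair(length=44100, warmup_batch=2)

# audio_stream and process are placeholders for your own chunk source and model.
for chunk in audio_stream:
    z = stft(chunk)
    z = process(z)
    y = istft(z)
    mx.eval(y)  # force evaluation of the lazy MLX graph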
Use the eager t.stft() / t.istft() methods when input shapes vary.
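When the only varying shape is a short final chunk, one option is to zero-pad it to the compiled length and trim the output afterwards. The sketch below shows the chunking side in plain Python; fixed_chunks is a hypothetical helper, not part of mlx-spectro, and in practice you would convert each chunk to an mx.array before calling the compiled functions.

```python
def fixed_chunks(samples, length):
    """Yield (chunk, valid) pairs; every chunk has exactly `length` samples."""
    for start in range(0, len(samples), length):
        chunk = list(samples[start:start + length])
        valid = len(chunk)                    # samples that are real audio
        chunk += [0.0] * (length - valid)     # zero-pad the final chunk
        yield chunk, valid

chunks = list(fixed_chunks([0.1] * 10, 4))
# Three chunks of 4 samples each; the last has 2 valid samples and 2 zeros.
```

Keeping the valid count lets you slice the reconstructed audio back to its true length after the inverse transform.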
Environment Variables
| Variable | Default | Description |
|---|---|---|
| SPEC_MLX_AUTOTUNE | 1 | Enable Metal kernel autotuning |
| SPEC_MLX_TGX | (unset) | Force threadgroup size (e.g. 256 or kernel:256) |
| SPEC_MLX_AUTOTUNE_PERSIST | 1 | Persist autotune results to disk |
| SPEC_MLX_AUTOTUNE_CACHE_PATH | (unset) | Override autotune cache file path |
| MLX_OLA_FUSE_NORM | 1 | Enable fused OLA+normalization kernel |
| SPEC_MLX_CACHE_STATS | 0 | Enable cache debug counters |
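These variables can be set in the shell or from Python. The sketch below sets them from Python before the library is imported. It assumes that "0" disables a flag whose default is 1 and that the values are read at import time or later; neither detail is confirmed here, so setting them before the import is simply the conservative choice.

```python
import os

# Assumption: "0" turns autotuning off and "1" turns the debug counters on.
os.environ["SPEC_MLX_AUTOTUNE"] = "0"
os.environ["SPEC_MLX_CACHE_STATS"] = "1"

# Import mlx_spectro only after this point so the settings are visible to it.
```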
License
MIT