SpliFFT


Lightweight utilities for music source separation.

This library is a ground-up rewrite of zfturbo's MSST repo, with a strong focus on robustness, simplicity and extensibility. While MSST is a fantastic collection of models and training scripts, this rewrite adopts a different architecture to address common pain points in research code.

Key principles:

  • Configuration as code: we replace untyped dictionaries and ConfigDict with pydantic models. This provides static type safety, runtime data validation, IDE autocompletion, and a single, clear source of truth for all parameters.
  • Data-oriented and functional core: we avoid complex class hierarchies and inheritance. The codebase is built on plain data structures (like dataclasses) and pure, stateless functions.
  • Semantic typing as documentation: we leverage Python's type system to convey intent. Types like RawAudioTensor vs. NormalizedAudioTensor make function signatures self-documenting, reducing the need for verbose comments and ensuring correctness.
  • Extensibility without modification: new models can be integrated from external packages without altering the core library. The dynamic model loading system allows easy plug-and-play adhering to the open/closed principle.
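As a sketch of the first and third principles, a typed config plus semantic type aliases might look like this. All names here (`ChunkingConfig`, `InferenceConfig`, field names) are illustrative assumptions, not splifft's actual API:

```python
# illustrative sketch only; these names are NOT splifft's actual API
from typing import NewType

from pydantic import BaseModel, Field

# semantic aliases make signatures self-documenting: a function annotated to
# accept NormalizedAudioTensor cannot silently receive raw audio.
# (in splifft these would wrap torch.Tensor; `list` is a stand-in here)
RawAudioTensor = NewType("RawAudioTensor", list)
NormalizedAudioTensor = NewType("NormalizedAudioTensor", list)

class ChunkingConfig(BaseModel):
    chunk_size: int = Field(gt=0)  # validated at runtime, typed statically
    hop_size: int = Field(gt=0)

class InferenceConfig(BaseModel):
    model_name: str
    chunking: ChunkingConfig

# a single JSON document is the source of truth for all parameters;
# invalid values (e.g. chunk_size=0) raise a ValidationError at load time
cfg = InferenceConfig.model_validate_json(
    '{"model_name": "bs_roformer", "chunking": {"chunk_size": 8, "hop_size": 4}}'
)
```

Compared to untyped dicts, a typo'd key or out-of-range value fails loudly at config load rather than deep inside a forward pass.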

⚠️ This is pre-alpha software; expect significant breaking changes.

Features and Roadmap

Short term (high priority)

  • a robust, typed JSON configuration system powered by pydantic
  • inferencing:
    • normalization and denormalization
    • chunk generation: vectorized with unfold
    • chunk stitching: vectorized overlap-add with fold
    • flexible ruleset for deriving stems: add/subtract model outputs or any intermediate output (e.g., creating an instrumental track by subtracting vocals from the mixture).
  • web-based docs: generated with mkdocs, with excellent cross-references.
  • simple CLI for inferencing on a directory of audio files
  • BS-Roformer: ensure bit-for-bit equivalence in PyTorch and strive for maximum performance.
    • initial fp16 support
    • support coremltools and torch.compile
      • handroll complex multiplication implementation
      • isolate/handroll istft in forward pass
  • proper benchmarking (MFU, memory...)
  • implement evals: SDR, bleedless, fullness, etc.
  • simple file-based cache for model registry
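The inferencing bullets above can be sketched end-to-end in a few lines. This is an illustrative sketch under assumed shapes and names (not splifft's actual implementation) of normalization, vectorized chunking with `unfold`, vectorized overlap-add stitching with `fold`, and stem derivation:

```python
# illustrative sketch of the inference pipeline; not splifft's actual API
import torch
import torch.nn.functional as F

def normalize(x: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor, torch.Tensor]:
    """Zero-mean, unit-std normalization; returns the stats for denormalization."""
    mean, std = x.mean(), x.std()
    return (x - mean) / std, mean, std

def denormalize(x: torch.Tensor, mean: torch.Tensor, std: torch.Tensor) -> torch.Tensor:
    return x * std + mean

def chunk(x: torch.Tensor, chunk_size: int, hop: int) -> torch.Tensor:
    """(channels, samples) -> (n_chunks, channels, chunk_size), vectorized via unfold."""
    return x.unfold(-1, chunk_size, hop).permute(1, 0, 2)

def stitch(chunks: torch.Tensor, hop: int, out_len: int) -> torch.Tensor:
    """Vectorized overlap-add via fold, averaging samples covered by several windows."""
    n_chunks, channels, chunk_size = chunks.shape
    # fold expects (N, C * kernel_area, L) "columns"; treat audio as a 1 x out_len image
    cols = chunks.permute(1, 2, 0).reshape(channels * chunk_size, n_chunks).unsqueeze(0)
    def fold(t: torch.Tensor) -> torch.Tensor:
        return F.fold(t, output_size=(1, out_len),
                      kernel_size=(1, chunk_size), stride=(1, hop))
    summed = fold(cols)                   # overlapping samples are summed ...
    counts = fold(torch.ones_like(cols))  # ... then divided by their coverage count
    return (summed / counts).squeeze(0).squeeze(1)

mixture = torch.randn(2, 10)  # stereo, 10 samples (toy sizes)
normed, mean, std = normalize(mixture)
chunks = chunk(normed, chunk_size=4, hop=2)  # shape (4, 2, 4)
# (a separation model would process each chunk here)
stitched = stitch(chunks, hop=2, out_len=10)
restored = denormalize(stitched, mean, std)  # round-trips back to the mixture

# stem derivation: e.g. an instrumental is the mixture minus the vocals estimate
vocals = 0.25 * mixture  # pretend model output
instrumental = mixture - vocals
```

Because both directions are single tensor ops, chunking and stitching stay on-device and batched rather than looping in Python.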

Long term (low priority)

  • data augmentation
  • implement a complete, configurable training loop
  • port additional SOTA models from MSST (Mel Roformer, SCNet, etc.).
  • implement max kernels
  • simple web-based GUI with FastAPI and Svelte.

Contributing: PRs are very welcome!

Installation & Usage

Documentation on the config (amongst other details) can be found here

CLI

There are three steps. You do not need to have Python installed.

  1. Install uv if you haven't already. It is an excellent Python package and project manager with pip compatibility.
# Linux / MacOS
wget -qO- https://astral.sh/uv/install.sh | sh
# Windows
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
  2. Open a new terminal and install this project as a tool. uv will install the Python interpreter and all necessary packages if you don't have them already:
uv tool install "git+https://github.com/undef13/splifft.git[config,inference,cli]"
  3. Go into a new directory and place the model checkpoint and configuration inside it. To grab an example audio from YouTube:
uv tool install yt-dlp
yt-dlp -f bestaudio -o data/audio/input/3BFTio5296w.flac 3BFTio5296w
Assuming your current directory has this structure (it doesn't have to be exactly this):
.
└── data
    ├── audio
    │   ├── input
    │   │   └── 3BFTio5296w.flac
    │   └── output
    ├── config
    │   └── bs_roformer.json
    └── models
        └── roformer-fp16.pt

Run:

splifft separate data/audio/input/3BFTio5296w.flac --config data/config/bs_roformer.json --checkpoint data/models/roformer-fp16.pt
Console output
[00:00:41] INFO     using device=device(type='cuda')                                                 __main__.py:117
           INFO     loading configuration from                                                       __main__.py:119
                    config_path=PosixPath('data/config/bs_roformer.json')                                           
           INFO     loading model metadata `BSRoformer` from module `splifft.models.bs_roformer`     __main__.py:122
[00:00:42] INFO     loading weights from checkpoint_path=PosixPath('data/models/roformer-fp16.pt')   __main__.py:131
           INFO     processing audio file:                                                           __main__.py:138
                    mixture_path=PosixPath('data/audio/input/3BFTio5296w.flac')                                     
[00:00:56] INFO     wrote stem `bass` to data/audio/output/3BFTio5296w/bass.flac                     __main__.py:168
           INFO     wrote stem `drums` to data/audio/output/3BFTio5296w/drums.flac                   __main__.py:168
           INFO     wrote stem `other` to data/audio/output/3BFTio5296w/other.flac                   __main__.py:168
[00:00:57] INFO     wrote stem `vocals` to data/audio/output/3BFTio5296w/vocals.flac                 __main__.py:168
           INFO     wrote stem `guitar` to data/audio/output/3BFTio5296w/guitar.flac                 __main__.py:168
           INFO     wrote stem `piano` to data/audio/output/3BFTio5296w/piano.flac                   __main__.py:168
[00:00:58] INFO     wrote stem `instrumental` to data/audio/output/3BFTio5296w/instrumental.flac     __main__.py:168
           INFO     wrote stem `drums_and_bass` to data/audio/output/3BFTio5296w/drums_and_bass.flac __main__.py:168

To update the tool:

uv tool upgrade splifft --force-reinstall

Library

Add the latest bleeding-edge version to your project:

uv add git+https://github.com/undef13/splifft.git

This installs only the minimal core dependencies for the src/splifft/models/ directory. It does not enable the inference, training or CLI components. To use those, install the optional dependencies defined in pyproject.toml, for example:

# enable the built-in configuration, inference and CLI
uv add "git+https://github.com/undef13/splifft.git[config,inference,cli]"

Development

For a local dev build enabling all optional and developer dependencies:

git clone https://github.com/undef13/splifft.git
cd splifft
uv venv
uv sync --all-extras --all-groups

If you're using splifft from another project, you may also want to use --editable.

# lint
uv run ruff check src tests
# format
uv run ruff format --check src tests
# build & host documentation
uv run mkdocs serve
# type check
uv run mypy src tests

This repo is no longer compatible with zfturbo's repo; the last version that is compatible is v0.0.1. To pin a specific version with uv, change your pyproject.toml:

[tool.uv.sources]
splifft = { git = "https://github.com/undef13/splifft.git", rev = "287235e520f3bb927b58f9f53749fe3ccc248fac" }

Mojo

While the primary goal is just to have a minimalist PyTorch-based inference engine, I may use this project as an opportunity to learn more about heterogeneous computing, particularly with the Mojo language. The ultimate goal is to understand to what extent its compile-time metaprogramming and explicit memory-layout control can be used in BS-Roformer.

My approach will be incremental and bottom-up: I'll develop and test individual kernels, benchmarking them against their PyTorch counterparts. The PyTorch implementation will always remain the "source of truth" and a fully functional baseline, and will not be removed.

TODO:

  • evaluate pixi in pyproject.toml.
  • use max.torch.CustomOpLibrary to provide a callable from the pytorch side
  • use DeviceContext to interact with the GPU
  • attention
  • rotary embedding
  • feedforward
  • transformer
  • BandSplit & MaskEstimator
  • full graph compilation
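As an example of what one such PyTorch "source of truth" kernel could look like, here is a minimal rotary-embedding baseline that a Mojo port could be benchmarked against. This is an illustrative sketch, not splifft's actual implementation:

```python
# minimal rotary position embedding (RoPE) baseline; illustrative sketch only
import torch

def rotary_embed(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Apply RoPE to x of shape (..., seq, dim), with dim even.

    Each consecutive pair of features is rotated by a position- and
    frequency-dependent angle; rotations preserve per-position norms.
    """
    seq, dim = x.shape[-2], x.shape[-1]
    inv_freq = base ** (-torch.arange(0, dim, 2, dtype=x.dtype) / dim)
    angles = torch.arange(seq, dtype=x.dtype)[:, None] * inv_freq[None, :]  # (seq, dim/2)
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = torch.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

q = torch.randn(2, 16, 8)  # (batch, seq, dim)
q_rot = rotary_embed(q)
```

A port can then be validated for bit-for-bit (or tolerance-bounded) equivalence against this function before any performance work begins.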

Download files

Source Distribution

splifft-0.0.2.tar.gz (506.3 kB)

Built Distribution

splifft-0.0.2-py3-none-any.whl (40.3 kB)

File details

Details for the file splifft-0.0.2.tar.gz.

File metadata

  • Download URL: splifft-0.0.2.tar.gz
  • Upload date:
  • Size: 506.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for splifft-0.0.2.tar.gz:

  • SHA256: 208b7586f3f968fec5168667d7aa5ab1c8a499a4d8cb779a3fc836c8a0329e9e
  • MD5: bfd8e72fe8fe0f79613e4bdc7e93a77a
  • BLAKE2b-256: 7490f6cdedb5e02a585b32ee5f3f940cc4128286a4f93d16b1f82d6d516a0e3e

Provenance

The following attestation bundles were made for splifft-0.0.2.tar.gz:

Publisher: pypi.yml on undef13/splifft

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file splifft-0.0.2-py3-none-any.whl.

File metadata

  • Download URL: splifft-0.0.2-py3-none-any.whl
  • Upload date:
  • Size: 40.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for splifft-0.0.2-py3-none-any.whl:

  • SHA256: 7d2e2f12a6b90b208bb1f7f4ce17db3f166a55a975e916381dbe39a1dd9005c8
  • MD5: c77a52da3d77fa9aa8a09b7a4cb4fd50
  • BLAKE2b-256: a3b7a15266fe9b0472fdbc9106316f645701df58e240d99694067737376c01aa

Provenance

The following attestation bundles were made for splifft-0.0.2-py3-none-any.whl:

Publisher: pypi.yml on undef13/splifft

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
