Flexible frequency-band splitter for music source separation. The BandSplitRoformer module supports BSRoformer, MelBandRoformer, and custom overlapping or non-overlapping band configurations. Fully typed, modular, and documented, including migration help, usage, and paper references. PyTorch; CUDA-accelerated.
hunterFormsBS
A flexible frequency-band splitter for music source separation, organized around a single separator family that can express BS-style, mel-style, and custom layouts.
Instead of treating BSRoformer and MelBandRoformer as separate architectures, this package treats them as different band-layout configurations of one core design centered on BandSplitRotator.
The codebase is implemented in PyTorch, fully typed (py.typed), and designed for modular reuse so research ideas (for example PoPE or custom filter banks) can be integrated without splitting into parallel implementations.
Quick fix: size mismatch when loading a checkpoint
If loading a BSRoformer checkpoint raises a size-mismatch error, check mask_estimator_depth in the configuration first.
Some upstream configurations effectively used mask_estimator_depth=1 even when set to 2 because a later subtraction was applied. This package removes that subtraction, so the direct equivalent is:
- set mask_estimator_depth=1
Updating that value resolves the most common mismatch quickly.
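To find out which parameters actually disagree before changing the configuration, it can help to compare shapes by name. The sketch below is not part of this package: the helper name and the example parameter names are hypothetical. It works on plain name-to-shape mappings; with PyTorch you could build them with `{name: tuple(t.shape) for name, t in state_dict.items()}` for both the checkpoint and the freshly constructed model.

```python
from collections.abc import Mapping
from typing import Optional

Shape = tuple[int, ...]


def find_shape_mismatches(
    checkpoint_shapes: Mapping[str, Shape],
    model_shapes: Mapping[str, Shape],
) -> dict[str, tuple[Optional[Shape], Optional[Shape]]]:
    """Return {name: (checkpoint_shape, model_shape)} for every disagreement.

    A shape of None means the parameter is missing on that side.
    """
    mismatches: dict[str, tuple[Optional[Shape], Optional[Shape]]] = {}
    for name in sorted(set(checkpoint_shapes) | set(model_shapes)):
        in_checkpoint = checkpoint_shapes.get(name)
        in_model = model_shapes.get(name)
        if in_checkpoint != in_model:
            mismatches[name] = (in_checkpoint, in_model)
    return mismatches


# Hypothetical parameter names and shapes, for illustration only.
checkpoint_shapes = {"mask_estimators.0.weight": (512, 256), "final.bias": (4,)}
model_shapes = {"mask_estimators.0.weight": (1024, 256), "final.bias": (4,)}

for name, (saved, built) in find_shape_mismatches(checkpoint_shapes, model_shapes).items():
    print(f"{name}: checkpoint {saved} vs model {built}")
```

If the reported names all sit under the mask estimator, the depth setting described above is the likely cause.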
Why this architecture helps in practice
- Forward-looking architecture: A single model family makes it easier to adopt new ideas, such as PoPE or custom band-split definitions, while keeping interfaces aligned with established ecosystems.
- Universal configuration: One configuration surface covers BS-style, mel-style, and custom layouts, with configurable backward compatibility for existing upstream configurations.
- Rich tooling and ecosystem: The package provides strong typing (py.typed), modular APIs, and rich docstrings that cover usage, literature citations, and migration paths.
Easy to migrate
Transitioning from other standard implementations is straightforward because most identifiers are exactly the same and the data flow is highly similar.
If you're changing from an existing codebase, you can use the transition modules: simply keep using the BSRoformer and MelBandRoformer namespaces and APIs as a bridge, unify your other classes, and then switch to BandSplitRotator when you're ready.
What is unified here
The key design idea is that the difference between the BS-style front end and the mel-band front end is treated as a band-layout problem, not as a reason to maintain two unrelated model families.
- hunterFormsBS.bandSplitRotator.BandSplitRotator is the new universal entry point.
- hunterFormsBS.bs_roformer.BSRoformer and hunterFormsBS.mel_band_roformer.MelBandRoformer serve as transition modules, keeping familiar APIs, upstream names, and defaults.
- hunterFormsBS.bandSplit.BandSplit, hunterFormsBS.bandSplit.MaskEstimator, and hunterFormsBS.attend.Transformer hold the reusable typed building blocks shared across those entry points.
At the band level, the model only needs a band-membership map, called mask_filter_bank in the
codebase. You can think of that map as a Boolean matrix
$$ F \in \{0, 1\}^{B \times N_f} $$
where $B$ is the number of bands and $N_f$ is the number of STFT frequency bins.
- In a non-overlapping BS-style layout, each frequency bin belongs to exactly one band, so
$$ \forall f,\; \sum_b F_{b,f} = 1. $$
- In an overlapping mel-style layout, some frequency bins belong to more than one band, so
$$ \exists f \text{ such that } \sum_b F_{b,f} > 1. $$
When bands overlap, the reconstructed mask for a frequency bin is averaged across the contributing bands:
$$ \hat{M}_{f,t} = \frac{1}{S_f} \sum_{b : F_{b,f} = 1} \hat{M}^{(b)}_{f,t}, \qquad S_f = \sum_b F_{b,f}. $$
That is why this package makes it easy to move between overlapping and non-overlapping bands, and to change how bands are distributed across the frequency axis. The architectural difference lives in the filter bank, not in two separate theories of the model.
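The averaging rule can be checked by hand on a small case. The sketch below is an illustration, not the package's implementation: it uses plain Python lists instead of tensors, drops the time axis (a single frame), and the function name is hypothetical. Each band supplies one mask value per member bin, in ascending bin order.

```python
def average_overlapping_masks(
    membership: list[list[bool]],
    band_masks: list[list[float]],
) -> list[float]:
    """Average per-band mask values into one value per frequency bin.

    membership[b][f] is True when bin f belongs to band b (the matrix F).
    band_masks[b] holds one value per member bin of band b, in bin order.
    """
    num_bins = len(membership[0])
    totals = [0.0] * num_bins
    counts = [0] * num_bins  # counts[f] plays the role of S_f

    for band_row, band_values in zip(membership, band_masks):
        member_bins = [f for f, is_member in enumerate(band_row) if is_member]
        if len(member_bins) != len(band_values):
            raise ValueError("each band needs one mask value per member bin")
        for f, value in zip(member_bins, band_values):
            totals[f] += value
            counts[f] += 1

    if 0 in counts:
        raise ValueError("every frequency bin must belong to at least one band")
    return [total / count for total, count in zip(totals, counts)]


# Two bands over four bins; bin 2 belongs to both bands.
membership = [
    [True, True, True, False],
    [False, False, True, True],
]
band_masks = [
    [1.0, 0.5, 0.25],  # band 0 covers bins 0, 1, 2
    [0.75, 0.0],       # band 1 covers bins 2, 3
]
averaged = average_overlapping_masks(membership, band_masks)
print(averaged)  # [1.0, 0.5, 0.5, 0.0]
```

For a non-overlapping layout every count is 1, so the same routine reduces to a plain scatter of band values back onto the frequency axis.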
Which entry point you should use
| Use this | When | Why |
|---|---|---|
| hunterFormsBS.BandSplitRotator | You are starting new work or want one separator that can cover BS-style, mel-style, and custom band layouts. | This is the unified model entry point. |
| hunterFormsBS.bs_roformer.BSRoformer | You want the familiar non-overlapping BS-style interface or a close comparison with upstream BS-RoFormer code. | The constructor keeps BS-oriented defaults and compatibility fields. |
| hunterFormsBS.mel_band_roformer.MelBandRoformer | You want the familiar mel-band interface or a close comparison with upstream mel-band code. | The constructor keeps mel-oriented defaults and automatic mel-band construction. |
| hunterFormsBS.*_experimental | You are testing research ideas such as value residual learning or hyper-connections. | These modules are exploratory and intentionally separate from the stable path. |
Custom mask_filter_bank helpers
Most users never need this section. The package already bundles the common lucidrains-style
mel-band split as hunterFormsBS.bandSplit.mask_filter_bank_mel_band_default, and the separator
constructors use that value automatically for sample_rate=44100, stft_n_fft=2048, and
num_bands=60.
If a checkpoint uses a different band layout, pass mask_filter_bank explicitly. For ad-hoc
generation, import a function from hunterFormsBS.make_static_mask_filter_bank in Python and call
the function from a REPL, notebook, or one-off script. There is intentionally no CLI for this
module. librosa is only needed if you call librosa_filters_mel.
- filter_bank_non_overlapping prints a static non-overlapping band split from freqs_per_bands.
- librosa_filters_mel prints a static mel-band split using librosa.filters.mel.
- print_static_mask prints the compact torch.tensor(...) assignment used by the other helpers.
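To make the relationship between freqs_per_bands and a Boolean band-membership map concrete, here is a minimal sketch in plain Python. It is not the package's filter_bank_non_overlapping, whose signature is not shown here; the function name below is hypothetical, and it assumes freqs_per_bands lists the number of consecutive frequency bins in each band.

```python
from collections.abc import Sequence


def non_overlapping_membership(freqs_per_bands: Sequence[int]) -> list[list[bool]]:
    """Build a Boolean band-by-bin matrix from consecutive band widths.

    Row b is True exactly on the bins assigned to band b, so every
    column contains a single True value.
    """
    num_bins = sum(freqs_per_bands)
    membership: list[list[bool]] = []
    start = 0
    for width in freqs_per_bands:
        if width <= 0:
            raise ValueError("every band must contain at least one bin")
        membership.append([start <= f < start + width for f in range(num_bins)])
        start += width
    return membership


# Three bands of widths 2, 1, and 3 over six frequency bins.
membership = non_overlapping_membership((2, 1, 3))
for row in membership:
    print("".join("1" if is_member else "0" for is_member in row))
# 110000
# 001000
# 000111
```

A mel-style layout differs only in that rows may overlap, so some columns contain more than one True value.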
Package map
- hunterFormsBS.__init__
  - Main symbols: BandSplitRotator, BandSplit, MaskEstimator, Transformer, lossComputation, DEFAULT_FREQS_PER_BANDS, ParametersComputeLoss, FlashAttentionConfig, ParametersAttention, ParametersSTFT, ParametersTransformer
  - Purpose: public top-level namespace for the stable typed API.
- hunterFormsBS.bandSplitRotator
  - Main symbols: BandSplitRotator
  - Purpose: unified separator that can build BS-style, mel-style, or custom band layouts from one model family.
- hunterFormsBS.bs_roformer
  - Main symbols: BSRoformer
  - Purpose: stable compatibility module for the non-overlapping BS-style variant.
- hunterFormsBS.mel_band_roformer
  - Main symbols: MelBandRoformer
  - Purpose: stable compatibility module for the overlapping mel-band variant.
- hunterFormsBS.make_static_mask_filter_bank
  - Main symbols: filter_bank_non_overlapping, librosa_filters_mel, print_static_mask
  - Purpose: ad-hoc helper module that prints paste-ready static mask_filter_bank definitions for custom layouts.
- hunterFormsBS.bandSplit
  - Main symbols: BandSplit, MaskEstimator, MLP, lossComputation, DEFAULT_FREQS_PER_BANDS
  - Purpose: shared band projection, mask-estimation heads, BS-style default partition, and training-loss helper.
- hunterFormsBS.attend
  - Main symbols: Attend, Attention, FeedForward, LinearAttention, Transformer
  - Purpose: stable attention, feedforward, linear-attention, and transformer building blocks.
- hunterFormsBS.theTypes
  - Main symbols: ParametersComputeLoss, FlashAttentionConfig, ParametersAttention, ParametersSTFT, ParametersTransformer
  - Purpose: typed configuration records used across the package.
Experimental module map
| Module | Main symbols | Purpose |
|---|---|---|
| hunterFormsBS.attend_experimental | experimental Attention, experimental Transformer | Research-oriented attention blocks with value-residual mixing and hyper-connection support. |
| hunterFormsBS.bs_roformer_experimental | experimental BSRoformer | Experimental BS-style separator that uses the experimental attention stack. |
| hunterFormsBS.mel_band_roformer_experimental | experimental MelBandRoformer | Experimental mel-band separator that uses the experimental attention stack. |
Architecture in one sentence
The stable separator path is
raw audio → STFT → band gathering → BandSplit → hierarchical attention → MaskEstimator → mask
followed by overlap-aware mask averaging when needed, complex masking in the STFT domain, and inverse STFT reconstruction back to waveform audio.
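The complex masking step near the end of that path can be pictured as an element-wise product. The sketch below assumes the estimated mask is complex-valued and is applied by multiplying it with the mixture STFT bin by bin; that is an assumption about the implementation, not something stated above. It uses nested Python lists indexed as [frequency][time], and the function name is hypothetical.

```python
def apply_complex_mask(
    stft: list[list[complex]],
    mask: list[list[complex]],
) -> list[list[complex]]:
    """Multiply each STFT coefficient by the matching complex mask value.

    A complex mask can rescale the magnitude and rotate the phase of a bin.
    """
    if len(stft) != len(mask) or any(len(s) != len(m) for s, m in zip(stft, mask)):
        raise ValueError("stft and mask must have the same shape")
    return [
        [coefficient * gain for coefficient, gain in zip(stft_row, mask_row)]
        for stft_row, mask_row in zip(stft, mask)
    ]


# Two frequency bins by two time frames, with illustrative values.
stft = [[1 + 1j, 2 + 0j], [0 + 3j, 1 - 1j]]
mask = [[1 + 0j, 0 + 0j], [0.5 + 0j, 0 + 1j]]
separated = apply_complex_mask(stft, mask)
print(separated)
```

A mask value of 1 keeps a bin unchanged, 0 removes it, a real value between them attenuates it, and a value such as 1j rotates its phase by 90 degrees without changing its magnitude.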
Top-level exports
The top-level package namespace currently re-exports the stable shared pieces that new users most often need:
BandSplitRotator
The compatibility classes are intentionally available from their own modules so that imports can stay explicit during comparisons with upstream repos.
Ad-hoc helpers such as hunterFormsBS.make_static_mask_filter_bank stay as explicit submodule
imports so the main namespace remains small and optional dependencies stay optional.
Reference materials
Gaussian Error Linear Units (GELUs)
- BibTeX citation. TeX Source with precise formulas for AI agents.
- eprint: arXiv:1606.08415
Language Modeling with Gated Convolutional Networks
- Common name: GLU (Gated Linear Units)
- BibTeX citation. TeX Source with precise formulas for AI agents.
- Proceedings: proceedings.mlr.press
Attention Is All You Need
(^Which is why there are no other papers on this list.)
- BibTeX citation. TeX Source with precise formulas for AI agents.
- Proceedings: proceedings.neurips.cc
Root Mean Square Layer Normalization
- Common name: RMSNorm
- BibTeX citation. TeX Source with precise formulas for AI agents.
- Proceedings: proceedings.neurips.cc
- Implementations:
- bzhangGo/rmsnorm
- hunterhogan/torch_einops_kit.scaleValues.RMSNorm
RoFormer: Enhanced Transformer with Rotary Position Embedding
- Common name: RoPE
- BibTeX citation. TeX Source with precise formulas for AI agents.
- DOI: 10.1016/j.neucom.2023.127063
- Free pre-print: arXiv:2104.09864
XCiT: Cross-Covariance Image Transformers
- BibTeX citation. TeX Source with precise formulas for AI agents.
- Proceedings: proceedings.neurips.cc
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
- BibTeX citation. TeX Source with precise formulas for AI agents.
- Proceedings: proceedings.neurips.cc
Einops: Clear and Reliable Tensor Manipulations with Einstein-like Notation
- BibTeX citation.
- Proceedings: ICLR 2022 Conference
Music Source Separation with Band-Split RoPE Transformer
- Common name: BS-RoFormer
- BibTeX citation. TeX Source with precise formulas for AI agents.
- DOI: 10.1109/ICASSP48485.2024.10446843
- Free pre-print: arXiv:2309.02612
Mel-RoFormer for Vocal Separation and Vocal Melody Transcription
- Common name: MelBand-RoFormer
- BibTeX citation. TeX Source with precise formulas for AI agents.
- Proceedings: 10.5281/zenodo.14877371
Value Residual Learning
- Common name: ResFormer
- BibTeX citation. TeX Source with precise formulas for AI agents.
- Proceedings: 10.18653/v1/2025.acl-long.1375
Decoupling the "What" and "Where" With Polar Coordinate Positional Embeddings
- Common name: PoPE
- BibTeX citation. TeX Source with precise formulas for AI agents.
- eprint: arXiv:2509.10534
Packages and documentation
- pytorch/pytorch
- ZFTurbo/Music-Source-Separation-Training
- lucidrains/torch-einops-utils
- hunterhogan/torch_einops_kit