
Flexible frequency-band splitter for music source separation. The BandSplitRotator module supports BSRoformer, MelBandRoformer, and custom overlapping or non-overlapping band configurations. Fully typed, modular, and documented, with migration help, usage notes, and paper references. PyTorch; CUDA-accelerated.


hunterFormsBS

A flexible frequency-band splitter for music source separation, organized around a single separator family that can express BS-style, mel-style, and custom layouts.

Instead of treating BSRoformer and MelBandRoformer as separate architectures, this package treats them as different band-layout configurations of one core design centered on BandSplitRotator.

pip install hunterFormsBS

or

uv add hunterFormsBS

The codebase is implemented in PyTorch, fully typed (py.typed), and designed for modular reuse so options such as PoPE, custom filter banks, value-residual learning, residual streams, and optional SageAttention acceleration can live on one aligned constructor surface.

Quick fix: size mismatch when loading a checkpoint

If loading a BSRoformer checkpoint raises a size-mismatch error, check mask_estimator_depth in the configuration.

In some upstream implementations, a configuration that sets mask_estimator_depth=2 effectively builds a depth-1 mask estimator, because the value is reduced by a later subtraction. This package removes that subtraction, so the direct equivalent of an upstream value of 2 is:

  • set mask_estimator_depth=1

Updating that value resolves the most common mismatch quickly.
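A minimal sketch of that adjustment, assuming the upstream configuration is available as a plain dictionary and that the upstream subtraction was exactly one (consistent with the 2 to 1 example above). The helper name migrate_mask_estimator_depth is hypothetical and is not part of hunterFormsBS.

```python
def migrate_mask_estimator_depth(upstream_config: dict) -> dict:
    """Return a copy of the config with the upstream subtraction applied up front.

    Assumption: upstream reduced `mask_estimator_depth` by one before building
    the mask estimator. Apply this only to configs that still carry the
    upstream value.
    """
    migrated = dict(upstream_config)  # leave the caller's dictionary untouched
    upstream_depth = migrated.get("mask_estimator_depth")
    if upstream_depth is not None:
        migrated["mask_estimator_depth"] = upstream_depth - 1
    return migrated


upstream = {"mask_estimator_depth": 2}
print(migrate_mask_estimator_depth(upstream))  # {'mask_estimator_depth': 1}
```

Run it once on the configuration you load from disk. Applying it twice would reduce the depth again.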

Why this architecture helps in practice

  • Forward-looking architecture: A single model family makes it easier to adopt new ideas, such as PoPE or custom band-split definitions, while keeping interfaces aligned with established ecosystems.
  • Universal configuration: BandSplitRotator, BSRoformer, and MelBandRoformer share downstream option names for attention, transformer, mask-estimator, STFT, and loss settings.
  • Rich tooling and ecosystem: The package provides strong typing (py.typed), modular APIs, and rich docstrings focusing on usage, literature citations, and migration paths.

Easy to migrate

Transitioning from other standard implementations is straightforward because most identifiers are exactly the same and the data flow is highly similar.

If you are migrating an existing codebase, use the transition modules as a bridge: keep the BSRoformer and MelBandRoformer namespaces and APIs, unify your other classes, and switch to BandSplitRotator when you are ready.

What is unified here

The key design idea is that the difference between the BS-style front end and the mel-band front end is treated as a band-layout problem, not as a reason to maintain two unrelated model families.

  • hunterFormsBS.bandSplitRotator.BandSplitRotator is the primary unified entry point.
  • hunterFormsBS.bs_roformer.BSRoformer and hunterFormsBS.mel_band_roformer.MelBandRoformer serve as transition modules, keeping familiar APIs, upstream names, and defaults.
  • hunterFormsBS.bandSplit.BandSplit, hunterFormsBS.bandSplit.MaskEstimator, and hunterFormsBS.attend.Transformer hold the reusable typed building blocks shared across those entry points.
  • Attention and transformer options such as attn_dropout, ff_dropout, flash_attn, sage_attention, scale, num_residual_streams, and use_value_residual_learning keep the same identifiers as they move from model constructors into downstream blocks.

At the band level, the model only needs a band-membership map, called mask_filter_bank in the codebase. You can think of that map as a Boolean matrix

$$ F \in \{0, 1\}^{B \times N_f} $$

where $B$ is the number of bands and $N_f$ is the number of STFT frequency bins.

  • In a non-overlapping BS-style layout, each frequency bin belongs to exactly one band, so

$$ \forall f,\; \sum_b F_{b,f} = 1. $$

  • In an overlapping mel-style layout, some frequency bins belong to more than one band, so

$$ \exists f \text{ such that } \sum_b F_{b,f} > 1. $$
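The two conditions above can be checked directly from the column sums of $F$. The sketch below uses plain nested Python lists instead of a tensor so it stays dependency-free; the function names and the toy layouts are illustrative assumptions, not package APIs.

```python
def band_count_per_bin(filter_bank):
    """Column sums of F: the number of bands each frequency bin belongs to."""
    return [sum(column) for column in zip(*filter_bank)]


def is_non_overlapping(filter_bank):
    """True when every frequency bin belongs to exactly one band."""
    return all(count == 1 for count in band_count_per_bin(filter_bank))


# Toy layouts with B = 2 bands and N_f = 4 bins (values are illustrative only).
bs_style = [
    [True, True, False, False],
    [False, False, True, True],
]
mel_style = [
    [True, True, True, False],   # bin 2 is shared with the second band
    [False, False, True, True],
]

print(band_count_per_bin(mel_style))  # [1, 1, 2, 1]
print(is_non_overlapping(bs_style))   # True
print(is_non_overlapping(mel_style))  # False
```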

When bands overlap, the reconstructed mask for a frequency bin is averaged across the contributing bands:

$$ \hat{M}_{f,t} = \frac{1}{S_f} \sum_{b : F_{b,f} = 1} \hat{M}^{(b)}_{f,t}, \qquad S_f = \sum_b F_{b,f}. $$

That is why this package makes it easy to move between overlapping and non-overlapping bands, and to change how bands are distributed across the frequency axis. The architectural difference lives in the filter bank, not in two separate theories of the model.
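As a worked instance of the averaging rule, the sketch below combines per-band estimates for a single time frame. It is a simplification under stated assumptions: real masks are tensors over time and are applied in the complex STFT domain, whereas this example uses real numbers and plain Python containers. The name average_band_masks is hypothetical.

```python
def average_band_masks(filter_bank, band_masks):
    """Average per-band mask values into one value per frequency bin.

    filter_bank[b][f] is True when bin f belongs to band b.
    band_masks[b] maps each member bin f of band b to that band's estimate.
    """
    num_bins = len(filter_bank[0])
    averaged = []
    for f in range(num_bins):
        contributions = [
            band_masks[b][f] for b, row in enumerate(filter_bank) if row[f]
        ]
        if not contributions:
            raise ValueError(f"frequency bin {f} is not covered by any band")
        # len(contributions) plays the role of S_f in the formula above.
        averaged.append(sum(contributions) / len(contributions))
    return averaged


filter_bank = [
    [True, True, True, False],
    [False, False, True, True],
]
band_masks = [
    {0: 0.25, 1: 0.5, 2: 0.5},  # estimates from band 0
    {2: 1.0, 3: 0.75},          # estimates from band 1
]
print(average_band_masks(filter_bank, band_masks))  # [0.25, 0.5, 0.75, 0.75]
```

Only bin 2 is shared, so it is the only bin whose value changes: (0.5 + 1.0) / 2 = 0.75. In a non-overlapping layout every bin has exactly one contribution and the average is a no-op.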

Which entry point you should use

| Use this | When | Why |
| --- | --- | --- |
| hunterFormsBS.BandSplitRotator | You are starting new work or want one separator that can cover BS-style, mel-style, and custom band layouts. | This is the unified model entry point. |
| hunterFormsBS.bs_roformer.BSRoformer | You want the familiar non-overlapping BS-style interface or a close comparison with upstream BS-RoFormer code. | The constructor keeps BS-oriented defaults and compatibility fields. |
| hunterFormsBS.mel_band_roformer.MelBandRoformer | You want the familiar mel-band interface or a close comparison with upstream mel-band code. | The constructor keeps mel-oriented defaults and automatic mel-band construction. |
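The three entry points above differ only in where they are imported from. The sketch below records those import paths in one place and resolves a class lazily, which can be handy when a script has to switch between them for comparisons. The ENTRY_POINTS mapping, its keys, and resolve_entry_point are hypothetical conveniences, not part of the package; constructor arguments are deliberately left out because they depend on your configuration.

```python
import importlib

# Module path and class name for each entry point, as listed above.
ENTRY_POINTS = {
    "unified": ("hunterFormsBS", "BandSplitRotator"),
    "bs": ("hunterFormsBS.bs_roformer", "BSRoformer"),
    "mel": ("hunterFormsBS.mel_band_roformer", "MelBandRoformer"),
}


def resolve_entry_point(kind: str):
    """Import and return the separator class for the requested entry point.

    Importing happens only when this function is called, so the mapping
    itself can be inspected without hunterFormsBS installed.
    """
    module_name, class_name = ENTRY_POINTS[kind]
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


for kind, (module_name, class_name) in ENTRY_POINTS.items():
    print(f"{kind}: from {module_name} import {class_name}")
```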

Attention and optional acceleration

flash_attn=True requests PyTorch scaled-dot-product-attention backends when the active device supports that path. sage_attention=True asks downstream Attend blocks to call sageattention.sageattn; install SageAttention manually before enabling it because hunterFormsBS does not install that package.
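Because SageAttention has to be installed separately, it can help to check for it before turning the flag on. The sketch below uses the standard-library importlib.util.find_spec to test whether the sageattention module is importable. The helper name and the options dictionary are assumptions for illustration; only the sage_attention flag name comes from the text above.

```python
import importlib.util


def sage_attention_installed() -> bool:
    """Report whether the optional `sageattention` package can be imported."""
    return importlib.util.find_spec("sageattention") is not None


# Hypothetical options dictionary: enable the flag only when the package exists.
attention_flags = {"sage_attention": sage_attention_installed()}
print(attention_flags)
```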

BandSplitRotator can choose RoPE or PoPE with use_pope, and the same model family exposes value-residual and residual-stream controls for deeper attention experiments without changing entry points.

Custom mask_filter_bank helpers

Most users never need this section. The package already bundles the common lucidrains-style mel-band split as hunterFormsBS.bandSplit.mask_filter_bank_mel_band_default, and the separator constructors use that value automatically for sample_rate=44100, stft_n_fft=2048, and num_bands=60.

If a checkpoint uses a different band layout, pass mask_filter_bank explicitly. For ad-hoc generation, import a function from hunterFormsBS.make_static_mask_filter_bank in Python and call the function from a REPL, notebook, or one-off script. There is intentionally no CLI for this module. librosa is only needed if you call librosa_filters_mel.

  • filter_bank_non_overlapping prints a static non-overlapping band split from freqs_per_bands.
  • librosa_filters_mel prints a static mel-band split using librosa.filters.mel.
  • print_static_mask prints the compact torch.tensor(...) assignment used by the other helpers.
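To illustrate what a non-overlapping split derived from freqs_per_bands looks like, the sketch below builds the Boolean band-membership map from a tuple of band widths. It is not the package's implementation: the real helpers print a torch.tensor(...) assignment, while this version returns nested Python lists, and the name membership_from_freqs_per_bands is hypothetical.

```python
def membership_from_freqs_per_bands(freqs_per_bands):
    """Build a non-overlapping band-membership map from consecutive band widths.

    Row b is True for the bins of band b, so every bin belongs to one band.
    """
    num_bins = sum(freqs_per_bands)
    rows = []
    start = 0
    for width in freqs_per_bands:
        stop = start + width
        rows.append([start <= f < stop for f in range(num_bins)])
        start = stop
    return rows


# Toy widths: three bands covering 2 + 2 + 4 = 8 frequency bins.
layout = membership_from_freqs_per_bands((2, 2, 4))
for row in layout:
    print("".join("1" if member else "0" for member in row))
# 11000000
# 00110000
# 00001111
```

The widths must add up to the number of STFT frequency bins the model sees; otherwise the map and the spectrogram will not line up.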

Package map

  • hunterFormsBS.__init__
    • Direct export: BandSplitRotator
    • Purpose: small top-level namespace for the primary separator model.
  • hunterFormsBS.bandSplitRotator
    • Main symbols: BandSplitRotator
    • Purpose: unified separator that can build BS-style, mel-style, or custom band layouts from one model family, with downstream attention, transformer, STFT, mask-estimator, and loss options on the constructor.
  • hunterFormsBS.bs_roformer
    • Main symbols: BSRoformer
    • Purpose: familiar non-overlapping BS-style interface with BS-oriented defaults.
  • hunterFormsBS.mel_band_roformer
    • Main symbols: MelBandRoformer
    • Purpose: familiar mel-band interface with automatic mel-band construction.
  • hunterFormsBS.make_static_mask_filter_bank
    • Main symbols: filter_bank_non_overlapping, librosa_filters_mel, print_static_mask
    • Purpose: ad-hoc helper module that prints paste-ready static mask_filter_bank definitions for custom layouts.
  • hunterFormsBS.bandSplit
    • Main symbols: BandSplit, MaskEstimator, MLP, lossComputation, DEFAULT_FREQS_PER_BANDS
    • Purpose: shared band projection, mask-estimation heads, BS-style default partition, and training-loss helper.
  • hunterFormsBS.attend
    • Main symbols: Attend, Attention, FeedForward, Transformer
    • Purpose: shared attention, feedforward, and transformer building blocks with RoPE / PoPE, PyTorch SDPA, optional SageAttention, value-residual, and residual-stream support.
  • hunterFormsBS.theTypes
    • Main symbols: ParametersComputeLoss, FlashAttentionConfig, ParametersAttention, ParametersSTFT, ParametersTransformer
    • Purpose: typed configuration records used across the package.

Architecture in one sentence

The stable separator path is

raw audio → STFT → band gathering → BandSplit → hierarchical attention → MaskEstimator → mask

followed by overlap-aware mask averaging when needed, complex masking in the STFT domain, and inverse STFT reconstruction back to waveform audio.

Top-level exports

The top-level package namespace currently re-exports the primary model that new users most often need:

  • BandSplitRotator

The compatibility classes are intentionally available from their own modules so that imports can stay explicit during comparisons with upstream repos.

Ad-hoc helpers such as hunterFormsBS.make_static_mask_filter_bank stay as explicit submodule imports so the main namespace remains small and optional dependencies stay optional.

Reference materials

Gaussian Error Linear Units (GELUs)

Language Modeling with Gated Convolutional Networks

Attention Is All You Need

(^Which is why there are no other papers on this list.)

Root Mean Square Layer Normalization

RoFormer: Enhanced Transformer with Rotary Position Embedding

XCiT: Cross-Covariance Image Transformers

FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness

SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration

SageAttention2: Efficient Attention with Thorough Outlier Smoothing and Per-thread INT4 Quantization

SageAttention2++: A More Efficient Implementation of SageAttention2

Einops: Clear and Reliable Tensor Manipulations with Einstein-like Notation

Music Source Separation with Band-Split RoPE Transformer

Mel-RoFormer for Vocal Separation and Vocal Melody Transcription

Value Residual Learning

Decoupling the "What" and "Where" With Polar Coordinate Positional Embeddings

Packages and documentation


CC-BY-NC-4.0
