Differentiable Critical Bandwidth: Silverman's modality test as a differentiable PyTorch layer with IFT backward pass.

DCB — Differentiable Critical Bandwidth

PyPI · License: MIT · Python 3.9+

A PyTorch package that makes Silverman's critical bandwidth test (1981) fully differentiable, enabling end-to-end gradient-based optimization over the modal structure of continuous distributions.

Overview

The critical bandwidth h_crit is the smallest KDE bandwidth at which the density estimate has at most m modes. It is a classical nonparametric statistic for modality testing. DCB replaces every non-differentiable operation in its computation with a smooth surrogate, then uses the Implicit Function Theorem (IFT) to compute exact gradients through the root-finding step, with a memory cost that is O(1) in the number of solver iterations.
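To make the definition concrete, the classical, non-differentiable computation can be sketched in plain Python. This is a minimal illustration and not DCB's implementation: `make_grid`, `count_modes`, and `critical_bandwidth` are hypothetical helper names, the grid size, padding, and iteration count are arbitrary choices, and the hard mode count used here is exactly the kind of operation DCB replaces with a smooth surrogate. Bisection is valid because the number of modes of a Gaussian KDE does not increase as the bandwidth grows.

```python
import math
import random


def make_grid(data, grid_size=300):
    """Fixed evaluation grid, padded 10% past the data range on each side."""
    lo_x, hi_x = min(data), max(data)
    span = hi_x - lo_x
    step = 1.2 * span / (grid_size - 1)
    return [lo_x - 0.1 * span + i * step for i in range(grid_size)]


def count_modes(data, h, grid):
    """Count local maxima of an (unnormalised) Gaussian KDE on the grid."""
    f = [sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data) for x in grid]
    modes, rising = 0, False
    for prev, cur in zip(f, f[1:]):
        if cur > prev:
            rising = True
        elif cur < prev and rising:
            modes += 1          # a rise followed by a fall is one mode
            rising = False
    return modes


def critical_bandwidth(data, m=1, grid=None, iters=20):
    """Smallest h (to bisection tolerance) whose KDE has at most m modes."""
    grid = grid if grid is not None else make_grid(data)
    span = max(data) - min(data)
    # With h = span every Gaussian kernel is concave across the data range,
    # so the KDE is unimodal and hi is a valid upper bracket.
    lo, hi = 1e-3 * span, span
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if count_modes(data, mid, grid) <= m:
            hi = mid            # few enough modes: try a narrower kernel
        else:
            lo = mid            # too many modes: widen the kernel
    return hi


# Two well-separated clusters (illustrative data only).
rng = random.Random(0)
data = [rng.gauss(-2.0, 0.5) for _ in range(40)] + [rng.gauss(2.0, 0.5) for _ in range(40)]
grid = make_grid(data)
h1 = critical_bandwidth(data, m=1, grid=grid)   # bandwidth that merges the two clusters
h2 = critical_bandwidth(data, m=2, grid=grid)   # smaller: two modes are allowed
```

The same grid is reused for every bisection step, so successive mode counts are evaluated on a consistent domain.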

import torch
from dcb import DCBLayer

X = torch.randn(1000, requires_grad=True)   # 1D samples
layer = DCBLayer(target_modes=1)
h_crit = layer(X)                           # differentiable scalar
h_crit.backward()                           # exact IFT gradients
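The idea behind the IFT backward pass can be illustrated on a toy problem. If a root h*(θ) is defined implicitly by F(h, θ) = 0, then dh*/dθ = -(∂F/∂θ)/(∂F/∂h), evaluated at the root only, so none of the solver's iterations need to be stored. The sketch below is plain Python with a made-up residual `F`; it is not DCB's objective, and the `eps` guard is only a hypothetical stand-in for clamping the denominator.

```python
import math


def F(h, theta):
    """Toy residual whose root h*(theta) is defined only implicitly."""
    return h ** 3 + theta * h - 1.0


def solve_root(theta, lo=0.0, hi=2.0, iters=60):
    """Bisection for F(h, theta) = 0; nothing here is differentiated."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if F(mid, theta) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def ift_gradient(h, theta, eps=0.0):
    """dh/dtheta = -(dF/dtheta) / (dF/dh), evaluated at the root only."""
    dF_dtheta = h
    dF_dh = 3.0 * h ** 2 + theta
    if eps > 0.0 and abs(dF_dh) < eps:
        # Hypothetical guard: keep the denominator away from zero.
        dF_dh = math.copysign(eps, dF_dh)
    return -dF_dtheta / dF_dh


theta = 1.5
h_star = solve_root(theta)
grad_ift = ift_gradient(h_star, theta)

# Central finite difference through the solver, for comparison only.
d = 1e-5
grad_fd = (solve_root(theta + d) - solve_root(theta - d)) / (2 * d)
print(h_star, grad_ift, grad_fd)
```

The two gradient estimates should agree closely, while the IFT version needs a single evaluation of the partial derivatives at the solution.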

Installation

pip install diffcb

Or from source:

git clone https://github.com/ryZhangHason/differentiable-critical-bandwidth
cd differentiable-critical-bandwidth
pip install -e ".[dev]"

Accuracy vs R's bw.crit

DCB is validated against R's multimode::bw.crit(data, mod0=1), the standard reference implementation of Hall & York (2001). The table below compares the two implementations on the same sample and on independently drawn samples:

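| n    | DCB vs R (same sample) | DCB vs R (independent samples)        |
|------|------------------------|---------------------------------------|
| 100K | 0.004%                 | ~0.5% (MC noise from independent RNG) |
| 1M   | 0.005%                 | ~0.2%                                 |
| 10M  | 0.004%                 | ~0.1%                                 |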

The independent-sample figures reflect natural sampling variability (two unbiased estimators drawing different data), not algorithmic error. On identical data, DCB agrees with R to within 0.005% at all tested n. DCB is about 43× faster than R at n=100M (roughly 1.1 s vs 50 s) and handles n=2B in 24 s, a size at which R runs out of memory.

Key Parameters

DCBLayer(
    target_modes=1,         # target number of modes
    G=512,                  # IFT evaluation grid points
    use_fft=True,           # FFT forward (default); eliminates subsampling bias for n>50K
    max_n_exact=1_000_000,  # sketch to sketch_size when n exceeds this (None = always exact)
    sketch_size=500_000,    # sketch target; 500K matches full-n accuracy (O(n^{-2/9}) rate)
    safe_backward=False,    # clamp IFT denominator near bifurcations
)
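The interaction between max_n_exact and sketch_size can be pictured with a small stand-alone sketch. The text above does not say how the sketch is drawn, so uniform subsampling without replacement is an assumption here, and `maybe_sketch` is a hypothetical helper rather than part of the dcb API.

```python
import random


def maybe_sketch(samples, max_n_exact=1_000_000, sketch_size=500_000, seed=0):
    """Return samples unchanged, or a uniform subsample when n is too large."""
    if max_n_exact is None or len(samples) <= max_n_exact:
        return samples                        # exact path: use every sample
    rng = random.Random(seed)                 # fixed seed for reproducibility
    return rng.sample(samples, sketch_size)   # assumed: draw without replacement


data = [float(i) for i in range(10_000)]
small = maybe_sketch(data, max_n_exact=20_000, sketch_size=5_000)   # below threshold
sketched = maybe_sketch(data, max_n_exact=1_000, sketch_size=500)   # above threshold
```

Passing max_n_exact=None in this sketch mirrors the "always exact" behaviour described in the parameter comment.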

Confirmed Experimental Results

All GPU results were produced on Kaggle (T4 / P100); see experiments/ and outputs/.

| Experiment                         | Result                    | Criterion    |
|------------------------------------|---------------------------|--------------|
| Accuracy vs R (same data, n=100K)  | 0.004%                    | < 0.01% ✓    |
| Validation (m≥2, Marron-Wand)      | R²=0.91, MAE=0.07, ρ=0.89 | R²≥0.85 ✓    |
| Speedup vs scipy (CUDA T4, n=8192) | 10.5×                     | ≥3× ✓        |
| GAN mode preservation              | h_crit=1.232 >> 0.3       | h_crit>0.3 ✓ |
| Anomaly AUC (KDDCup99)             | DCB=0.9982 vs IF=0.9867   | DCB≥IF ✓     |

Changelog

v0.1.1 (2026-05-29)

  • MPS fix: torch.histc on MPS allocated an n×bins intermediate (OOM at n≥5M). Replaced with bucketize+bincount on CPU — MPS-safe and numerically identical.
  • Sketch API: DCBLayer(max_n_exact=1_000_000, sketch_size=500_000) silently sketches to 500K samples when n exceeds the threshold. Justified by the O(n^{-2/9}) convergence of h_crit; a 500K sketch matches full-n accuracy.
  • Consistent bisection domain: Pre-computed domain passed to all fft_mode_count calls in a single bisection, eliminating per-step drift.
  • Bias warning direction: Corrected "expected upward bias" to "expected downward bias" on legacy use_fft=False path.
  • Test fixes: Fixed 8 pre-existing test failures (tuple unpacking, bounds, deprecation API).

v0.1.0 (2026-05-28)

  • Initial PyPI release. FFT forward (O(n + G log G)), IFT backward, MPS support.
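The binning step that the changelog entries refer to (bucketize + bincount, followed by a smoothing pass over G grid points) can be sketched in plain Python. This is an illustration under stated assumptions, not the package's code: `bin_counts` and `binned_kde` are hypothetical names, `bisect` stands in for a bucketize call, and a direct O(G²) convolution stands in for the FFT so that the example needs only the standard library.

```python
import bisect
import math


def bin_counts(data, edges):
    """Histogram via bucketize (bisect) + bincount (a counts list)."""
    counts = [0] * (len(edges) - 1)
    for x in data:
        # Index of the bin whose [left, right) interval contains x.
        i = bisect.bisect_right(edges, x) - 1
        i = min(max(i, 0), len(counts) - 1)   # clamp the right edge and outliers
        counts[i] += 1
    return counts


def binned_kde(data, h, lo, hi, G=64):
    """Gaussian KDE on G bin centres, computed from bin counts only."""
    width = (hi - lo) / G
    edges = [lo + i * width for i in range(G + 1)]
    centres = [lo + (i + 0.5) * width for i in range(G)]
    counts = bin_counts(data, edges)
    norm = len(data) * h * math.sqrt(2.0 * math.pi)
    # Direct O(G^2) convolution; an FFT would do the same in O(G log G).
    density = [
        sum(c * math.exp(-0.5 * ((x - y) / h) ** 2) for c, y in zip(counts, centres)) / norm
        for x in centres
    ]
    return centres, density


data = [-1.0, -0.9, -1.1, 1.0, 1.1, 0.9]
centres, density = binned_kde(data, h=0.3, lo=-3.0, hi=3.0, G=64)
mass = sum(density) * (centres[1] - centres[0])   # Riemann sum of the density
```

After binning, the smoothing cost depends only on G, which is why the overall cost scales as one pass over the n samples plus a grid-sized convolution.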

Repository Structure

dcb/            Core PyTorch package
  layer.py        DCBLayer nn.Module + DCBFunction autograd
  solver.py       IFT root-finder and backward pass
  fft_kde.py      FFT-based mode counter (MPS-safe, float64, G=16384)
  kde.py          Direct KDE derivatives (small-n path)
  utils.py        Grid, Silverman bandwidth, sg() stabilizer
experiments/    Reproduction scripts for all paper figures and tables
  phase1_*.py     Validation, speedup, ablation (Figures 1–2, S1–S2)
  phase2_gan.py   GAN mode-collapse prevention (Figure 3)
  phase3_anomaly.py  Anomaly detection (Table 2, Figure 5)
  round20_*.py    Large-n R comparison and streaming benchmarks
  round21_*.py    Accuracy improvement experiments
tests/          Unit tests (pytest, 45 passed, 1 xfailed)
outputs/        All generated figures and tables (PDFs, PNGs, CSVs)

Reproducing Paper Results

# Phase 1: validation, speedup, ablation
python experiments/phase1_validation.py
python experiments/phase1_speedup.py

# Phase 2: GAN mode collapse experiment
python experiments/phase2_gan.py

# Phase 3: anomaly detection benchmark
python experiments/phase3_anomaly.py

For GPU runs use the Kaggle kernels:

  • Phase 1–2: hsingle/dcb-full-experiments
  • Phase 3: hsingle/dcb-phase-3-anomaly-detection

Paper

Ruiyu Zhang. "Differentiable Critical Bandwidth: Making Silverman's Modality Test End-to-End Trainable." Journal of Machine Learning Research, 2026 (in preparation).

License

MIT — see LICENSE.
