LAdam: Laplacian Adam optimizer + physics-inspired ML toolkit (WaveNorm, ChiAnnealScheduler, neuron reordering)

This project has been archived.

The maintainers of this project have marked this project as archived. No new releases are expected.

LAdam

Laplacian Adam optimizer + physics-inspired ML toolkit for PyTorch

LAdam is a drop-in Adam replacement that applies discrete Laplacian regularization to Adam's second-moment estimate. The package also ships a toolkit of physics-inspired components for deep learning:

  • LAdam / LAdaGrad / LRMSProp -- Spatially-coupled adaptive optimizers
  • ChiAnnealScheduler -- 3-phase learning rate schedule (warmup -> constant -> cosine-squared decay)
  • WaveNorm / WaveNormDamped -- Wave-equation normalization layers (BatchNorm alternative)
  • ChiNorm -- Adaptive chi-field normalization
  • Neuron Reordering -- Correlation-based neuron permutation for MLPs

Installation

pip install ladam

Quick Start

Optimizer (drop-in Adam replacement)

from ladam import LAdam

optimizer = LAdam(model.parameters(), lr=1e-3, c2=1e-4)

for inputs, targets in dataloader:  # assumes the loader yields (inputs, targets)
    loss = criterion(model(inputs), targets)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

Full toolkit (optimizer + scheduler + normalization)

from ladam import LAdam, ChiAnnealScheduler, WaveNorm
import torch.nn as nn

# Build model with WaveNorm instead of BatchNorm
model = nn.Sequential(
    nn.Conv2d(1, 32, 3, padding=1),
    WaveNorm(32),            # <-- replaces nn.BatchNorm2d(32)
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(32, 10),
)

# Optimizer + scheduler
optimizer = LAdam(model.parameters(), lr=1e-3, c2=1e-4)
scheduler = ChiAnnealScheduler(optimizer, total_steps=10000)

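for step in range(10000):
    # train_step: user-defined helper that runs the forward pass and returns the loss
    loss = train_step(model, batch)
    loss.backward()
    optimizer.step()
    scheduler.step()
    optimizer.zero_grad()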

Benchmark Results (64-Experiment Campaign)

Validated across 55+ experiments spanning 8 task domains, 3 seeds each, p-values computed.

Head-to-Head: LAdam vs Adam

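| Task | Architecture | Metric | Adam | LAdam | Delta | Verdict |
|---|---|---|---|---|---|---|
| Wave Eq. PINN | 5x128 MLP | L2 Error | 0.0310 | 0.0172 | -44.6% | WIN |
| FashionMNIST | MLP+Chi+Reorder | Accuracy | 89.76% | 90.56% | +0.80 pp | WIN |
| CIFAR-10 | ResNet-18 | Accuracy | 67.96% | 73.39% | +5.43 pp | WIN |
| FashionMNIST | Transformer | Accuracy | 89.46% | 89.66% | +0.20 pp | WIN |
| VOC 2012 | DeepLabV3 | mIoU | 5.40% | 5.61% | +0.21 pp | WIN |
| Rosenbrock | Optimization | H0 Reject | -- | REJECTED | -- | WIN |
| Detection | Faster-RCNN | mAP | 74.01% | 71.40% | -2.61 pp | LOSS |
| Audio | VGGish | Accuracy | 65.12% | 64.26% | -0.86 pp | LOSS |
| Diffusion | U-Net | FID proxy | 0.0340 | 0.0349 | +2.7% | LOSS |
| GPT-2 | Transformer | Perplexity | 152 | 1098 | -- | LOSS |

Delta is the relative change for error-type metrics (L2 Error, FID proxy) and the difference in percentage points (pp) for accuracy-type metrics.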

Win rate on decisive results: 10/22 = 45%

Head-to-Head: WaveNorm vs BatchNorm

| Task | Architecture | Metric | BatchNorm | WaveNorm | Delta | Verdict |
|---|---|---|---|---|---|---|
| FashionMNIST | CNN | Accuracy | 91.18% | 91.94% | +0.76 pp | WIN |
| SVHN | CNN | Accuracy | 91.63% | 92.20% | +0.57 pp | WIN |
| CalHousing | MLP | MSE | 0.213 | 0.184 | -13.5% | WIN |
| WineQuality | MLP | MSE | 0.668 | 0.663 | -0.8% | TIE |

Where LAdam Excels

| Domain | Why It Works | Recommended c2 |
|---|---|---|
| PINNs / Scientific ML | PDE losses have spatial structure in weight space | 1e-5 |
| Transformers | Attention heads have correlated geometry | 1e-4 |
| MLPs + neuron reorder | Reordering creates spatial structure for Laplacian | 3e-4 |
| ResNets | Channel Laplacian smooths correlated filters | 3e-4 |

Where Adam Wins (use Adam instead)

| Domain | Why | Recommendation |
|---|---|---|
| LLM fine-tuning | Embedding layers need per-token specialization | Use Adam/AdamW |
| Diffusion models | U-Net denoising is already well-conditioned | Use Adam |
| RL / Policy gradients | High-variance gradients overwhelm spatial coupling | Use Adam |
| Audio classification | Non-spatial features in spectrograms | Use Adam |

Components

Optimizers

| Optimizer | Base | Laplacian target | Best for |
|---|---|---|---|
| LAdam | Adam | Second moment v_t | PINNs, transformers, CNNs |
| LAdaGrad | AdaGrad | Cumulative sum G_t | Sparse features, NLP |
| LRMSProp | RMSProp | Running average v_t | RNNs, non-stationary losses |

from ladam import LAdam, LAdaGrad, LRMSProp, suggest_c2

optimizer = LAdam(model.parameters(), lr=1e-3, c2=suggest_c2('pinn'))

ChiAnnealScheduler

3-phase learning rate schedule inspired by chi-field annealing dynamics:

  1. Warmup (5%): Linear ramp from 0 to base_lr
  2. Constant (65%): Full learning rate for main training
  3. Settle (30%): Cosine-squared decay to 0

from ladam import LAdam, ChiAnnealScheduler

optimizer = LAdam(model.parameters(), lr=1e-3)
scheduler = ChiAnnealScheduler(
    optimizer,
    total_steps=len(dataloader) * num_epochs,
    warmup_frac=0.05,
    settle_frac=0.30,
)

for epoch in range(num_epochs):
    for inputs, targets in dataloader:  # assumes the loader yields (inputs, targets)
        loss = criterion(model(inputs), targets)
        loss.backward()
        optimizer.step()
        scheduler.step()
        optimizer.zero_grad()
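
The three phases listed above can be sketched as a pure function from step index to a learning-rate multiplier. This is a minimal illustration, not the library's implementation: the name `chi_anneal_factor` is hypothetical, and the form cos^2(pi/2 * progress) is an assumption about what "cosine-squared decay" means here.

```python
import math


def chi_anneal_factor(step, total_steps, warmup_frac=0.05, settle_frac=0.30):
    """Return the multiplier applied to the base learning rate at `step` (0-indexed)."""
    warmup_steps = round(total_steps * warmup_frac)
    settle_steps = round(total_steps * settle_frac)
    settle_start = total_steps - settle_steps

    if step < warmup_steps:
        # Phase 1 (warmup): linear ramp from 0 up to the base learning rate.
        return step / warmup_steps
    if step < settle_start:
        # Phase 2 (constant): hold the base learning rate.
        return 1.0
    # Phase 3 (settle): assumed cosine-squared decay from 1 down to 0.
    progress = min(1.0, (step - settle_start) / max(1, settle_steps))
    return math.cos(0.5 * math.pi * progress) ** 2
```

A function with this shape could be passed as `lr_lambda` to `torch.optim.lr_scheduler.LambdaLR` to get a similar schedule with any optimizer.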

WaveNorm / WaveNormDamped

Drop-in BatchNorm replacements that normalize via wave-equation evolution:

from ladam import WaveNorm, WaveNormDamped

# Undamped -- best for regression tasks
norm = WaveNorm(num_features=64, n_steps=3)

# Damped -- fixes oscillation for classification with bounded outputs
norm = WaveNormDamped(num_features=64, n_steps=3, init_damping=0.5)

WaveNormDamped adds learnable per-feature damping. It automatically learns:

  • Low damping for features that benefit from wave momentum (regression)
  • High damping for features that need to settle (bounded classification)

ChiNorm

Adaptive normalization based on local energy density:

from ladam import ChiNorm

norm = ChiNorm(num_features=64, chi0_init=1.0, g_init=0.1)

Neuron Reordering

Reorder MLP neurons so that correlated neurons sit next to each other, which gives LAdam's Laplacian a spatial structure to exploit:

from ladam import LAdam, compute_neuron_order, reorder_linear_layer

# 1. Train with Adam briefly to get meaningful activations
# 2. Collect hidden-layer activations
#    (collect_hidden_activations is not imported above; it is assumed to be a user-supplied helper)
activations = collect_hidden_activations(model, dataloader)

# 3. Compute optimal ordering (greedy nearest-neighbor TSP)
order = compute_neuron_order(activations)

# 4. Reorder weights (network output is unchanged)
reorder_linear_layer(model.fc1, model.fc2, order)

# 5. Switch to LAdam for the rest of training
optimizer = LAdam(model.parameters(), lr=1e-3, c2=3e-4)
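
To make steps 3 and 4 concrete, here is a dependency-free sketch of a greedy nearest-neighbor ordering and the matching weight permutation. It is an illustration under assumptions, not the library's code: the function names `pearson`, `greedy_neuron_order`, and `permute_hidden` are hypothetical, the tour is assumed to start at neuron 0, and absolute Pearson correlation is assumed as the similarity measure.

```python
def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def greedy_neuron_order(activations):
    """Order neurons so each is followed by its most-correlated unvisited peer.

    activations: list of samples, each a list with one value per hidden neuron.
    """
    n_neurons = len(activations[0])
    columns = [[sample[k] for sample in activations] for k in range(n_neurons)]
    order = [0]  # assumed starting point: neuron 0
    remaining = set(range(1, n_neurons))
    while remaining:
        current = columns[order[-1]]
        # Nearest neighbor = highest absolute correlation with the current neuron;
        # sorting makes tie-breaking deterministic.
        best = max(sorted(remaining), key=lambda k: abs(pearson(current, columns[k])))
        order.append(best)
        remaining.remove(best)
    return order


def permute_hidden(w1, b1, w2, order):
    """Permute hidden units: rows of w1 and b1, and the matching columns of w2."""
    w1_new = [w1[k] for k in order]
    b1_new = [b1[k] for k in order]
    w2_new = [[row[k] for k in order] for row in w2]
    return w1_new, b1_new, w2_new
```

Because the same permutation is applied to the outgoing rows of the first layer and the incoming columns of the second, each hidden unit keeps its own weights and the network output stays the same.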

Parameters

LAdam

| Parameter | Default | Description |
|---|---|---|
| lr | 1e-3 | Learning rate |
| betas | (0.9, 0.999) | EMA coefficients |
| eps | 1e-8 | Numerical stability |
| weight_decay | 0 | L2 regularization |
| c2 | 1e-4 | Laplacian coupling strength |
| mode | 'variance_lap' | Which quantity to smooth |
| stencil | '9point' | Laplacian stencil type |

Choosing c2

| c2 | Best For | Notes |
|---|---|---|
| 1e-5 | PINNs, scientific ML | Gentle coupling, biggest error reduction |
| 1e-4 | Transformers, general | Safe default |
| 3e-4 | MLPs with reorder, ResNets | Stronger coupling for structured weights |
| 0 | Disable | Falls back to standard Adam |

How It Works

Standard Adam computes per-parameter adaptive learning rates from the second moment:

v_t = beta2 * v_{t-1} + (1 - beta2) * g_t^2
lr_effective = lr / (sqrt(v_t) + eps)

LAdam adds a Laplacian coupling step:

v_smooth = v_t + c2 * laplacian(v_t)
lr_effective = lr / (sqrt(v_smooth) + eps)

The discrete Laplacian is computed via a single F.conv2d kernel (9-point isotropic by default) -- efficient and GPU-friendly. Overhead: ~2-5% wall-clock time per step.
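
The coupling step can be illustrated on a small 2-D grid of second-moment values. The sketch below uses plain Python loops for clarity, whereas the library is described as using a single F.conv2d call. Several details are assumptions made for illustration only: the 9-point stencil weights, the replicate-edge boundary handling, the zero floor on v_smooth, and all function names.

```python
# Assumed 9-point isotropic stencil (weights sum to zero); the library's
# actual kernel and boundary handling may differ.
STENCIL = [
    [1 / 6, 4 / 6, 1 / 6],
    [4 / 6, -20 / 6, 4 / 6],
    [1 / 6, 4 / 6, 1 / 6],
]


def laplacian_2d(v):
    """Apply the 3x3 stencil to a 2-D grid (list of rows), replicating edges."""
    rows, cols = len(v), len(v[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    # Clamp indices so border cells reuse their nearest neighbor.
                    ii = min(max(i + di, 0), rows - 1)
                    jj = min(max(j + dj, 0), cols - 1)
                    total += STENCIL[di + 1][dj + 1] * v[ii][jj]
            out[i][j] = total
    return out


def smooth_second_moment(v, c2):
    """v_smooth = v + c2 * laplacian(v), floored at zero (an assumed safeguard)."""
    lap = laplacian_2d(v)
    return [
        [max(0.0, v_ij + c2 * lap_ij) for v_ij, lap_ij in zip(v_row, lap_row)]
        for v_row, lap_row in zip(v, lap)
    ]


def effective_lr(v_smooth, lr=1e-3, eps=1e-8):
    """Per-parameter step size lr / (sqrt(v_smooth) + eps)."""
    return [[lr / (x ** 0.5 + eps) for x in row] for row in v_smooth]
```

With c2 = 0 the smoothed estimate equals v_t, which matches the statement that c2 = 0 falls back to standard Adam. On a grid with a single large entry, the coupling lowers that entry slightly and raises its neighbors, so nearby parameters end up with more similar step sizes.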

FAQ

Q: Does this work for LLMs?
A: No. LAdam hurts LLM training -- embedding layers need per-token specialization that the Laplacian destroys. Use Adam/AdamW.

Q: How does WaveNorm compare to BatchNorm?
A: WaveNorm beats BN on 3/4 tested tasks, with the biggest win on regression (-13.5% MSE). Use WaveNormDamped for classification.

Q: What's the overhead?
A: LAdam adds ~2-5% wall-clock time (single fused conv kernel). WaveNorm adds ~10-15% (3 leapfrog steps).

Q: Why not smooth the gradient instead of the variance?
A: Osher et al. (2018) explored Laplacian smoothing of gradients. We found that smoothing the variance estimate is more effective because it smooths the learning rate landscape rather than the descent direction.

Citation

@software{partin2026ladam,
  author = {Partin, Greg},
  title = {LAdam: Spatially-Aware Adaptive Optimization via Laplacian-Regularized Variance Estimates},
  year = {2026},
  url = {https://github.com/gpartin/ladam}
}

License

MIT. See LICENSE for details.
