
LAdam: Laplacian Adam optimizer + physics-inspired ML toolkit (WaveNorm, ChiAnnealScheduler, neuron reordering)

This project has been archived.

The maintainers of this project have marked this project as archived. No new releases are expected.


LAdam

Laplacian Adam optimizer + physics-inspired ML toolkit for PyTorch

Requires Python 3.8+.

LAdam is a drop-in Adam replacement that applies discrete Laplacian regularization to Adam's second-moment estimate, plus a toolkit of physics-inspired components for deep learning:

  • LAdam / LAdaGrad / LRMSProp -- Spatially-coupled adaptive optimizers
  • ChiAnnealScheduler -- 3-phase learning rate schedule (warmup -> constant -> cosine-squared decay)
  • WaveNorm / WaveNormDamped -- Wave-equation normalization layers (BatchNorm alternative)
  • ChiNorm -- Adaptive chi-field normalization
  • Neuron Reordering -- Correlation-based neuron permutation for MLPs

Installation

pip install ladam

Quick Start

Optimizer (drop-in Adam replacement)

from ladam import LAdam

optimizer = LAdam(model.parameters(), lr=1e-3, c2=1e-4)

for inputs, targets in dataloader:
    loss = criterion(model(inputs), targets)  # criterion needs the targets as well
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

Full toolkit (optimizer + scheduler + normalization)

from ladam import LAdam, ChiAnnealScheduler, WaveNorm
import torch.nn as nn

# Build model with WaveNorm instead of BatchNorm
model = nn.Sequential(
    nn.Conv2d(1, 32, 3, padding=1),
    WaveNorm(32),            # <-- replaces nn.BatchNorm2d(32)
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(32, 10),
)

# Optimizer + scheduler
optimizer = LAdam(model.parameters(), lr=1e-3, c2=1e-4)
scheduler = ChiAnnealScheduler(optimizer, total_steps=10000)

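for step in range(10000):
    inputs, targets = next_batch()  # user-supplied helper: returns one (inputs, targets) batch
    loss = criterion(model(inputs), targets)  # criterion: e.g. nn.CrossEntropyLoss()
    loss.backward()
    optimizer.step()
    scheduler.step()
    optimizer.zero_grad()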
Benchmark Results (64-Experiment Campaign)

Validated across 55+ experiments spanning 8 task domains, 3 seeds each, p-values computed.

Head-to-Head: LAdam vs Adam

| Task | Architecture | Metric | Adam | LAdam | Delta | Verdict |
|---|---|---|---|---|---|---|
| Wave Eq. PINN | 5x128 MLP | L2 Error | 0.0310 | 0.0172 | -44.6% | WIN |
| FashionMNIST | MLP+Chi+Reorder | Accuracy | 89.76% | 90.56% | +0.80% | WIN |
| CIFAR-10 | ResNet-18 | Accuracy | 67.96% | 73.39% | +5.43% | WIN |
| FashionMNIST | Transformer | Accuracy | 89.46% | 89.66% | +0.20% | WIN |
| VOC 2012 | DeepLabV3 | mIoU | 5.40% | 5.61% | +0.21% | WIN |
| Rosenbrock | Optimization | H0 Reject | -- | REJECTED | -- | WIN |
| Detection | Faster-RCNN | mAP | 74.01% | 71.40% | -2.61% | LOSS |
| Audio | VGGish | Accuracy | 65.12% | 64.26% | -0.86% | LOSS |
| Diffusion | U-Net | FID proxy | 0.0340 | 0.0349 | +2.7% | LOSS |
| GPT-2 | Transformer | Perplexity | 152 | 1098 | -- | LOSS |

Win rate on decisive results: 10/22 = 45%

Head-to-Head: WaveNorm vs BatchNorm

| Task | Architecture | Metric | BatchNorm | WaveNorm | Delta | Verdict |
|---|---|---|---|---|---|---|
| FashionMNIST | CNN | Accuracy | 91.18% | 91.94% | +0.76% | WIN |
| SVHN | CNN | Accuracy | 91.63% | 92.20% | +0.57% | WIN |
| CalHousing | MLP | MSE | 0.213 | 0.184 | -13.5% | WIN |
| WineQuality | MLP | MSE | 0.668 | 0.663 | -0.8% | TIE |

Where LAdam Excels

| Domain | Why It Works | Recommended c2 |
|---|---|---|
| PINNs / Scientific ML | PDE losses have spatial structure in weight space | 1e-5 |
| Transformers | Attention heads have correlated geometry | 1e-4 |
| MLPs + neuron reorder | Reordering creates spatial structure for Laplacian | 3e-4 |
| ResNets | Channel Laplacian smooths correlated filters | 3e-4 |

Where Adam Wins (use Adam instead)

| Domain | Why | Recommendation |
|---|---|---|
| LLM fine-tuning | Embedding layers need per-token specialization | Use Adam/AdamW |
| Diffusion models | U-Net denoising is already well-conditioned | Use Adam |
| RL / Policy gradients | High-variance gradients overwhelm spatial coupling | Use Adam |
| Audio classification | Non-spatial features in spectrograms | Use Adam |

Auto-Configuration (New in v0.4.0)

Automatically configure LAdam with per-layer c2 values chosen from your model architecture:

from ladam import auto_configure, analyze_model
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)

# One-line setup — returns (optimizer, scheduler_or_None)
optimizer, scheduler = auto_configure(model, lr=1e-3, total_steps=10000)

# Or just analyze without creating an optimizer
report = analyze_model(model)
print(report)
# {'architecture': 'mlp', 'total_params': 203530, 'recommendation': 'good_fit',
#  'suitable_pct': 99.9, 'c2_map': {'linear': 1e-05, 'conv': 0.0, ...}}

auto_configure assigns per-layer-type c2 values derived from the 64-experiment benchmark:

  • Linear layers: c2=1e-5 (biggest wins in PINNs and MLPs)
  • Conv/Attention/Norm/Embedding: c2=0 (Laplacian coupling not beneficial)
  • Returns a ChiAnnealScheduler when total_steps > 0

Components

Optimizers

| Optimizer | Base | Laplacian target | Best for |
|---|---|---|---|
| LAdam | Adam | Second moment v_t | PINNs, transformers, CNNs |
| LAdaGrad | AdaGrad | Cumulative sum G_t | Sparse features, NLP |
| LRMSProp | RMSProp | Running average v_t | RNNs, non-stationary losses |

from ladam import LAdam, LAdaGrad, LRMSProp, suggest_c2

optimizer = LAdam(model.parameters(), lr=1e-3, c2=suggest_c2('pinn'))

ChiAnnealScheduler

3-phase learning rate schedule inspired by chi-field annealing dynamics:

  1. Warmup (5%): Linear ramp from 0 to base_lr
  2. Constant (65%): Full learning rate for main training
  3. Settle (30%): Cosine-squared decay to 0

from ladam import LAdam, ChiAnnealScheduler

optimizer = LAdam(model.parameters(), lr=1e-3)
scheduler = ChiAnnealScheduler(
    optimizer,
    total_steps=len(dataloader) * num_epochs,
    warmup_frac=0.05,
    settle_frac=0.30,
)

for epoch in range(num_epochs):
    for inputs, targets in dataloader:
        loss = criterion(model(inputs), targets)  # criterion needs the targets as well
        loss.backward()
        optimizer.step()
        scheduler.step()
        optimizer.zero_grad()
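The three phases can be written as a single multiplier on the base learning rate. The sketch below is a dependency-free illustration, not the library's implementation: the function name `chi_anneal_factor` is hypothetical, and the exact decay shape (here the square of a quarter-period cosine) is an assumption based only on the "cosine-squared decay to 0" description.

```python
import math


def chi_anneal_factor(step, total_steps, warmup_frac=0.05, settle_frac=0.30):
    """Multiplier applied to base_lr at a given step (hypothetical helper)."""
    warmup_steps = round(total_steps * warmup_frac)
    settle_steps = round(total_steps * settle_frac)
    settle_start = total_steps - settle_steps

    if step < warmup_steps:
        # Phase 1: linear ramp from 0 up to the base learning rate.
        return step / warmup_steps
    if step < settle_start or settle_steps == 0:
        # Phase 2: hold the base learning rate.
        return 1.0
    # Phase 3: cosine-squared decay that reaches 0 at total_steps (assumed shape).
    progress = min(1.0, (step - settle_start) / settle_steps)
    return math.cos(0.5 * math.pi * progress) ** 2


# Multipliers for a 1000-step run: 50 warmup steps, 650 constant, 300 settle.
factors = [chi_anneal_factor(s, 1000) for s in range(1001)]
```

A function like this could also be passed as `lr_lambda` to `torch.optim.lr_scheduler.LambdaLR` to try the same schedule shape with any optimizer.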

WaveNorm / WaveNormDamped

Drop-in BatchNorm replacements that normalize via wave-equation evolution:

from ladam import WaveNorm, WaveNormDamped

# Undamped -- best for regression tasks
norm = WaveNorm(num_features=64, n_steps=3)

# Damped -- fixes oscillation for classification with bounded outputs
norm = WaveNormDamped(num_features=64, n_steps=3, init_damping=0.5)

WaveNormDamped adds learnable per-feature damping. It automatically learns:

  • Low damping for features that benefit from wave momentum (regression)
  • High damping for features that need to settle (bounded classification)

ChiNorm

Adaptive normalization based on local energy density:

from ladam import ChiNorm

norm = ChiNorm(num_features=64, chi0_init=1.0, g_init=0.1)

Neuron Reordering

Reorder MLP neurons so that correlated neurons sit next to each other, which gives LAdam's Laplacian spatial structure to exploit:

from ladam import LAdam, compute_neuron_order, reorder_linear_layer

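# 1. Train with Adam briefly to get meaningful activations
# 2. Collect hidden-layer activations
#    (collect_hidden_activations is not imported from ladam above;
#    it is assumed here to be a user-supplied helper)
activations = collect_hidden_activations(model, dataloader)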

# 3. Compute optimal ordering (greedy nearest-neighbor TSP)
order = compute_neuron_order(activations)

# 4. Reorder weights (network output is unchanged)
reorder_linear_layer(model.fc1, model.fc2, order)

# 5. Switch to LAdam for the rest of training
optimizer = LAdam(model.parameters(), lr=1e-3, c2=3e-4)
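To make steps 3 and 4 concrete, here is a small pure-Python sketch of a greedy nearest-neighbor ordering and of why permuting hidden units leaves the network output unchanged. All function names below are hypothetical and are not the `ladam` API; using plain Pearson correlation as the similarity and starting from neuron 0 are assumptions.

```python
import math


def correlation(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0
    return cov / (norm_x * norm_y)


def greedy_neuron_order(activations):
    """activations[s][k] is neuron k's activation on sample s."""
    n_neurons = len(activations[0])
    columns = [[row[k] for row in activations] for k in range(n_neurons)]
    order = [0]  # start at neuron 0 (arbitrary choice)
    remaining = list(range(1, n_neurons))
    while remaining:
        last = columns[order[-1]]
        # Append the unvisited neuron most correlated with the last one placed.
        best = max(remaining, key=lambda k: correlation(last, columns[k]))
        order.append(best)
        remaining.remove(best)
    return order


def permute_hidden(w1, b1, w2, order):
    """Permute hidden units: rows of w1 and b1, and the matching columns of w2."""
    new_w1 = [w1[k] for k in order]
    new_b1 = [b1[k] for k in order]
    new_w2 = [[row[k] for k in order] for row in w2]
    return new_w1, new_b1, new_w2


def forward(w1, b1, w2, x):
    """Two-layer MLP with ReLU: w2 @ relu(w1 @ x + b1)."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]


# Four samples of four neurons; neurons 0 and 2 move together, as do 1 and 3.
activations = [
    [1, 4, 1, 4],
    [2, 1, 2, 1],
    [3, 3, 3, 3],
    [4, 2, 5, 1],
]
order = greedy_neuron_order(activations)

# Toy weights: 2 inputs -> 4 hidden units -> 1 output.
w1 = [[1, -1], [2, 0], [0, 3], [-1, 1]]
b1 = [0, 1, -2, 0.5]
w2 = [[1, 2, 3, 4]]
x = [1.0, 2.0]
before = forward(w1, b1, w2, x)
after = forward(*permute_hidden(w1, b1, w2, order), x)
```

Because the same permutation is applied to the rows of the first layer and the columns of the second, `before` and `after` are the same function of the input; only the adjacency of neurons in weight space changes.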

Parameters

LAdam

| Parameter | Default | Description |
|---|---|---|
| lr | 1e-3 | Learning rate |
| betas | (0.9, 0.999) | EMA coefficients |
| eps | 1e-8 | Numerical stability |
| weight_decay | 0 | L2 regularization |
| c2 | 1e-4 | Laplacian coupling strength |
| mode | 'variance_lap' | Which quantity to smooth |
| stencil | '9point' | Laplacian stencil type |

Choosing c2

| c2 | Best For | Notes |
|---|---|---|
| 1e-5 | PINNs, scientific ML | Gentle coupling, biggest error reduction |
| 1e-4 | Transformers, general | Safe default |
| 3e-4 | MLPs with reorder, ResNets | Stronger coupling for structured weights |
| 0 | Disable | Falls back to standard Adam |

How It Works

Standard Adam computes per-parameter adaptive learning rates from the second moment:

v_t = beta2 * v_{t-1} + (1 - beta2) * g_t^2
lr_effective = lr / (sqrt(v_t) + eps)

LAdam adds a Laplacian coupling step:

v_smooth = v_t + c2 * laplacian(v_t)
lr_effective = lr / (sqrt(v_smooth) + eps)

The discrete Laplacian is computed via a single F.conv2d kernel (9-point isotropic by default) -- efficient and GPU-friendly. Overhead: ~2-5% wall-clock time per step.
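The coupling step can be illustrated on a small 2-D grid of second-moment values. This is a minimal pure-Python sketch rather than the library's code: the stencil weights are one common 9-point isotropic choice, and both the edge handling (index clamping) and the floor at zero are assumptions. All function names are hypothetical.

```python
import math

# One common 9-point isotropic Laplacian stencil. The exact weights LAdam
# uses are not stated above, so treat these as an assumption.
STENCIL = (
    (1 / 6, 4 / 6, 1 / 6),
    (4 / 6, -20 / 6, 4 / 6),
    (1 / 6, 4 / 6, 1 / 6),
)


def laplacian(grid):
    """Apply STENCIL to a 2-D list, clamping indices at the edges (assumed)."""
    rows, cols = len(grid), len(grid[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ni = min(max(i + di, 0), rows - 1)
                    nj = min(max(j + dj, 0), cols - 1)
                    total += STENCIL[di + 1][dj + 1] * grid[ni][nj]
            out[i][j] = total
    return out


def smooth_second_moment(v, c2=1e-4):
    """v_smooth = v + c2 * laplacian(v), floored at 0 so the sqrt stays defined."""
    lap = laplacian(v)
    return [[max(0.0, v_ij + c2 * l_ij) for v_ij, l_ij in zip(v_row, l_row)]
            for v_row, l_row in zip(v, lap)]


def effective_lr(v_smooth, lr=1e-3, eps=1e-8):
    """Per-parameter step size lr / (sqrt(v_smooth) + eps)."""
    return [[lr / (math.sqrt(x) + eps) for x in row] for row in v_smooth]


# A single large second-moment entry surrounded by zeros.
spike = [[0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0],
         [0.0, 0.0, 0.0]]
# c2 is exaggerated here so the spreading is visible in a 3x3 grid.
smoothed = smooth_second_moment(spike, c2=0.03)
step_sizes = effective_lr(smoothed)

# A constant field has zero Laplacian, so smoothing leaves it unchanged.
flat = [[2.0] * 4 for _ in range(4)]
flat_smoothed = smooth_second_moment(flat)
```

In the spike example the center value shrinks and its neighbors pick up a small share, so neighboring parameters end up with more similar effective learning rates; with c2=0 the function returns v unchanged, matching the fallback to standard Adam.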

FAQ

Q: Does this work for LLMs? A: No. LAdam hurts LLM training -- embedding layers need per-token specialization that the Laplacian destroys. Use Adam/AdamW.

Q: How does WaveNorm compare to BatchNorm? A: WaveNorm beats BN on 3/4 tested tasks, with the biggest win on regression (-13.5% MSE). Use WaveNormDamped for classification.

Q: What's the overhead? A: LAdam adds ~2-5% wall-clock time (single fused conv kernel). WaveNorm adds ~10-15% (3 leapfrog steps).

Q: Why not smooth the gradient instead of the variance? A: Osher et al. (2018) explored Laplacian smoothing of gradients. We found that smoothing the variance estimate is more effective because it smooths the learning rate landscape rather than the descent direction.

Citation

@software{partin2026ladam,
  author = {Partin, Greg},
  title = {LAdam: Spatially-Aware Adaptive Optimization via Laplacian-Regularized Variance Estimates},
  year = {2026},
  url = {https://github.com/gpartin/ladam}
}

Support & Sponsorship

LAdam is free and open-source (MIT license). If it helps your research or saves you training time, consider supporting development:

| Tier | Price | What You Get |
|---|---|---|
| Community | Free | Full library, all features, GitHub Issues |
| Pro Sponsor | $29/mo | Priority bug fixes, architecture review, direct support |
| Enterprise | Custom | Consulting, custom integration, model optimization reports |


License

MIT. See LICENSE for details.
