
Golden Pendulum MTL: Anti-resonance equilibria for gradient balancing in multi-task learning


Replace Nash-MTL corner solutions with golden-ratio (φ) weights that prevent harmonic lock-in between competing task gradients.

In the paper's benchmark, standard Nash-MTL allocated 89% of gradient bandwidth to one task while starving another at 3.7%. Golden Pendulum achieves wmin/wmax = 0.24 while maintaining Pareto optimality.

The Problem

Multi-task gradient methods (MGDA, Nash-MTL, PCGrad) suffer from corner solutions: when task losses have disparate magnitudes, the optimizer converges to simplex vertices where one task dominates.

Method            wmin/wmax   Max Weight   Corner?
Equal Weights     1.00        0.25         No (but ignores conflicts)
Nash-MTL          0.04        0.89         Yes
GradNorm          ~0.10       ~0.60        Sometimes
Golden Pendulum   0.24        0.45         No
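The collapse is easy to see numerically. Below is a small, self-contained illustration (hypothetical task names and magnitudes, not from the library) of how an unweighted gradient sum is dominated by the largest-magnitude task:

```python
# Illustrative only: why unweighted gradient sums collapse toward the
# largest task when loss magnitudes are disparate.
import numpy as np

rng = np.random.default_rng(0)
g_ranking = 200.0 * rng.standard_normal(1000)    # large-magnitude task gradient
g_regression = 0.01 * rng.standard_normal(1000)  # small-magnitude task gradient

combined = g_ranking + g_regression

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# The summed direction is indistinguishable from the dominant task's gradient,
# so the small task receives essentially no signal.
print(f"cos(combined, g_ranking)    = {cosine(combined, g_ranking):.6f}")
print(f"cos(combined, g_regression) = {cosine(combined, g_regression):.6f}")
```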

The Solution

Golden Pendulum MTL derives, from the physics of coupled wave oscillators, the result that the stable equilibrium lies at golden-ratio-spaced points (φ = (1+√5)/2 ≈ 1.618), not at simplex corners. These weights are maximally incommensurate: no pair has a rational ratio, which prevents the harmonic resonances that cause lock-in.

Algorithm (3 lines to integrate):

from golden_pendulum import GoldenPendulumMTL

balancer = GoldenPendulumMTL(n_tasks=4, lam=0.5)

# In your training loop (replaces loss.backward()):
weights = balancer.backward(losses, model)  # losses: Dict[str, Tensor]
optimizer.step()

Installation

pip install golden-pendulum-mtl

Or from source:

git clone https://github.com/Zynerji/GoldenPendulumMTL.git
cd GoldenPendulumMTL
pip install -e ".[dev]"

Quick Start

import torch
import torch.nn as nn
from golden_pendulum import GoldenPendulumMTL

# Your multi-task model
model = YourMultiHeadModel()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
balancer = GoldenPendulumMTL(n_tasks=3, lam=0.5)

for batch in dataloader:
    optimizer.zero_grad()

    # Compute per-task losses (can have 300x magnitude disparity!)
    losses = {
        "ranking": ranking_loss,       # ~200
        "classification": cls_loss,    # ~0.69
        "regression": reg_loss,        # ~0.01
    }

    # Golden Pendulum backward (replaces loss.backward())
    weights = balancer.backward(losses, model)
    optimizer.step()

    # Monitor balance (should be >0.20, not 0.04 like Nash-MTL)
    print(f"Balance: {balancer.weight_balance_ratio:.3f}")

How It Works

Algorithm 1: Golden Pendulum MTL

  1. Compute per-task gradients g_k = ∇_θ L_k
  2. Normalize each gradient: ĝ_k = g_k / ‖g_k‖₂ (removes magnitude disparity)
  3. Compute the scale-free Gram matrix ĜᵀĜ
  4. Solve the golden-ratio QP via 25 iterations of projected gradient descent:

       min_α  αᵀ ĜᵀĜ α  +  λ ‖α − α_golden‖₁

     where α_golden,k = φ^(k−1) / Σ_j φ^(j−1) are the golden-ratio target weights
  5. Apply PCGrad conflict resolution to the normalized gradients
  6. Set the final gradient to the weighted sum of the conflict-resolved gradients
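The library's internals are not shown on this page, so the following is a minimal NumPy sketch of Algorithm 1 on pre-computed task gradient vectors. All names are illustrative, and the simplex projection is approximated by clipping and renormalizing:

```python
# Minimal sketch of Algorithm 1 (illustrative, not the library's code).
import numpy as np

PHI = (1 + np.sqrt(5)) / 2

def golden_targets(k):
    # Golden-ratio target weights: phi^(k-1) spacing, normalized to sum to 1.
    w = PHI ** np.arange(k)
    return w / w.sum()

def golden_pendulum_step(grads, lam=0.5, n_iter=25, lr=0.05):
    # Step 2: normalize each gradient to remove magnitude disparity.
    G_hat = np.stack([g / np.linalg.norm(g) for g in grads])
    # Step 3: scale-free Gram matrix.
    gram = G_hat @ G_hat.T
    k = len(grads)
    target = golden_targets(k)
    alpha = np.full(k, 1.0 / k)
    # Step 4: projected gradient descent on the golden-ratio QP.
    for _ in range(n_iter):
        grad_qp = 2 * gram @ alpha + lam * np.sign(alpha - target)
        alpha = np.clip(alpha - lr * grad_qp, 0.0, None)
        alpha /= alpha.sum()  # approximate projection back onto the simplex
    # Step 5: PCGrad-style conflict resolution on normalized gradients.
    resolved = G_hat.copy()
    for i in range(k):
        for j in range(k):
            if i != j and resolved[i] @ G_hat[j] < 0:
                resolved[i] -= (resolved[i] @ G_hat[j]) * G_hat[j]
    # Step 6: weighted sum of conflict-resolved gradients.
    return alpha, (alpha[:, None] * resolved).sum(axis=0)

rng = np.random.default_rng(1)
grads = [s * rng.standard_normal(64) for s in (200.0, 0.69, 0.01)]
alpha, update = golden_pendulum_step(grads)
print("weights:", np.round(alpha, 3), "sum:", alpha.sum())
```

In the real library these gradients would come from per-task backward passes on the shared parameters; here they are random vectors with the magnitude disparity from the Quick Start.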

Why Golden Ratio?

The golden ratio φ is the "most irrational" number: its continued-fraction expansion [1; 1, 1, 1, ...] yields rational approximations that converge more slowly than those of any other irrational. This means:

  • No pair of φ-spaced weights has a rational ratio
  • Gradient updates are quasiperiodic (not periodic)
  • No task can "pump" energy from others through resonance

This is the same reason φ appears in phyllotaxis (sunflower seeds), quasicrystals, and the KAM theorem for orbital stability.
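The "most irrational" claim can be checked numerically: the convergents of [1; 1, 1, 1, ...] are ratios of consecutive Fibonacci numbers, and they close in on φ only slowly:

```python
# Convergents of phi's continued fraction [1; 1, 1, ...] are ratios of
# consecutive Fibonacci numbers, the slowest-converging rational
# approximations among all irrationals.
phi = (1 + 5 ** 0.5) / 2

a, b = 1, 1  # consecutive Fibonacci numbers
for _ in range(10):
    a, b = b, a + b

# Even the 10th convergent, 144/89, still misses phi by ~5.6e-5.
print(f"{b}/{a} = {b / a:.7f}, error = {abs(b / a - phi):.2e}")
```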

Golden-Ratio Weights for K Tasks

K    Weights                            wmin/wmax
2    (0.382, 0.618)                     0.618
3    (0.186, 0.302, 0.488)              0.382
4    (0.106, 0.171, 0.276, 0.447)       0.237
8    (0.019, 0.031, ..., 0.277)         0.069
16   (0.001, 0.001, ..., 0.172)         0.004
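The targets follow the formula α_k = φ^(k−1) / Σ_j φ^(j−1). The sketch below (not the packaged implementation, which may also apply the min_weight_fraction floor) reproduces e.g. the K = 2 and K = 4 rows:

```python
# Sketch of the phi-spaced target weights from Algorithm 1:
#   alpha_k = phi^(k-1) / sum_j phi^(j-1)
phi = (1 + 5 ** 0.5) / 2

def golden_ratio_weights(n_tasks):
    raw = [phi ** k for k in range(n_tasks)]
    total = sum(raw)
    return [w / total for w in raw]

for k in (2, 3, 4):
    w = golden_ratio_weights(k)
    print(k, [round(x, 3) for x in w], "wmin/wmax =", round(w[0] / w[-1], 3))
```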

Framework Integration

PyTorch Lightning

from golden_pendulum import GoldenPendulumCallback

class MyModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.automatic_optimization = False
        self.golden = GoldenPendulumCallback(lam=0.5)

    def training_step(self, batch, batch_idx):
        losses = {"task_a": loss_a, "task_b": loss_b}
        opt = self.optimizers()
        opt.zero_grad()
        weights = self.golden.on_train_batch(losses, self, batch_idx)
        opt.step()

Hugging Face Transformers

from golden_pendulum import GoldenPendulumMTL

balancer = GoldenPendulumMTL(n_tasks=3, lam=0.5)

# In your custom Trainer.training_step:
weights = balancer.backward(losses, model)

Weight Logging

from golden_pendulum import GoldenPendulumMTL, WeightLogger

logger = WeightLogger(log_file="weights.jsonl", log_every=100)
balancer = GoldenPendulumMTL(n_tasks=4)

for step, batch in enumerate(loader):
    weights = balancer.backward(losses, model)
    logger.log(step, weights)

API Reference

GoldenPendulumMTL(n_tasks, lam, n_iter, min_weight_fraction, pcgrad)

Parameter             Default   Description
n_tasks               0         Expected number of tasks (0 = any)
lam                   0.5       Golden-ratio regularization strength
n_iter                25        QP solver iterations
min_weight_fraction   0.02      Minimum weight = fraction / K
pcgrad                True      Enable PCGrad conflict resolution

Methods:

  • backward(losses, model) → Dict[str, float] of task weights
  • weight_balance_ratio → wmin/wmax
  • mean_weights(last_n) → mean weights over last N steps
  • golden_targets → target golden-ratio weights

golden_nash_backward(losses, model, lam, n_iter, min_weight_fraction, pcgrad)

Functional API — same algorithm, no state tracking.

golden_ratio_weights(n_tasks)

Returns the φ-spaced target weights for K tasks.

Pro Features

Advanced capabilities for production multi-task training.

AdaptiveLambda — Auto-tune regularization

No more manual lambda tuning. Adapts based on real-time gradient conflict severity and loss magnitude disparity.

from golden_pendulum.pro import AdaptiveLambda

adaptive = AdaptiveLambda(lam_init=0.5, lam_min=0.05, lam_max=2.0)

for step, batch in enumerate(loader):
    optimizer.zero_grad()
    losses = compute_losses(model, batch)
    weights = adaptive.backward(losses, model)
    optimizer.step()
    # Lambda auto-adjusts: high conflict -> stronger regularization
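The adaptation rule itself is not documented on this page; the following is a hypothetical sketch of one way λ could track conflict severity, not AdaptiveLambda's actual logic:

```python
# Hypothetical lambda-adaptation rule (NOT AdaptiveLambda's implementation):
# raise lambda when gradient conflict is high, lower it when conflict is low,
# clamped to [lam_min, lam_max] as in the constructor above.
def adapt_lambda(lam, conflict_ratio, lam_min=0.05, lam_max=2.0, rate=0.1):
    # conflict_ratio: fraction of task-gradient pairs with negative cosine.
    target = 0.25  # illustrative target conflict level
    lam *= 1.0 + rate * (conflict_ratio - target)
    return min(max(lam, lam_min), lam_max)

lam = 0.5
for conflict in (0.6, 0.6, 0.1, 0.1):  # high conflict raises lam, low lowers it
    lam = adapt_lambda(lam, conflict)
    print(round(lam, 4))
```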

CurriculumScheduler — Multi-phase training

Manages Phase A/B/C/D training with automatic backbone freeze/unfreeze.

from golden_pendulum.pro import CurriculumScheduler, Phase

curriculum = CurriculumScheduler(
    phases=[
        Phase("A_ranking", tasks={"returns", "rank", "quality", "embed"},
              steps=15000, freeze_backbone=False, lam=0.5, lr=5e-5),
        Phase("B_risk", tasks={"vol", "mae", "kelly", "risk"},
              steps=10000, freeze_backbone=True, lam=0.3, lr=1e-4),
        Phase("C_meta", tasks={"regime", "calibration", "confidence"},
              steps=10000, freeze_backbone=True, lam=0.3, lr=1e-4),
    ],
    backbone_params=lambda model: model.backbone.parameters(),
)

while not curriculum.is_complete:
    optimizer.zero_grad()
    weights = curriculum.backward(all_losses, model)
    optimizer.step()
    curriculum.step(model)  # Auto-freezes backbone at phase transitions

DynamicK — Hierarchical task grouping

Group tasks by function, run Golden Pendulum within and across groups. Reduces O(K^2) cost for K=16+ tasks.

from golden_pendulum.pro import DynamicK, TaskGroup

dk = DynamicK(groups=[
    TaskGroup("ranking", tasks={"returns", "rank", "quality"}),
    TaskGroup("risk", tasks={"vol", "mae", "kelly", "risk"}),
    TaskGroup("meta", tasks={"regime", "calibration", "confidence"}),
])
weights = dk.backward(all_16_losses, model)

# Or auto-group by gradient similarity:
dk = DynamicK(auto_group=True, similarity_threshold=0.5)
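The savings come from simple pair counting: the Gram matrix needs one gradient dot product per task pair. A hypothetical layout of 16 tasks in four groups of four (assumed for illustration) shows the reduction:

```python
# Pair-counting arithmetic behind the O(K^2) reduction (illustrative layout).
def n_pairs(k):
    return k * (k - 1) // 2

flat = n_pairs(16)                     # one balancer over all 16 tasks
grouped = 4 * n_pairs(4) + n_pairs(4)  # within 4 groups of 4, plus across groups
print(flat, grouped)                   # 120 vs 30 gradient dot products
```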

DiagnosticsEngine — Real-time conflict analysis

Deep visibility into gradient conflicts, resonance detection, and convergence monitoring.

from golden_pendulum.pro import DiagnosticsEngine

diag = DiagnosticsEngine()
report = diag.analyze(losses, model)
print(f"Conflict ratio: {report.conflict_ratio}")
print(f"Norm ratio: {report.norm_ratio}x")
print(f"Conflicting pairs: {report.conflict_pairs}")
if diag.alerts:
    for alert in diag.alerts:
        print(f"ALERT: {alert}")

Presets — Battle-tested configurations

from golden_pendulum.pro import get_preset, list_presets

list_presets()  # finance_4phase, finance_quick, nlp_multitask, vision_multitask, robotics_control

preset = get_preset("finance_4phase")  # Paper's exact configuration
scheduler = CurriculumScheduler(phases=preset.phases)

Benchmarks

python benchmarks/compare_methods.py

Reproduces the paper's core finding on synthetic multi-head models.

Paper

Golden Pendulum Multi-Task Learning: Anti-Resonance Equilibria for Gradient Balancing in Multi-Head Transformer Training

Christian Knopp, 2026

See paper/golden_pendulum_mtl.pdf for the full paper.

Key results (42.5M-param, 16-head financial transformer):

Metric                 AdamW+EMA   Nash-MTL   Golden Pendulum
GOLD metrics           5           1          7
FAIL metrics           2           6          2
Trading IC             0.012       -0.002     0.033
Decile Monotonicity    0.576       0.455      0.879
Vol Forecast IC        0.645       -0.795     0.695
Balance (wmin/wmax)    1.00        0.04       0.24

Citation

@article{knopp2026golden,
  title={Golden Pendulum Multi-Task Learning: Anti-Resonance Equilibria for
         Gradient Balancing in Multi-Head Transformer Training},
  author={Knopp, Christian},
  year={2026}
}

Author

Christian Knopp (@Conceptual1), cknopp@gmail.com

License

Apache 2.0. See LICENSE.
