PILOT: Policy-Informed Learned Optimization for Adaptive Deep Network Training


PILOT is an online adaptive optimizer that adjusts its update behavior during training. Instead of applying a fixed update rule from the first step to the last, PILOT reads a gradient-direction agreement signal and reshapes the update through a lightweight learned policy — no offline search, no meta-training, no second-order estimation.

Installation

pip install pilot-optimizer
Or install from source:
git clone https://github.com/SattamAltwaim/PILOT.git
cd PILOT
pip install -e .

Usage

from pilot import PILOT

optimizer = PILOT(
    model.parameters(),
    lr=1e-3,
    betas=(0.9, 0.999),
    weight_decay=1e-4,
    gamma=0.95,        # smoothing for agreement signal
    eta_phi=0.01,      # policy learning rate
    degree=2           # polynomial degree
)

# Assumes `model`, `criterion`, and `dataloader` are already defined.
for x, y in dataloader:
    loss = criterion(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

Key Results

CNN Architecture

Dataset | Optimizer | Accuracy (%) ↑ | Val Loss ↓ | Loss Var. ↓
FashionMNIST | Adam | 93.28 | 0.1957 | 0.0033
FashionMNIST | AdamW | 93.22 | 0.1944 | 0.0034
FashionMNIST | Lion | 92.91 | 0.2091 | 0.0041
FashionMNIST | AdaBelief | 93.66 | 0.1822 | 0.0046
FashionMNIST | PILOT (Ours) | 94.13 | 0.1719 | 0.0045
CIFAR-10 | Adam | 79.91 | 0.5794 | 0.0103
CIFAR-10 | Lion | 80.87 | 0.5487 | 0.0105
CIFAR-10 | PILOT (Ours) | 81.94 | 0.5302 | 0.0073

ResNet-18 Architecture

Dataset | Optimizer | Accuracy (%) ↑ | Val Loss ↓ | Loss Var. ↓
FashionMNIST | AdaBelief | 95.33 | 0.1711 | 0.0056
FashionMNIST | PILOT (Ours) | 95.71 | 0.2690 | 0.0030
CIFAR-10 | Adam | 93.18 | 0.2140 | 0.0073
CIFAR-10 | AdamW | 92.90 | 0.2514 | 0.0066
CIFAR-10 | PILOT (Ours) | 93.42 | 0.2496 | 0.0001

Full results, including all baselines, are in the paper.

Training Curves

Training loss and validation accuracy

Training loss (left, middle) and validation accuracy (right) across FashionMNIST (top) and CIFAR-10 (bottom) over 30 epochs.


Loss Landscape

Loss landscape trajectories — CIFAR-10 / SmallCNN

PILOT follows a distinct trajectory through the loss surface and converges to a lower-loss region than Adam, AdamW, Lion, and Sophia.


Method

PILOT monitors gradient-direction agreement (cosine similarity between successive gradients, smoothed into a running signal) and feeds it through a learned polynomial to produce three control knobs:

  • Momentum reliance — trust the accumulated trend vs. react to the current gradient
  • Variance normalization — how aggressively to apply adaptive scaling
  • Sign compression — use full gradient magnitudes vs. compress toward ±1
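
The agreement signal described above can be sketched as follows. This is a minimal illustration that assumes an exponential moving average with smoothing factor gamma over the cosine similarity of consecutive flattened gradients. The function names, the zero starting value, and the eps guard are hypothetical and not the package's API; the paper gives the exact definition.

```python
import math


def cosine_similarity(a, b, eps=1e-12):
    """Cosine similarity between two flattened gradient vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # eps guards against division by zero for all-zero gradients (assumption).
    return dot / (norm_a * norm_b + eps)


def update_agreement(signal, prev_grad, grad, gamma=0.95):
    """Assumed EMA form: signal <- gamma * signal + (1 - gamma) * cos(prev, cur)."""
    return gamma * signal + (1.0 - gamma) * cosine_similarity(prev_grad, grad)


# Toy gradient sequence: two aligned steps followed by a reversal.
grads = [[1.0, 0.0], [1.0, 0.1], [-1.0, 0.0]]
signal = 0.0  # assumed starting value
for prev, cur in zip(grads, grads[1:]):
    signal = update_agreement(signal, prev, cur)
```

A gamma close to 1 makes the signal react slowly to any single reversal, which is consistent with its description as a smoothing parameter.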

The policy has only 3(d+1) learnable coefficients, where d is the polynomial degree (9 for degree 2). It is initialized so that PILOT starts as Adam and learns to deviate from it. The policy is updated at each step via analytic meta-gradients at negligible cost. See the paper for the full derivation.
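
To make the coefficient count concrete, here is a hedged sketch of a polynomial policy with one coefficient row per knob. The class name, the clamping to [0, 1], the specific Adam-like starting values, and the linear blend in `compress_sign` are all assumptions made for illustration. The meta-gradient update is omitted, and the paper defines the actual parameterization.

```python
class PolynomialPolicy:
    """Hypothetical policy: three polynomials of the agreement signal s."""

    def __init__(self, degree=2):
        self.degree = degree
        # One row [c0, c1, ..., cd] per knob. Assumed Adam-like start:
        # constant terms only, so the knobs ignore s until they are learned.
        self.coeffs = {
            "momentum": [1.0] + [0.0] * degree,  # full momentum reliance
            "variance": [1.0] + [0.0] * degree,  # full variance normalization
            "sign": [0.0] + [0.0] * degree,      # no sign compression
        }

    def num_coefficients(self):
        # 3 knobs times (degree + 1) coefficients each.
        return sum(len(row) for row in self.coeffs.values())

    def __call__(self, s):
        knobs = {}
        for name, row in self.coeffs.items():
            value = sum(c * s ** k for k, c in enumerate(row))
            knobs[name] = min(1.0, max(0.0, value))  # clamp is an assumption
        return knobs


def compress_sign(g, alpha):
    """Blend a gradient entry toward its sign: alpha=0 keeps g, alpha=1 gives ±1."""
    sign = (g > 0) - (g < 0)
    return (1.0 - alpha) * g + alpha * sign


policy = PolynomialPolicy(degree=2)
knobs = policy(0.7)  # identical for any s at initialization
```

With these starting values the knobs are constant in s, which is one simple way to realize "starts as Adam"; other initializations are possible.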


Hyperparameters

Parameter | Description | Default / Range
lr | Learning rate | 1e-3
betas | Moment coefficients | (0.9, 0.999)
weight_decay | Decoupled weight decay | 0.01
gamma | Agreement signal smoothing | 0.85 to 0.99
eta_phi | Policy learning rate | 5e-4 to 5e-2
degree | Polynomial degree | 1 to 4

Tuned configurations from the paper

Dataset | Architecture | γ | η_φ | Degree
CIFAR-10 | SmallCNN | 0.882 | 0.00312 | 1
CIFAR-10 | ResNet-18 | 0.950 | 0.00500 | 2
FashionMNIST | SmallCNN | 0.950 | 0.01000 | 2
FashionMNIST | ResNet-18 | 0.957 | 0.00273 | 3

Selected via Bayesian optimization (TPE + ASHA early stopping, 30–40 trials).
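
For convenience, the tuned configurations listed above can be kept in a small lookup table. The helper name `pilot_kwargs` is hypothetical. The `lr`, `betas`, and `weight_decay` values are the defaults from the hyperparameter table, not necessarily the values used for each tuned run.

```python
# Tuned (gamma, eta_phi, degree) values copied from the table above.
TUNED_CONFIGS = {
    ("CIFAR-10", "SmallCNN"): {"gamma": 0.882, "eta_phi": 0.00312, "degree": 1},
    ("CIFAR-10", "ResNet-18"): {"gamma": 0.950, "eta_phi": 0.00500, "degree": 2},
    ("FashionMNIST", "SmallCNN"): {"gamma": 0.950, "eta_phi": 0.01000, "degree": 2},
    ("FashionMNIST", "ResNet-18"): {"gamma": 0.957, "eta_phi": 0.00273, "degree": 3},
}


def pilot_kwargs(dataset, architecture, lr=1e-3, betas=(0.9, 0.999), weight_decay=0.01):
    """Merge the default optimizer settings with a tuned policy configuration."""
    tuned = TUNED_CONFIGS[(dataset, architecture)]
    return {"lr": lr, "betas": betas, "weight_decay": weight_decay, **tuned}


kwargs = pilot_kwargs("CIFAR-10", "ResNet-18")
```

The result can be passed straight to the constructor shown in Usage, for example `PILOT(model.parameters(), **kwargs)`.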


Experiment Setup

30 epochs · cross-entropy loss · cosine annealing LR · batch size 128 · AMP · 3-epoch linear warmup for ResNet-18. Benchmarked on NVIDIA V100 GPUs. See the paper for full details.
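
The learning-rate schedule above can be approximated with the sketch below. It assumes per-epoch stepping, a warmup that ramps linearly up to the base rate, and cosine annealing to zero over the remaining epochs. These details, and the function name `lr_at_epoch`, are assumptions; the paper specifies the exact schedule.

```python
import math


def lr_at_epoch(epoch, base_lr=1e-3, total_epochs=30, warmup_epochs=3):
    """Learning rate for a 0-indexed epoch: linear warmup, then cosine annealing."""
    if epoch < warmup_epochs:
        # Ramp from base_lr / warmup_epochs up to base_lr (assumed warmup shape).
        return base_lr * (epoch + 1) / warmup_epochs
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (epoch - warmup_epochs) / max(1, total_epochs - warmup_epochs)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


# ResNet-18 setting: 3 warmup epochs; set warmup_epochs=0 for no warmup.
schedule = [lr_at_epoch(e) for e in range(30)]
```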


Citation

@misc{altuuaim2026pilotpolicyinformedlearnedoptimization,
      title={PILOT: Policy-Informed Learned Optimization for Adaptive Deep Network Training}, 
      author={Sattam Altuuaim and Lama Ayash and Muhammad Mubashar and Naeemullah Khan},
      year={2026},
      eprint={2605.24570},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2605.24570}, 
}

License

MIT
