PulseOpt

PulseOpt: episodic adaptive control for optimizer dynamics.

pulseopt wraps any PyTorch optimizer with an episode-level bandit that adapts a learning-rate multiplier and a gradient-noise level online. Instead of committing to one static schedule, it evaluates short training episodes ("pulses"), scores them with a shaped log-loss-improvement reward, and picks the next configuration with a discounted-UCB controller. The underlying method is Adaptive Episodic Exploration Scheduling (AEES), exposed as the AEES class.

It is small, has a single dependency (torch>=2.0), and is designed to drop into an existing training loop with two extra calls per step.

Install

pip install pulseopt

Quick start

import torch
from torch import nn
from pulseopt import AEES

model = nn.Linear(8, 4)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=1000)

aees = AEES(
    optimizer,
    lr_candidates=[0.5, 1.0, 2.0],   # tried as multipliers on the optimizer's base LR
    noise_candidates=[0.0, 0.005],   # tried as gradient-noise std
    episode_length=50,
    lr_scheduler=scheduler,           # optional — AEES calls .step() for you
    seed=0,
)

for step in range(1000):
    aees.step_start(step)            # selects the candidate for this step
    optimizer.zero_grad()
    loss = model(torch.randn(32, 8)).pow(2).mean()
    loss.backward()
    aees.step_end(loss)              # runs optimizer.step() + scheduler.step()

aees.finalize()
logs = aees.get_logs()
print(f"Episodes run: {len(logs['episode_rewards'])}")
print(f"Last selected LR multiplier: {logs['selected_lr_values'][-1]}")

The wrapper owns optimizer.step() and lr_scheduler.step(); you keep zero_grad() and loss.backward(). The LR multiplier is applied transiently around optimizer.step(), so any external scheduler still advances on the optimizer's base learning rate.
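
"Transient" here means the multiplier is applied and rolled back around each step, along the lines of this sketch (the helper name scaled_lr is illustrative, not part of the pulseopt API):

import contextlib

@contextlib.contextmanager
def scaled_lr(optimizer, multiplier):
    # Temporarily scale every param group's LR, then restore the base
    # value so an external scheduler keeps stepping the unscaled LR.
    base_lrs = [group["lr"] for group in optimizer.param_groups]
    for group in optimizer.param_groups:
        group["lr"] *= multiplier
    try:
        yield
    finally:
        for group, lr in zip(optimizer.param_groups, base_lrs):
            group["lr"] = lr

# Conceptually, inside the wrapper:
#   with scaled_lr(optimizer, selected_multiplier):
#       optimizer.step()
#   lr_scheduler.step()   # still advances on the base LR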

How it works

  • Episode: a fixed-length window of training steps with one frozen candidate (LR multiplier, noise std).
  • Reward: log-EMA-loss improvement over the episode, minus an instability penalty proportional to within-episode loss variance, clipped to [-1, 1] (sketched after this list).
  • Controller: discounted-UCB by default; an optional bucketed-contextual variant uses a coarse loss-trend (and optional training-phase) bucket to share information across similar regimes.
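
A minimal sketch of the reward shaping, assuming losses holds the per-step training losses of one episode and the EMA values stay positive (episode_reward, ema_decay, and lam are illustrative names; lam plays the role of reward_instability_lambda):

import math
import statistics

def episode_reward(losses, ema_decay=0.9, lam=0.1, clip=(-1.0, 1.0)):
    # Exponential moving average of the loss across the episode.
    emas = [losses[0]]
    for loss in losses[1:]:
        emas.append(ema_decay * emas[-1] + (1 - ema_decay) * loss)
    # Log-EMA-loss improvement, minus a variance-based instability
    # penalty, clipped to [-1, 1].
    improvement = math.log(emas[0]) - math.log(emas[-1])
    penalty = lam * statistics.pvariance(losses)
    return max(clip[0], min(clip[1], improvement - penalty))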

Axes with a single candidate are treated as fixed constants and get no controller — passing lr_candidates=[1.0] keeps the LR multiplier disabled, and noise_candidates=[0.0] keeps gradient noise off.
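
For intuition, a textbook discounted-UCB controller over a small candidate set looks roughly like this (a sketch of the general technique, not pulseopt's internals):

import math

class DiscountedUCB:
    def __init__(self, n_arms, gamma=0.95, c=1.0):
        self.gamma = gamma            # discount applied to past episodes
        self.c = c                    # exploration weight
        self.counts = [0.0] * n_arms  # discounted pull counts
        self.sums = [0.0] * n_arms    # discounted reward sums

    def select(self):
        # Pull every arm once before trusting the scores.
        for arm, count in enumerate(self.counts):
            if count == 0.0:
                return arm
        total = sum(self.counts)
        return max(
            range(len(self.counts)),
            key=lambda a: self.sums[a] / self.counts[a]
            + self.c * math.sqrt(math.log(total) / self.counts[a]),
        )

    def update(self, arm, reward):
        # Discount all history, then credit the arm that just ran.
        self.counts = [self.gamma * n for n in self.counts]
        self.sums = [self.gamma * s for s in self.sums]
        self.counts[arm] += 1.0
        self.sums[arm] += reward

The discount makes old episodes fade, so the controller can re-explore when training dynamics shift between phases.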

Common knobs

Argument                   Meaning
lr_candidates              Multipliers tried against the optimizer's base LR.
noise_candidates           Gradient-noise std values; 0.0 means no noise.
episode_length             Steps per episode; reward is computed at episode end.
lr_scheduler               Optional torch.optim.lr_scheduler.* instance; step() is called for you.
structured_control_mode    "independent" (default) or "conditional" (one noise controller per LR arm).
context_mode               "none" (default), "trend", or "trend_phase" (requires total_training_steps).
reward_instability_lambda  Weight on the variance penalty in the reward.
seed                       Seeds controllers and gradient-noise generators.
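
Putting a few of these together (values illustrative; total_training_steps is needed only with context_mode="trend_phase"):

aees = AEES(
    optimizer,
    lr_candidates=[0.5, 1.0, 2.0],
    noise_candidates=[0.0, 0.005, 0.01],
    episode_length=100,
    structured_control_mode="conditional",  # one noise controller per LR arm
    context_mode="trend",                   # bucket episodes by loss trend
    reward_instability_lambda=0.1,          # illustrative weight
    seed=0,
)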

AEES.step_end(loss) raises ValueError on a non-finite loss. If you train with mixed precision (torch.cuda.amp / torch.amp) and expect occasional NaN/Inf during loss-scaling backoff, guard the call yourself or skip the step.
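
For example, a minimal guard that skips the update when the loss goes non-finite (illustrative; adapt to your AMP setup):

if torch.isfinite(loss):
    aees.step_end(loss)       # normal path: optimizer + scheduler step
else:
    optimizer.zero_grad()     # drop the bad gradients and skip this step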

Caveats

  • AEES does not adapt weight decay; keep it as a normal optimizer hyperparameter.
  • Each step clones the optimizer's parameters once to compute an update norm for the reward signal. Memory cost is roughly 1× model size.
  • There is no state_dict / load_state_dict yet — checkpoint and resume are planned for a future minor release.

For researchers / thesis reproduction

The package is the library half of a thesis project. The thesis-facing experiment runners and orchestration helpers live alongside the library in this repository but are not part of the published wheel.

The main experiment scripts live under experiments/ in the repository (see Repo layout below).

Key structured AEES flags exposed by the runners:

  • --lr-candidates, --noise-candidates
  • --structured-control-mode {independent,conditional}
  • --context-mode {none,trend,trend_phase}
  • --context-trend-window, --context-trend-epsilon
  • --episode-length

Scheduler flags: --lr-scheduler {none,cosine,linear,warmup_linear}, --scheduler-t-max, --warmup-epochs.

Reward flags: --reward-epsilon, --reward-instability-lambda, --reward-clip-min, --reward-clip-max.

CIFAR-specific noise flags: --label-noise-type {none,symmetric,asymmetric}, --label-noise-rate.

The structured path does not adapt weight decay. Single-candidate axes (e.g. --lr-candidates 1.0 or --noise-candidates 0.0) are treated as fixed constants and skip controller creation. The CIFAR runner also exposes --control-mode {baseline,adaptive,random}; SST-2 / AG News use --method {AdamW,AdaptiveScheduler,RandomScheduler}.

Repo layout

  • src/pulseopt/ — published library (controllers, episode manager, reward, optimizer wrappers, the AEES high-level API).
  • experiments/ — task runners and orchestration helpers (not packaged).
  • tests/ — regression and unit tests.
  • data/, results/ — local datasets and outputs (gitignored).

Development

python3.11 -m venv .venv && source .venv/bin/activate
pip install -e ".[dev,experiments]"   # quotes keep zsh from globbing the extras
pytest

License

MIT — see LICENSE.
