A collection of optimizer implementations in PyTorch with clean code and strict typing, along with several useful optimization ideas.
Project description
Documentation
Usage
Install
$ pip3 install pytorch-optimizer
Simple Usage
```python
from pytorch_optimizer import AdamP

...
model = YourModel()
optimizer = AdamP(model.parameters())
...
```
Or you can use the optimizer loader by simply passing the name of the optimizer.
```python
from pytorch_optimizer import load_optimizer

...
model = YourModel()
opt = load_optimizer(optimizer='adamp')
optimizer = opt(model.parameters())
...
```
Supported Optimizers
| Optimizer | Description |
|---|---|
| AdaBelief | Adapting Step-sizes by the Belief in Observed Gradients |
| AdaBound | Adaptive Gradient Methods with Dynamic Bound of Learning Rate |
| AdaHessian | An Adaptive Second Order Optimizer for Machine Learning |
| AdamD | Improved bias-correction in Adam |
| AdamP | Slowing Down the Slowdown for Momentum Optimizers on Scale-invariant Weights |
| diffGrad | An Optimization Method for Convolutional Neural Networks |
| MADGRAD | A Momentumized, Adaptive, Dual Averaged Gradient Method for Stochastic Optimization |
| RAdam | On the Variance of the Adaptive Learning Rate and Beyond |
| Ranger | A synergistic optimizer combining RAdam, LookAhead, and Gradient Centralization (GC) |
| Ranger21 | A synergistic deep learning optimizer |
| Lamb | Large Batch Optimization for Deep Learning |
| Shampoo | Preconditioned Stochastic Tensor Optimization |
| Nero | Learning by Turning: Neural Architecture Aware Optimisation |
| Adan | Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models |
Useful Resources
Several optimization ideas to regularize and stabilize training. Most of these ideas are applied in the Ranger21 optimizer, and most of the figures below are taken from the Ranger21 paper.
Adaptive Gradient Clipping
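AGC, proposed in the NFNet paper, clips each gradient based on the ratio of the parameter norm to the gradient norm rather than a fixed global threshold. Below is a minimal sketch of the idea; the helper name and the `clipping` / `eps` defaults are illustrative assumptions, not this package's API, and the original formulation uses unit-wise norms while a tensor-wise norm is shown here for brevity.

```python
from typing import Iterable

import torch


def adaptive_gradient_clipping(
    parameters: Iterable[torch.nn.Parameter],
    clipping: float = 1e-2,
    eps: float = 1e-3,
) -> None:
    # Hypothetical helper: rescale each gradient so its norm never exceeds
    # `clipping` times the corresponding parameter norm.
    for p in parameters:
        if p.grad is None:
            continue
        param_norm = p.detach().norm(2).clamp(min=eps)
        grad_norm = p.grad.detach().norm(2)
        max_norm = param_norm * clipping
        if grad_norm > max_norm:
            p.grad.detach().mul_(max_norm / (grad_norm + 1e-6))
```

Such a helper would be called on `model.parameters()` after `loss.backward()` and before `optimizer.step()`.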
Gradient Centralization
Gradient Centralization (GC) operates directly on gradients by centralizing the gradient to have zero mean.
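A minimal sketch of the idea (illustrative, not necessarily how this package implements it): for each multi-dimensional weight, subtract the mean of its gradient taken over all dimensions except the first, right before the optimizer step.

```python
from typing import Iterable

import torch


def centralize_gradients(parameters: Iterable[torch.nn.Parameter]) -> None:
    # Hypothetical helper: remove the mean from each multi-dimensional weight's
    # gradient so the centralized gradient has zero mean.
    for p in parameters:
        if p.grad is None or p.grad.dim() <= 1:
            continue  # GC is typically applied only to conv / linear weights
        dims = tuple(range(1, p.grad.dim()))
        p.grad.data.add_(-p.grad.data.mean(dim=dims, keepdim=True))
```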
Softplus Transformation
Running the final variance denominator through the softplus function lifts extremely tiny values to keep them viable.
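A minimal sketch of the transformation (the helper name and the `beta` default are illustrative assumptions, not this package's API): replace the usual `sqrt(v) + eps` denominator of an Adam-style update with `softplus(sqrt(v))`, which bounds how small the denominator can get while leaving larger values nearly unchanged.

```python
import torch
import torch.nn.functional as F


def softplus_denominator(exp_avg_sq: torch.Tensor, beta: float = 50.0) -> torch.Tensor:
    # Hypothetical helper: instead of `sqrt(v) + eps`, pass the second-moment
    # denominator through softplus. As x -> 0, softplus(x) -> log(2) / beta,
    # so extremely tiny denominators are lifted to a floor, while large values
    # are left nearly unchanged.
    return F.softplus(exp_avg_sq.sqrt(), beta=beta)
```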
Gradient Normalization
Norm Loss
Positive-Negative Momentum
Linear learning rate warmup
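A minimal sketch using PyTorch's built-in `LambdaLR` (a generic example, not this package's scheduler): scale the learning rate linearly from zero up to its base value over the first `warmup_steps` optimizer steps.

```python
import torch
from torch.optim.lr_scheduler import LambdaLR


def linear_warmup(optimizer: torch.optim.Optimizer, warmup_steps: int) -> LambdaLR:
    # Scale the base learning rate by (step + 1) / warmup_steps until the
    # warm-up finishes, then keep the multiplier at 1.0.
    def lr_lambda(step: int) -> float:
        return min(1.0, (step + 1) / warmup_steps)

    return LambdaLR(optimizer, lr_lambda=lr_lambda)
```

The returned scheduler's `step()` would be called once after each optimizer step.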
Stable weight decay
Explore-exploit learning rate schedule
Lookahead
Chebyshev learning rate schedule
Acceleration via Fractal Learning Rate Schedules
(Adaptive) Sharpness-Aware Minimization
On the Convergence of Adam and Beyond
Gradient Surgery for Multi-Task Learning
Citations
Adaptive Gradient Clipping (AGC)
Explore-Exploit Learning Rate Schedule
On the adequacy of untuned warmup for adaptive optimization
Stable weight decay regularization
Adaptive Sharpness-aware minimization
On the Convergence of Adam and Beyond