Slim implementation of the AdaBelief optimizer in PyTorch
Project description
AdaBelief Slim
This repository contains the code for the adabelief-slim
Python package, which provides a PyTorch implementation of the AdaBelief optimizer.
Installation
Using Python 3.6 or higher:
pip install adabelief-slim
Usage
from adabelief import AdaBelief
model = ...
kwargs = ...
optimizer = AdaBelief(model.parameters(), **kwargs)
The following hyperparameters can be passed as keyword arguments:
- lr: learning rate (default: 1e-3)
- betas: 2-tuple of coefficients used for computing the running averages of the gradient and its "variance" (see the paper, or the sketch below) (default: (0.9, 0.999))
- eps: term added to the denominator to improve numerical stability (default: 1e-8)
- weight_decay: weight decay coefficient (default: 1e-2)
- amsgrad: whether to use the AMSGrad variant of the algorithm (default: False)
- rectify: whether to use the RAdam variant of the algorithm (default: False)
- weight_decouple: whether to use the AdamW variant of the algorithm (default: True)
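Schematically, per the AdaBelief paper (ignoring bias correction, weight decay, and the optional variants), the quantities these hyperparameters control are:

m_t = betas[0] * m_{t-1} + (1 - betas[0]) * g_t
s_t = betas[1] * s_{t-1} + (1 - betas[1]) * (g_t - m_t)^2 + eps
theta_t = theta_{t-1} - lr * m_t / (sqrt(s_t) + eps)

so s_t tracks how far each gradient deviates from its running mean, which is what the paper calls the "belief" in the observed gradient.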
Be aware that the AMSGrad and RAdam variants can't be used simultaneously.
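For concreteness, here is a minimal end-to-end sketch; the toy model, data, and hyperparameter values below are placeholders for illustration, not recommendations:

import torch
from adabelief import AdaBelief

# Toy regression model and random data, for illustration only
model = torch.nn.Linear(10, 1)
inputs = torch.randn(32, 10)
targets = torch.randn(32, 1)

optimizer = AdaBelief(
    model.parameters(),
    lr=1e-3,
    betas=(0.9, 0.999),
    eps=1e-8,
    weight_decay=1e-2,
    rectify=True,  # RAdam variant; amsgrad must then stay False
    weight_decouple=True,
)

for _ in range(100):
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(model(inputs), targets)
    loss.backward()
    optimizer.step()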
Motivation
As you're probably aware, one of the paper's main authors (Juntang Zhuang) released his code in this repository, which is used to maintain the adabelief_pytorch
package. Thus, you may be wondering why this repository exists, and how it differs from his. The reason is actually pretty simple: the author made some decisions regarding his code that made it an unsuitable option for me. While it wasn't the only thing that bugged me, my main issue was the addition of unnecessary packages as dependencies.
Regarding differences, the main ones are:
- I removed the fixed_decay option, as the author's experiments showed it wasn't great
- I removed the degenerate_to_sgd option, as the author copied the RAdam codebase, but it seems recommended to always use it
- I removed all logging-related features, along with the print_change_log option
- I removed all code specific to older versions of PyTorch (I think all versions above 1.4 should work), as I don't care for them
- I changed the flow of the code to be closer to the official implementation of AdamW
- I removed all usage of the .data property, as it isn't recommended and can be avoided with the torch.no_grad decorator (see the sketch after this list)
- I moved the code specific to AMSGrad so that it isn't executed if the RAdam variant is selected
- I added an exception if both RAdam and AMSGrad are selected, as they can't both be used (in the official repository, RAdam is used if both are selected)
- I removed half-precision support, as I don't care for it
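To illustrate the .data point above, here is the general pattern (a toy optimizer, not this package's actual code): parameter updates inside step can run under the torch.no_grad decorator instead of going through the .data attribute:

import torch
from torch.optim import Optimizer

class ToySGD(Optimizer):
    # Toy optimizer showing the pattern only; not the AdaBelief implementation
    def __init__(self, params, lr=1e-3):
        super().__init__(params, dict(lr=lr))

    @torch.no_grad()  # replaces the old p.data.add_(...) idiom
    def step(self):
        for group in self.param_groups:
            for p in group["params"]:
                if p.grad is None:
                    continue
                # In-place update; autograd doesn't track it inside no_grad
                p.add_(p.grad, alpha=-group["lr"])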
References
Codebases
- Official AdaBelief implementation
- Official RAdam implementation
- Official AdamW implementation
- PyTorch Optimizers
Papers
- Adam: A Method for Stochastic Optimization: proposed Adam
- Decoupled Weight Decay Regularization: proposed AdamW
- On the Convergence of Adam and Beyond: proposed AMSGrad
- On the Variance of the Adaptive Learning Rate and Beyond: proposed RAdam
- AdaBelief Optimizer, adapting stepsizes by the belief in observed gradients: proposed AdaBelief
License
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file adabelief-slim-0.0.1.tar.gz.
File metadata
- Download URL: adabelief-slim-0.0.1.tar.gz
- Upload date:
- Size: 5.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.3.0 pkginfo/1.7.0 requests/2.25.1 setuptools/53.1.0 requests-toolbelt/0.9.1 tqdm/4.58.0 CPython/3.7.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | d3656e7f0c82b29bee3857daa76717fbbe2925c5239ff968bc9a75cab4adf2d4
MD5 | 2a43fb5e27c6b74dfa058809be77477e
BLAKE2b-256 | ddc763943d3f63ee6084419a2a74c2fb87daf8ff828a880413ffd4b10f82bc49
File details
Details for the file adabelief_slim-0.0.1-py3-none-any.whl.
File metadata
- Download URL: adabelief_slim-0.0.1-py3-none-any.whl
- Upload date:
- Size: 6.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.3.0 pkginfo/1.7.0 requests/2.25.1 setuptools/53.1.0 requests-toolbelt/0.9.1 tqdm/4.58.0 CPython/3.7.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | edd6b3bab3b3461d3ad8bde99a52df404ecb1f1eee7d86c58597f7974ef4c5b7
MD5 | a537e1d634bb053771cdcf03a6533871
BLAKE2b-256 | ef1ae0e08f2d6467bc255295a032745fdbdce45beab1d32e489dd6a7762afcc5