Slim implementation of the AdaBelief optimizer in PyTorch

Project description

AdaBelief Slim

This repository contains the code for the adabelief-slim Python package, which provides a PyTorch implementation of the AdaBelief optimizer.

Installation

Using Python 3.6 or higher:

pip install adabelief-slim

Usage

from adabelief import AdaBelief

model = ...   # any torch.nn.Module
kwargs = ...  # optimizer hyperparameters (see below)

optimizer = AdaBelief(model.parameters(), **kwargs)

The following hyperparameters can be passed as keyword arguments:

  • lr: learning rate (default: 1e-3)
  • betas: 2-tuple of coefficients used for computing the running averages of the gradient and its "variance" (see paper) (default: (0.9, 0.999))
  • eps: term added to the denominator to improve numerical stability (default: 1e-8)
  • weight_decay: weight decay coefficient (default: 1e-2)
  • amsgrad: whether to use the AMSGrad variant of the algorithm (default: False)
  • rectify: whether to use the RAdam variant of the algorithm (default: False)
  • weight_decouple: whether to use the AdamW variant of this algorithm (default: True)

Be aware that the AMSGrad and RAdam variants can't be used simultaneously.
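To make the roles of these hyperparameters concrete, here is a minimal pure-Python sketch of a single AdaBelief update step with decoupled (AdamW-style) weight decay. This illustrates the update rule from the paper, not the package's actual code, and it omits the AMSGrad and RAdam variants:

```python
import math

def adabelief_step(params, grads, state, lr=1e-3, betas=(0.9, 0.999),
                   eps=1e-8, weight_decay=1e-2):
    """One AdaBelief update over a list of scalar parameters (sketch only)."""
    b1, b2 = betas
    state["t"] += 1
    t = state["t"]
    for i, (p, g) in enumerate(zip(params, grads)):
        # Running average of the gradient (same as Adam's first moment)
        m = state["m"][i] = b1 * state["m"][i] + (1 - b1) * g
        # The "belief" term: running average of the squared deviation of
        # the gradient from its mean, instead of Adam's raw second moment
        s = state["s"][i] = b2 * state["s"][i] + (1 - b2) * (g - m) ** 2
        m_hat = m / (1 - b1 ** t)  # bias correction
        s_hat = s / (1 - b2 ** t)
        p *= 1 - lr * weight_decay  # decoupled decay (weight_decouple=True)
        params[i] = p - lr * m_hat / (math.sqrt(s_hat) + eps)
```

The key difference from Adam: when the gradient stays close to its running mean, s is small and the optimizer takes a large step; when the gradient deviates from it, the step shrinks.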

Motivation

As you're probably aware, one of the paper's main authors (Juntang Zhuang) released his code in this repository, which is used to maintain the adabelief_pytorch package. You may therefore be wondering why this repository exists, and how it differs from his. The reason is actually pretty simple: the author made some decisions regarding his code which made it an unsuitable option for me. While it wasn't the only thing that bugged me, my main issue was the addition of unnecessary packages as dependencies.

Regarding differences, the main ones are:

  • I removed the fixed_decay option, as the author's experiments showed it wasn't great
  • I removed the degenerate_to_sgd option, which the author carried over from the RAdam codebase, as it seems recommended to always leave it enabled
  • I removed all logging-related features, along with the print_change_log option
  • I removed all code specific to older versions of PyTorch (I think all versions above 1.4 should work), as I don't care for them
  • I changed the flow of the code to be closer to the official implementation of AdamW
  • I removed all usage of the .data property, as it isn't recommended and can be avoided with the torch.no_grad context manager
  • I moved the code specific to AMSGrad so that it isn't executed if the RAdam variant is selected
  • I added an exception if both RAdam and AMSGrad are selected, as they can't both be used (the official repository silently uses RAdam if both are selected)
  • I removed half-precision support, as I don't care for it
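For instance, the mutual-exclusion check mentioned above could be as simple as the following (a hypothetical sketch; the function name is mine, not the package's):

```python
def check_variant_flags(amsgrad: bool, rectify: bool) -> None:
    # AMSGrad and RAdam both alter the denominator of the update, so
    # enabling both is ambiguous; raising is clearer than silently
    # preferring RAdam as the official repository does.
    if amsgrad and rectify:
        raise ValueError(
            "The AMSGrad and RAdam variants can't be used simultaneously"
        )
```

Failing fast in the constructor means a misconfiguration surfaces immediately, rather than as a silently different optimizer.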

References

Codebases

Papers

License

MIT
