
Adam with weight recovery optimizer for PyTorch

Project description

AdamR

Adam with weight Recovery optimizer

TL;DR

AdamW's decoupled weight decay pulls parameters towards zero, which makes the model "forget" its pretrained weights during finetuning. AdamR instead applies a weight recovery term that pulls parameters back towards their pretrained values during finetuning.
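To make this concrete, here is a minimal sketch (not the package's code) of the two decoupled regularization terms, assuming AdamR applies weight recovery the same way AdamW applies decoupled weight decay; theta_pretrained stands for the parameter values before finetuning.

import torch

def apply_weight_decay(theta, lr, weight_decay):
    # AdamW: the decoupled decay term shrinks every parameter towards zero
    return theta - lr * weight_decay * theta

def apply_weight_recovery(theta, theta_pretrained, lr, weight_recovery):
    # Weight recovery: the same kind of term, but it pulls every parameter
    # back towards its pretrained value instead of towards zero
    return theta - lr * weight_recovery * (theta - theta_pretrained)

theta = torch.tensor([1.0, -2.0])             # current (finetuned) weights
theta_pretrained = torch.tensor([0.8, -1.5])  # weights before finetuning
print(apply_weight_decay(theta, 1e-3, 0.1))                       # nudged towards 0
print(apply_weight_recovery(theta, theta_pretrained, 1e-3, 0.1))  # nudged towards theta_pretrained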

Have a try

AdamR is used just like any other PyTorch optimizer:

from adamr import AdamR
from xxx import SomeModel, SomeData, SomeDevice, SomeLoss

model = SomeModel()
dataloader = SomeData()
model.to(SomeDevice)

adamr = AdamR(
    model.parameters(),
    lr=1e-5,
    betas=(0.9, 0.998),   # Adam's beta parameters
    eps=1e-8,
    weight_recovery=0.1,  # strength of the pull back towards the pretrained weights
)

loss_fn = SomeLoss()

for x, y in dataloader:
    adamr.zero_grad()
    y_bar = model(x)
    loss = loss_fn(y_bar, y)
    loss.backward()
    adamr.step()
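Since the recovery targets are the pretrained weights, the optimizer should presumably be constructed only after the pretrained checkpoint has been loaded into the model, so that the parameter values it sees at construction time are the ones it recovers towards.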

Algorithm

TODO: improve the readability

Here is a snippet of the algorithm from the paper: [image]
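As a rough sketch of the idea, a single update step could look as follows, assuming the recovery term is applied decoupled from the gradient step, in the same place AdamW applies its weight decay (an illustration only, not the package's implementation; p0 denotes the pretrained value of the parameter).

import torch

@torch.no_grad()
def adamr_like_step(p, grad, state, lr=1e-5, betas=(0.9, 0.998),
                    eps=1e-8, weight_recovery=0.1):
    # state holds the Adam moments, the step count, and a frozen copy p0
    # of the pretrained parameter that the recovery term pulls towards
    beta1, beta2 = betas
    state['t'] += 1
    t = state['t']
    m, v, p0 = state['m'], state['v'], state['p0']
    m.mul_(beta1).add_(grad, alpha=1 - beta1)             # first moment
    v.mul_(beta2).addcmul_(grad, grad, value=1 - beta2)   # second moment
    m_hat = m / (1 - beta1 ** t)                          # bias correction
    v_hat = v / (1 - beta2 ** t)
    p.addcdiv_(m_hat, v_hat.sqrt().add_(eps), value=-lr)  # plain Adam step
    p.add_(p - p0, alpha=-lr * weight_recovery)           # recovery towards p0

# example state for one parameter tensor p:
# state = {'m': torch.zeros_like(p), 'v': torch.zeros_like(p),
#          'p0': p.detach().clone(), 't': 0}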

