Cosine annealing learning rate scheduler for PyTorch based on SGDR

Torch Cosine Annealing

Implementation of the cosine annealing with warm restarts scheduler introduced in the SGDR paper (the base per-cycle schedule is recalled below the feature list). Compared to the original implementation, it adds the following features:

  • A linear warm-up/burn-in period
  • Warm-up can optionally be applied only to the first cycle
  • Float values are supported for the warm-up period, cycle period ($T_0$), and cycle multiplier ($T_{mult}$)
  • A separate maximum learning rate can be set for each param group
  • The scheduler can be updated by step or by epoch progress
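
For reference, within a cycle the learning rate follows the standard cosine schedule from the SGDR paper (warm-up and the gamma decay aside), where $T_{cur}$ is the progress within the current cycle and $T_i$ is the length of that cycle:

$$\eta_t = \eta_{min} + \frac{1}{2}\,(\eta_{max} - \eta_{min})\left(1 + \cos\!\left(\frac{T_{cur}}{T_i}\,\pi\right)\right)$$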

Installation

pip install torch-cosine-annealing

Quick Start

The following examples assume that a standard PyTorch model, optimizer, and dataloader are already defined.
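
For instance, a minimal placeholder setup could look like this (the toy model, optimizer, and dataloader below are purely illustrative and not part of the library):

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# toy model, optimizer, and dataloader used only to make the snippets below runnable
model = nn.Linear(16, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
dataset = TensorDataset(torch.randn(500, 16), torch.randint(0, 2, (500,)))
dataloader = DataLoader(dataset, batch_size=10)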

Using step Strategy

from torch_cosine_annealing import CosineAnnealingWithWarmRestarts

scheduler = CosineAnnealingWithWarmRestarts(
    optimizer, 
    cycle_period=50, 
    cycle_mult=1, 
    warmup_period=5, 
    min_lr=1e-7, 
    gamma=1, 
    strategy='step',
)

for epoch in range(100):
    for data in dataloader:
        # insert training logic here

        # advance the schedule by one optimizer step
        scheduler.step()

Using epoch Strategy

from torch_cosine_annealing import CosineAnnealingWithWarmRestarts

scheduler = CosineAnnealingWithWarmRestarts(
    optimizer, 
    cycle_period=1, 
    cycle_mult=1, 
    warmup_period=0.1, 
    min_lr=1e-8, 
    gamma=1, 
    strategy='epoch',
)

for epoch in range(100):
    for i, data in enumerate(dataloader):
        # insert training logic here

        # pass the cumulative training progress expressed in epochs
        scheduler.step((epoch * len(dataloader) + i + 1) / len(dataloader))

Arguments

The CosineAnnealingWithWarmRestarts class has the following arguments:

  • optimizer (Optimizer): PyTorch optimizer
  • cycle_period (Union[float, int]): The period for the first cycle. If strategy is 'step', this is the number of steps in the first cycle. If strategy is 'epoch', this is the number of epochs in the first cycle.
  • cycle_mult (float): The multiplier for the cycle period after each cycle. Defaults to 1.
  • warmup_period (Union[float, int]): The warm-up period for each cycle. If strategy is 'step', this is the number of steps of warm-up; if strategy is 'epoch', it is the number of epochs. Defaults to 0.
  • warmup_once (bool): Whether to apply warm-up only once, at the beginning of the first cycle. Only has an effect when warmup_period > 0. Defaults to False.
  • max_lr (Union[float, List[float]], optional): The maximum learning rate for the optimizer (eta_max). If omitted, the current learning rate of the optimizer is used. If a float is given, the lr of every param group is overridden with this value. If a list is given, its length must match the number of param groups in the optimizer (see the sketch after this list). Defaults to None.
  • min_lr (float, optional): The minimum learning rate for the optimizer (eta_min). Defaults to 1e-8.
  • gamma (float, optional): The decay rate for the learning rate after each cycle. Defaults to 1.
  • strategy (str, optional): Defines whether the cycle period and warm-up period are interpreted as steps or as epochs. Can be 'step' or 'epoch'. Note that with 'epoch', you need to pass the epoch progress each time you call .step(). Defaults to 'step'.
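
As an illustration of passing a list to max_lr with multiple param groups, here is a minimal sketch; the two-group optimizer is hypothetical and only meant to show how the per-group values line up:

import torch
from torch import nn
from torch_cosine_annealing import CosineAnnealingWithWarmRestarts

# hypothetical two-group optimizer: a backbone and a head with different learning rates
backbone, head = nn.Linear(16, 16), nn.Linear(16, 2)
optimizer = torch.optim.SGD([
    {'params': backbone.parameters(), 'lr': 1e-3},
    {'params': head.parameters(), 'lr': 5e-4},
])

scheduler = CosineAnnealingWithWarmRestarts(
    optimizer,
    cycle_period=50,
    max_lr=[1e-3, 5e-4],  # one maximum learning rate per param group
    min_lr=1e-7,
    strategy='step',
)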

Use Cases

Restart every 50 steps without warmup, no decay, constant restart period

strategy='step', cycle_period=50, cycle_mult=1, max_lr=1e-3, min_lr=1e-7, warmup_period=0, gamma=1

Restart every epoch without warm-up, decay the learning rate by 0.8 at every restart, constant restart period

Note: In this example, one epoch consists of 50 steps.

strategy='epoch', cycle_period=1, cycle_mult=1, max_lr=1e-3, min_lr=1e-7, warmup_period=0, gamma=0.8

Restart every 50 steps with a 5-step warm-up, no decay, constant restart period

strategy='step', cycle_period=50, cycle_mult=1, max_lr=1e-3, min_lr=1e-7, warmup_period=5, gamma=1

Restart every 2 epochs with a 0.5-epoch warm-up only on the first restart, no decay, restart period multiplied by 1.5 at every restart

Note: In this example, one epoch consists of 50 steps.

strategy='epoch', cycle_period=2, cycle_mult=1.5, max_lr=1e-3, min_lr=1e-7, warmup_period=0.5, warmup_once=True, gamma=1

Restart every 25 steps with a 5-step warm-up only on the first restart, decay the learning rate by 0.8 and multiply the restart period by 2 at every restart, applied to multiple learning rates

strategy='step', cycle_period=25, cycle_mult=2, max_lr=[1e-3, 5e-4], min_lr=1e-7, warmup_period=5, warmup_once=True, gamma=0.8
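
To visualize a configuration like the ones above, one can drive the scheduler with a dummy optimizer and record the learning rate at every step. Here is a minimal sketch using the first configuration (the dummy parameter and the matplotlib plotting are illustrative, not part of the library):

import torch
from torch import nn
import matplotlib.pyplot as plt
from torch_cosine_annealing import CosineAnnealingWithWarmRestarts

# dummy parameter and optimizer, used only to drive the scheduler
param = nn.Parameter(torch.zeros(1))
optimizer = torch.optim.SGD([param], lr=1e-3)

scheduler = CosineAnnealingWithWarmRestarts(
    optimizer,
    cycle_period=50,
    cycle_mult=1,
    max_lr=1e-3,
    min_lr=1e-7,
    warmup_period=0,
    gamma=1,
    strategy='step',
)

# record the learning rate over 200 steps (4 cycles of 50 steps each)
lrs = []
for _ in range(200):
    lrs.append(optimizer.param_groups[0]['lr'])
    scheduler.step()

plt.plot(lrs)
plt.xlabel('step')
plt.ylabel('learning rate')
plt.show()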


