
A PyTorch implementation of DropGrad regularization.

Project description

DropGrad: A Simple Method for Regularization and Accelerated Optimization of Neural Networks

DropGrad is a regularization method for neural networks that works by randomly (and independently) setting individual gradient values to zero before an optimization step. Like Dropout, it has a single parameter, drop_rate: the probability of setting each gradient value to zero. To de-bias the remaining gradient values, they are divided by 1.0 - drop_rate.
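
For intuition, the operation applied to each gradient tensor amounts to the following minimal sketch (illustrative only; this is not the package's internal implementation):

import torch

drop_rate = 0.1
grad = torch.randn(4, 4)  # stand-in for a parameter's .grad after loss.backward()

# independently zero each gradient value with probability drop_rate, then
# rescale the survivors by 1 / (1 - drop_rate) so the expected gradient
# is unchanged
keep_mask = (torch.rand_like(grad) >= drop_rate).float()
dropped_grad = grad * keep_mask / (1.0 - drop_rate)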

To the best of my knowledge, DropGrad is an original contribution; however, I have no plans to publish a paper on it. If it is indeed an original method, please feel free to publish a paper about DropGrad. If you do so, all I ask is that you mention me in your publication and cite this repository.

Installation

The PyTorch implementation of DropGrad can be installed either from PyPI using pip or by cloning the GitHub repository and building from source.

Requirements

The only requirement for DropGrad is PyTorch. Only versions of PyTorch >= 2.0 have been tested, although DropGrad should be compatible with any version of PyTorch.

Using pip

To install using pip:

pip install dropgrad

Using git

git clone https://github.com/dingo-actual/dropgrad.git
cd dropgrad
pip install build  # the 'build' package is needed to build the wheel
python -m build
pip install dist/dropgrad-0.1.0-py3-none-any.whl

Usage

Basic Usage

To use DropGrad in your neural network optimization, simply import the DropGrad class and use it to wrap your optimizer.

from dropgrad import DropGrad

Wrapping an optimizer is similar to using a learning rate scheduler:

from torch.optim import Adam

opt_unwrapped = Adam(net.parameters(), lr=1e-3)  # net is your torch.nn.Module
opt = DropGrad(opt_unwrapped, drop_rate=0.1)

During training, the application of DropGrad is handled automatically by the wrapper. Simply call .step() on the wrapped optimizer to apply DropGrad, then .zero_grad() to reset the gradients.

opt.step()
opt.zero_grad()
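
Putting the pieces together, a minimal end-to-end example might look like the following (the toy model, data, and hyperparameters are made up purely for illustration):

import torch
from torch import nn
from torch.optim import Adam
from dropgrad import DropGrad

net = nn.Linear(10, 1)            # toy model
x = torch.randn(32, 10)           # toy input batch
y = torch.randn(32, 1)            # toy targets
loss_fn = nn.MSELoss()

opt_unwrapped = Adam(net.parameters(), lr=1e-3)
opt = DropGrad(opt_unwrapped, drop_rate=0.1)

loss = loss_fn(net(x), y)
loss.backward()                   # populate gradients
opt.step()                        # drop/rescale gradients, then take an Adam step
opt.zero_grad()                   # reset gradients for the next iteration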

Use with Learning Rate Schedulers

If you use a learning rate scheduler as well as DropGrad, simply pass the base optimizer to both the DropGrad wrapper and the learning rate scheduler:

from torch.optim.lr_scheduler import CosineAnnealingLR

opt_unwrapped = Adam(net.parameters(), lr=1e-3)
lr_scheduler = CosineAnnealingLR(opt_unwrapped, T_max=100)
opt = DropGrad(opt_unwrapped, drop_rate=0.1)

During the training loop, call .step() on the DropGrad wrapper before calling .step() on the learning rate scheduler, just as you would with an unwrapped optimizer:

for epoch_n in range(n_epochs):
    for x_batch, y_batch in dataloader:
        pred_batch = net(x_batch)            # forward pass
        loss = loss_fn(pred_batch, y_batch)

        loss.backward()                      # compute gradients

        opt.step()                           # drop/rescale gradients, then take an optimizer step
        opt.zero_grad()                      # reset gradients for the next batch

    lr_scheduler.step()                      # advance the schedule once per epoch

Varying drop_rate per Parameter

DropGrad allows the user to set a different drop rate for each Parameter under optimization. To do this, pass a dictionary mapping Parameters to drop rates via the params argument of the DropGrad wrapper. Any optimized Parameter not present in that dictionary falls back to the drop_rate passed to the wrapper at initialization (if drop_rate=None, DropGrad simply won't be applied to Parameters that are not in the dictionary).

The example below will apply a drop_rate of 0.1 to all optimized weights and a drop_rate of 0.01 to all optimized biases, with no DropGrad applied to any other optimized Parameters:

drop_rate_weights = 0.1
drop_rate_biases = 0.01

# collect the optimized weight and bias Parameters by name
params_weights = [p for name, p in net.named_parameters() if p.requires_grad and 'weight' in name]
params_biases = [p for name, p in net.named_parameters() if p.requires_grad and 'bias' in name]

# map each Parameter to its drop rate
param_drop_rates = {p: drop_rate_weights for p in params_weights}
param_drop_rates.update({p: drop_rate_biases for p in params_biases})

opt_unwrapped = Adam(net.parameters(), lr=1e-3)
# drop_rate=None means Parameters outside the dictionary are left untouched
opt = DropGrad(opt_unwrapped, drop_rate=None, params=param_drop_rates)
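
Following the same signature, a non-None drop_rate can be combined with the dictionary so that Parameters outside the dictionary fall back to a default rate (the 0.05 default below is chosen only for illustration):

params_weights = [p for name, p in net.named_parameters() if p.requires_grad and 'weight' in name]
param_drop_rates = {p: 0.1 for p in params_weights}

opt_unwrapped = Adam(net.parameters(), lr=1e-3)
# weights are dropped at a rate of 0.1; every other optimized Parameter
# falls back to the default drop_rate of 0.05 passed at initialization
opt = DropGrad(opt_unwrapped, drop_rate=0.05, params=param_drop_rates)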
