
ISAAC Newton - a method for accelerating neural network training.

ISAAC Newton: Accelerating NN Training with Input-based Approximate Curvature for Newton's Method

This repository includes the official implementation of our ICLR 2023 Paper "ISAAC Newton: Input-based Approximate Curvature for Newton's Method".

Paper @ OpenReview

Video @ YouTube

💻 Installation

isaac is based on PyTorch and can be installed via pip from PyPI with

pip install isaac

👩‍💻 Usage

isaac.Linear acts as a drop-in replacement for torch.nn.Linear. It only requires the additional specification of the regularization parameter $\lambda_a$ (argument la) as defined in the paper. A good starting point is la=1, but the optimal choice varies from experiment to experiment. The method operates by efficiently modifying the gradient of the module such that input-based curvature information is incorporated when a gradient descent optimizer is applied to the modified gradients.

In the following, we specify an example MNIST neural network where ISAAC is applied to the first 3 out of 5 layers:

import torch
import isaac

net = torch.nn.Sequential(
    torch.nn.Flatten(),
    isaac.Linear(784, 1_600, la=1),
    torch.nn.ReLU(),
    isaac.Linear(1_600, 1_600, la=1),
    torch.nn.ReLU(),
    isaac.Linear(1_600, 1_600, la=1),
    torch.nn.ReLU(),
    torch.nn.Linear(1_600, 1_600),
    torch.nn.ReLU(),
    torch.nn.Linear(1_600, 10)
)
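To illustrate the idea behind the gradient modification, the following is a conceptual sketch only, not the actual isaac implementation: an input-based preconditioner multiplies a linear layer's raw weight gradient by the inverse of the regularized input second-moment matrix. The assumed illustrative update rule G' = G (AᵀA/n + λ_a I)⁻¹ and the tiny dimensions are chosen for readability; the library's exact normalization may differ.

```python
# Conceptual sketch only -- NOT the actual isaac implementation.
# Illustrative update rule (assumption): G' = G @ inv(A^T A / n + la * I),
# where A (n x d) holds the n layer inputs and G (1 x d) is the weight
# gradient of a single output unit.

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

def inv2x2(M):
    """Invert a 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]  # n = 3 inputs, d = 2 features
G = [[0.5, 1.0]]                          # raw weight gradient
la = 1.0                                  # regularization lambda_a
n = len(A)

At = [list(col) for col in zip(*A)]       # A^T
AtA = matmul(At, A)                       # input second-moment matrix (2x2)
C = [[AtA[i][j] / n + (la if i == j else 0.0) for j in range(2)]
     for i in range(2)]                   # A^T A / n + la * I
G_prec = matmul(G, inv2x2(C))             # preconditioned gradient
print(G_prec)                             # approximately [[0.2308, 0.3462]]
```

The preconditioned gradient can then be handed to any standard first-order optimizer, which is what makes the approach a drop-in modification rather than a new optimizer.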

🧪 Experiments

You can find an example MNIST experiment in examples/mnist.py, which is based on the experiment shown in Figure 5 of the paper.

To run ISAAC applied to the first X out of 5 layers, run

python examples/mnist.py -nil <X>

To run the baseline, run

python examples/mnist.py -nil 0

The device can be specified via, e.g., --device cuda; the learning rate and $\lambda_a$ may be set via --lr and --la, respectively.
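For example, a run combining these options might look as follows (the concrete values here are illustrative, not recommendations):

```shell
python examples/mnist.py -nil 3 --device cuda --lr 0.1 --la 1
```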

📖 Citing

@inproceedings{petersen2023isaac,
  title={ISAAC Newton: Input-based Approximate Curvature for Newton's Method},
  author={Petersen, Felix and Sutter, Tobias and Borgelt, Christian and Huh, Dongsung and Kuehne, Hilde and Sun, Yuekai and Deussen, Oliver},
  booktitle={International Conference on Learning Representations (ICLR)},
  year={2023}
}

License

isaac is released under the MIT license. See the LICENSE file for additional details.
