isaac

ISAAC Newton - Accelerating NN Training with Input-based Approximate Curvature for Newton's Method
This repository includes the official implementation of our ICLR 2023 Paper "ISAAC Newton: Input-based Approximate Curvature for Newton's Method".
Paper @ OpenReview
Video @ YouTube
💻 Installation
isaac is based on PyTorch and can be installed via pip from PyPI with

pip install isaac
👩‍💻 Usage
isaac.Linear acts as a drop-in replacement for torch.nn.Linear. It only requires the additional specification of the regularization parameter $\lambda_a$ (argument la) as described in the paper. A good starting point is la=1, but the optimal choice varies from experiment to experiment.
The method operates by efficiently modifying the gradient of the module so that input-based curvature information is incorporated when a gradient-descent optimizer is applied to the modified gradients.
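To make the idea concrete, here is a minimal conceptual sketch (not the library's exact implementation) of preconditioning a weight gradient with the regularized second-moment matrix of the layer's inputs; the function name and the use of an explicit matrix inverse are illustrative simplifications:

```python
import torch

def precondition_by_input_curvature(grad_w, x, la=1.0):
    # grad_w: weight gradient of shape (out_features, in_features)
    # x: the layer's input batch of shape (batch, in_features)
    b = x.shape[0]
    cov = x.T @ x / b                       # input second-moment matrix
    reg = cov + la * torch.eye(x.shape[1])  # Tikhonov regularization with la
    # Apply the inverse input-based curvature to the gradient.
    return grad_w @ torch.linalg.inv(reg)

g = torch.randn(4, 8)
x = torch.randn(32, 8)
out = precondition_by_input_curvature(g, x)
print(out.shape)  # torch.Size([4, 8])
```

The preconditioned gradient has the same shape as the original, so it can be handed to any standard optimizer unchanged.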
In the following, we specify an example MNIST neural network where ISAAC is applied to the first 3 out of 5 layers:
import torch
import isaac
net = torch.nn.Sequential(
torch.nn.Flatten(),
isaac.Linear(784, 1_600, la=1),
torch.nn.ReLU(),
isaac.Linear(1_600, 1_600, la=1),
torch.nn.ReLU(),
isaac.Linear(1_600, 1_600, la=1),
torch.nn.ReLU(),
torch.nn.Linear(1_600, 1_600),
torch.nn.ReLU(),
torch.nn.Linear(1_600, 10)
)
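Such a network is then trained with an ordinary gradient-descent loop; no special optimizer is needed. The following sketch uses plain torch.nn.Linear layers as a stand-in so it runs without the package installed; with isaac installed, the isaac.Linear layers from the snippet above drop in unchanged:

```python
import torch

# Stand-in network; replace torch.nn.Linear by isaac.Linear(..., la=1)
# when the isaac package is installed.
net = torch.nn.Sequential(
    torch.nn.Flatten(),
    torch.nn.Linear(784, 1_600),
    torch.nn.ReLU(),
    torch.nn.Linear(1_600, 10),
)
opt = torch.optim.SGD(net.parameters(), lr=0.1)

# One training step on a dummy MNIST-shaped batch.
x = torch.randn(64, 1, 28, 28)
y = torch.randint(0, 10, (64,))
loss = torch.nn.functional.cross_entropy(net(x), y)
opt.zero_grad()
loss.backward()
opt.step()
print(float(loss))
```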
🧪 Experiments
You can find an example MNIST experiment in examples/mnist.py, which is based on the experiment in Figure 5 in the paper.
To run ISAAC applied to the first X out of 5 layers, run

python examples/mnist.py -nil <X>

To run the baseline (no ISAAC layers), run

python examples/mnist.py -nil 0
The device can be specified via --device (e.g., --device cuda), and the learning rate and $\lambda_a$ may be set via --lr and --la, respectively. For example:

python examples/mnist.py -nil 3 --device cuda --lr 0.01 --la 1
📖 Citing
@inproceedings{petersen2023isaac,
title={ISAAC Newton: Input-based Approximate Curvature for Newton's Method},
author={Petersen, Felix and Sutter, Tobias and Borgelt, Christian and Huh, Dongsung and Kuehne, Hilde and Sun, Yuekai and Deussen, Oliver},
booktitle={International Conference on Learning Representations (ICLR)},
year={2023}
}
License

isaac is released under the MIT license. See LICENSE for additional details.