
ISAAC Newton - a method for accelerating neural network training.


isaac Newton - Accelerating NN Training with Input-based Approximate Curvature for Newton's Method

This repository includes the official implementation of our ICLR 2023 paper "ISAAC Newton: Input-based Approximate Curvature for Newton's Method".

Paper @ OpenReview

Video @ YouTube


💻 Installation

isaac is based on PyTorch and can be installed via pip from PyPI with

pip install isaac

👩‍💻 Usage

isaac.Linear acts as a drop-in replacement for torch.nn.Linear. It requires only one additional argument, la, which sets the regularization parameter $\lambda_a$ described in the paper. A good starting point is la=1, but the best value varies from experiment to experiment. The method efficiently modifies the gradient of the module so that, when a gradient descent optimizer is applied to the modified gradients, the update uses input-based curvature information.
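To make the idea of an input-based gradient modification concrete, here is a minimal NumPy sketch. It assumes the modification amounts to right-multiplying the ordinary weight gradient $g^\top a$ by $\lambda_a (a^\top a / b + \lambda_a I)^{-1}$, where $a$ holds the layer inputs for a batch of size $b$ and $g$ is the gradient with respect to the layer outputs. That form, the function name isaac_style_gradient, and the argument names are assumptions made for illustration; this is not the package's code.

```python
import numpy as np


def isaac_style_gradient(a, g, la):
    """Input-conditioned weight gradient for a linear layer (illustrative sketch).

    a:  (b, n) layer inputs for a batch of size b
    g:  (b, m) gradient of the loss w.r.t. the layer outputs
    la: regularization strength lambda_a (must be > 0)
    """
    b, n = a.shape
    if b < n:
        # Small batch: solve a (b, b) system instead of an (n, n) one.
        # Equivalent to the branch below by the push-through identity.
        k = a @ a.T / (b * la) + np.eye(b)
        return g.T @ np.linalg.solve(k, a)
    # Otherwise use the regularized (n, n) input second-moment matrix.
    grad = g.T @ a                          # (m, n) ordinary weight gradient
    c = a.T @ a / b + la * np.eye(n)        # symmetric positive definite
    return la * np.linalg.solve(c, grad.T).T
```

Under this assumed form, a very large la leaves the gradient essentially unchanged, while a small la weights the input statistics more strongly. When the batch size is smaller than the input dimension, only a b-by-b system needs to be solved.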

In the following, we specify an example MNIST neural network where ISAAC is applied to the first 3 out of 5 layers:

import torch
import isaac

net = torch.nn.Sequential(
    torch.nn.Flatten(),
    isaac.Linear(784, 1_600, la=1),
    torch.nn.ReLU(),
    isaac.Linear(1_600, 1_600, la=1),
    torch.nn.ReLU(),
    isaac.Linear(1_600, 1_600, la=1),
    torch.nn.ReLU(),
    torch.nn.Linear(1_600, 1_600),
    torch.nn.ReLU(),
    torch.nn.Linear(1_600, 10)
)
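Because the layers modify their own gradients, training should need nothing beyond a standard optimizer. The following is a minimal sketch of a single training step with plain PyTorch calls; the helper name train_step and the choice of cross-entropy loss are assumptions for illustration and are not taken from the repository.

```python
import torch


def train_step(net, optimizer, x, y):
    """Run one optimization step on a batch and return the loss as a float."""
    optimizer.zero_grad()
    loss = torch.nn.functional.cross_entropy(net(x), y)
    # Gradients are computed here; layers that adjust their own
    # gradients would do so during this backward pass.
    loss.backward()
    # A standard gradient descent update on whatever gradients are stored.
    optimizer.step()
    return loss.item()
```

With a network like the one above, this could be called as train_step(net, torch.optim.SGD(net.parameters(), lr=0.1), x, y), where x is a batch of MNIST images and y the matching integer labels. The learning rate of 0.1 is an arbitrary placeholder.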

🧪 Experiments

You can find an example MNIST experiment in examples/mnist.py, which is based on the experiment in Figure 5 in the paper.

To run ISAAC applied to the first X out of 5 layers, run

python examples/mnist.py -nil <X>

To run the baseline, run

python examples/mnist.py -nil 0

The device can be specified with, e.g., --device cuda; the learning rate and $\lambda_a$ can be set via --lr and --la, respectively.
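For readers who want to see how such flags fit together, here is a small argparse sketch that accepts the options named above. The destination name num_isaac_layers, the default values, and the helper build_parser are hypothetical; the actual script in examples/mnist.py may define them differently.

```python
import argparse


def build_parser():
    """Build a parser for the flags mentioned above (defaults are placeholders)."""
    parser = argparse.ArgumentParser(description="MNIST example (sketch)")
    # Number of leading layers that use ISAAC; 0 corresponds to the baseline.
    parser.add_argument("-nil", dest="num_isaac_layers", type=int, default=0)
    parser.add_argument("--device", type=str, default="cpu")
    parser.add_argument("--lr", type=float, default=0.1)
    parser.add_argument("--la", type=float, default=1.0)
    return parser


# Parse an explicit argument list so the sketch runs without a command line.
args = build_parser().parse_args(["-nil", "3", "--device", "cuda", "--la", "1"])
print(args.num_isaac_layers, args.device, args.lr, args.la)
```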

📖 Citing

@inproceedings{petersen2023isaac,
  title={ISAAC Newton: Input-based Approximate Curvature for Newton's Method},
  author={Petersen, Felix and Sutter, Tobias and Borgelt, Christian and Huh, Dongsung and Kuehne, Hilde and Sun, Yuekai and Deussen, Oliver},
  booktitle={International Conference on Learning Representations (ICLR)},
  year={2023}
}

License

isaac is released under the MIT license. See LICENSE for details.
