Library to simplify autograd computations in PyTorch

## Project description

By Yaroslav Bulatov, Kazuki Osawa

Library to simplify gradient computations in PyTorch.

# Example 1: per-example gradient norms

This example computes per-example gradient norms for linear layers, using the trick from https://arxiv.org/abs/1510.01799

See example_norms.py for a runnable example. The important parts:

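    !pip install autograd-lib

    import torch

    loss_fn = ...
    data = ...
    model = ...

    activations = {}

    # called for each layer during the forward pass; A is the layer's input
    def save_activations(layer, A, _):
        activations[layer] = A

    # save_activations must be registered as a hook for this pass (see example_norms.py)
    output = model(data)
    loss = loss_fn(output)

    # n is the batch size; the one-element list lets the hook update the tensor in place
    norms = [torch.zeros(n)]

    # called for each layer during the backward pass; B is the backprop value at the layer's output
    def per_example_norms(layer, _, B):
        A = activations[layer]
        # accumulates squared per-example gradient norms, summed over layers
        norms[0] += (A*A).sum(dim=1) * (B*B).sum(dim=1)

    # per_example_norms must be registered as a hook for this pass (see example_norms.py)
    loss.backward()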
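
The hook above relies on a simple identity: for a linear layer, the weight gradient of a single example is the outer product of its backprop vector `b` and its input activation `a`, so its squared Frobenius norm factors into `||a||^2 * ||b||^2`. The following is a minimal, dependency-free sketch of that identity using plain Python lists. The helper names (`outer`, `sq_norm`) and the sample vectors are made up for illustration and are not part of autograd-lib.

```python
# Illustrative check of the factorization used in per_example_norms.
# All names and numbers here are hypothetical; nothing below is autograd-lib API.

def outer(b, a):
    """Outer product b a^T: the weight gradient of one example in a linear layer."""
    return [[bi * aj for aj in a] for bi in b]

def sq_norm(values):
    """Sum of squares of a flat list of numbers."""
    return sum(v * v for v in values)

a = [1.0, 2.0, 3.0]   # input activation of one example
b = [0.5, -1.0]       # backprop value at the layer output for that example

grad = outer(b, a)                                  # len(b) x len(a) matrix
direct = sq_norm([g for row in grad for g in row])  # squared Frobenius norm of the gradient
factored = sq_norm(a) * sq_norm(b)                  # same value, no gradient materialized

print(direct, factored)
```

The factored form only needs the row-wise sums of squares of `A` and `B`, which is why the hook never has to build a per-example gradient tensor.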



# Example 2: Hessian quantities

This example computes the exact Hessian, the Hessian diagonal, and the KFAC approximation for all linear layers of a ReLU network in a single pass.

See example_hessian.py for a self-contained example. The important parts:

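    !pip install autograd-lib

    from collections import defaultdict

    import torch
    from attrdict import AttrDefault  # assumption: AttrDefault comes from the attrdict package

    hess = defaultdict(float)
    hess_diag = defaultdict(float)
    hess_kfac = defaultdict(lambda: AttrDefault(float))

    activations = {}

    def save_activations(layer, A, _):
        activations[layer] = A

        # KFAC left factor
        hess_kfac[layer].AA += torch.einsum("ni,nj->ij", A, A)

    # save_activations must be registered as a hook for this pass (see example_hessian.py)
    output = model(data)
    loss = loss_fn(output, targets)

    def compute_hess(layer, _, B):
        A = activations[layer]
        BA = torch.einsum("nl,ni->nli", B, A)

        # full Hessian
        hess[layer] += torch.einsum('nli,nkj->likj', BA, BA)

        # Hessian diagonal
        hess_diag[layer] += torch.einsum("ni,nj->ij", B * B, A * A)

        # KFAC right factor
        hess_kfac[layer].BB += torch.einsum("ni,nj->ij", B, B)

    # compute_hess is then run during one of the backward passes listed under Variations
    # (see example_hessian.py for the full call)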
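
The "Hessian diagonal" line in `compute_hess` is a shortcut: contracting `B * B` with `A * A` over the batch index gives exactly the entries `[l, i, l, i]` of the full `'nli,nkj->likj'` result, without building the four-index tensor. The sketch below checks this with plain Python loops standing in for `torch.einsum`. The sample matrices and variable names are hypothetical and chosen only for illustration.

```python
# Illustrative check that the "Hessian diagonal" einsum equals the diagonal of
# the "full Hessian" einsum. Plain loops replace torch.einsum; A and B are made up.

A = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]    # n x i: activations (3 examples, 2 inputs)
B = [[2.0, 0.5], [1.0, -2.0], [-1.0, 4.0]]   # n x l: backprop values (3 examples, 2 outputs)

n_ex, n_in, n_out = len(A), len(A[0]), len(B[0])

# 'nli,nkj->likj' applied to BA[n][l][i] = B[n][l] * A[n][i]; indexed as full[l][i][k][j]
full = [[[[sum(B[n][l] * A[n][i] * B[n][k] * A[n][j] for n in range(n_ex))
           for j in range(n_in)]
          for k in range(n_out)]
         for i in range(n_in)]
        for l in range(n_out)]

# "ni,nj->ij" applied to (B * B, A * A); indexed as diag[l][i]
diag = [[sum(B[n][l] ** 2 * A[n][i] ** 2 for n in range(n_ex))
         for i in range(n_in)]
        for l in range(n_out)]

# Compare every diagonal entry of the full tensor with the shortcut.
matches = all(abs(full[l][i][l][i] - diag[l][i]) < 1e-12
              for l in range(n_out) for i in range(n_in))
print(matches)
```

The shortcut stores one number per weight (`n_out * n_in` entries), while the full tensor grows with the square of that count.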



Variations, depending on which backward call is used:

• autograd_lib.backward_hessian for Hessian
• autograd_lib.backward_jacobian for Jacobian squared
• loss.backward() for empirical Fisher Information Matrix
