# layer-to-layer-pytorch

PyTorch implementation of the L2L execution algorithm from the paper [Training Large Neural Networks with Constant Memory using a New Execution Algorithm](https://arxiv.org/abs/2002.05645).
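The core idea of L2L is to keep the whole model in host memory and stage one layer at a time on the compute device, streaming the minibatch through it as a series of microbatches, so device memory stays roughly constant in the depth of the network. Below is a minimal forward-pass illustration of that idea in plain PyTorch; it is a conceptual sketch, not this library's internals:

```python
import torch
from torch import nn

# Use the GPU if available; the sketch also runs on CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

layers = nn.ModuleList([nn.Linear(40, 40) for _ in range(4)])  # stays on host
x = torch.rand(1_000, 40)
microbatches = list(x.split(100))  # minibatch -> list of microbatches

with torch.no_grad():
    for layer in layers:
        layer.to(device)  # stage a single layer on the device
        # Stream every microbatch through this layer, keeping activations on host
        microbatches = [layer(mb.to(device)).cpu() for mb in microbatches]
        layer.cpu()  # release device memory before staging the next layer

out = torch.cat(microbatches)
print(out.shape)  # torch.Size([1000, 40])
```

Only one layer and one microbatch occupy the device at any moment; the backward pass in the actual algorithm works analogously, accumulating gradients per layer.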
## 🚀 Example
First, define a torch model whose layers are stored in an `nn.ModuleList`. For example:
```python
from typing import Optional

import torch
from torch import nn, optim


class M(nn.Module):
    def __init__(self, depth: int, dim: int, hidden_dim: Optional[int] = None):
        super().__init__()
        hidden_dim = hidden_dim or dim

        self.layers = nn.ModuleList(
            [
                nn.Sequential(
                    nn.Linear(dim, hidden_dim),
                    nn.BatchNorm1d(hidden_dim),
                    nn.LeakyReLU(),
                )
            ]
            + [
                nn.Sequential(
                    nn.Linear(hidden_dim, hidden_dim),
                    nn.BatchNorm1d(hidden_dim),
                    nn.LeakyReLU(),
                )
                for _ in range(depth)
            ]
            + [nn.Linear(hidden_dim, dim), nn.Sigmoid()]
        )

    def forward(self, batch: torch.Tensor) -> torch.Tensor:
        x = batch
        for layer in self.layers:
            x = layer(x)
        return x
```
Then you can wrap this model with `Layer2Layer`:

```python
from layer_to_layer_pytorch.l2l import Layer2Layer

model = M(depth=5, dim=40).train()  # the main model stays on CPU

l2l_model = Layer2Layer(
    model,
    layers_attr="layers",   # name of the attribute holding the ModuleList
    microbatch_size=100,    # number of samples per microbatch within a minibatch, as in the original paper
    verbose=False,          # set True to show tqdm progress bars
)
```
And train it (almost) like a regular torch model:

```python
from tqdm.auto import tqdm, trange

x = torch.rand(1_000, 40)  # on CPU
y = torch.rand(1_000, 40)  # on CPU

losses = []
loss_fn = nn.MSELoss(reduction="sum")  # L2L averages the loss itself, so per-microbatch losses are summed

# The optimizer works with the main model's parameters on the CPU
optimizer = optim.AdamW(l2l_model.main_model.parameters(), lr=0.001)

for i in trange(5000):
    l2l_model.zero_grad()
    l2l_model.forward(x)

    loss_value = l2l_model.backward(x, y, loss_fn)
    if i % 50 == 0:
        tqdm.write(f"[{i}] loss = {loss_value.item()}")
        losses.append(loss_value.item())

    optimizer.step()
```
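A note on the `reduction="sum"` choice in the training loop above: summing squared errors per microbatch and dividing once by the total element count reproduces the full-batch mean loss exactly, no matter how the minibatch is split, whereas `reduction="mean"` would only match when all microbatches have equal size. A quick check in plain PyTorch:

```python
import torch
from torch import nn

pred = torch.rand(1_000, 40)
target = torch.rand(1_000, 40)

# Mean loss over the whole minibatch at once
full_mean = nn.MSELoss(reduction="mean")(pred, target)

# Sum per-microbatch losses, then divide by the total number of elements
# (MSELoss with reduction="mean" divides by numel, not by batch size)
sum_fn = nn.MSELoss(reduction="sum")
total = sum(sum_fn(p, t) for p, t in zip(pred.split(100), target.split(100)))
micro_mean = total / pred.numel()

print(torch.allclose(full_mean, micro_mean, atol=1e-6))  # True
```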
## Installation

```bash
pip install layer-to-layer-pytorch
```

or install with [Poetry](https://python-poetry.org/):

```bash
poetry add layer-to-layer-pytorch
```
## 📈 Releases

You can see the list of available releases on the [GitHub Releases](https://github.com/TezRomacH/layer-to-layer-pytorch/releases) page.

We follow the [Semantic Versions](https://semver.org/) specification.
## 🛡 License

This project is licensed under the terms of the MIT license. See LICENSE for more details.
## 📃 Citation

This library:

```bibtex
@misc{layer-to-layer-pytorch,
  author = {Roman Tezikov},
  title = {PyTorch implementation of L2L execution algorithm},
  year = {2020},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/TezRomacH/layer-to-layer-pytorch}}
}
```
Original paper:

```bibtex
@article{Pudipeddi2020TrainingLN,
  title = {Training Large Neural Networks with Constant Memory using a New Execution Algorithm},
  author = {Bharadwaj Pudipeddi and Maral Mesmakhosroshahi and J. Xi and S. Bharadwaj},
  journal = {ArXiv},
  year = {2020},
  volume = {abs/2002.05645}
}
```
## Credits

This project was generated with [`python-package-template`](https://github.com/TezRomacH/python-package-template).