Deep Learning optimizers developed in the Distributed Algorithms and Systems group (DASLab) @ Institute of Science and Technology Austria (ISTA)
Project description
ISTA DAS Lab Optimization Algorithms Package
This repository contains optimization algorithms for Deep Learning developed by the Distributed Algorithms and Systems lab at Institute of Science and Technology Austria.
Project status
- June 5th, 2024:
  - DONE: the project is locally installable via `pip install .`
  - NEXT: working on examples for Sparse M-FAC and Dense M-FAC
- May 27th, 2024:
  - we are currently working on solving the issues with the installation via `pip`
Installation
We provide a script `install.sh` that creates a new environment, installs the requirements, and then builds the optimizers project. First, clone this repository, then run the installation script:
git clone git@github.com:IST-DASLab/ISTA-DASLab-Optimizers.git
cd ISTA-DASLab-Optimizers
source install.sh
⚠️ Important Notice ⚠️
We noticed it is useful to compile the kernels separately for each individual CUDA capability. For example, for CUDA capability (CC) 8.6, the CUDA kernels for MicroAdam will be installed in the package `micro_adam_sm86`, while for CC 9.0 they will be installed in the package `micro_adam_sm90`. Please install this library on each system where the CC differs to cover all the cases relevant to your setup. The code automatically detects the CC version and imports the correct package if it is installed; otherwise it throws an error. The code that dynamically detects the CC version can be found here.
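The snippet below is a minimal sketch of how such capability-based dispatch can work; the helper name `import_kernels_for_current_gpu` and the exact package-naming scheme are illustrative assumptions, not the library's actual API (see the repository for the real detection code).

```python
# Illustrative sketch of capability-based kernel dispatch (not the library's actual code).
import importlib

import torch

def import_kernels_for_current_gpu(base_name: str = "micro_adam"):
    """Import the kernel package compiled for the CUDA capability of GPU 0."""
    major, minor = torch.cuda.get_device_capability(0)  # e.g. (8, 6) for CC 8.6
    module_name = f"{base_name}_sm{major}{minor}"       # e.g. "micro_adam_sm86"
    try:
        return importlib.import_module(module_name)
    except ImportError as e:
        raise ImportError(
            f"No kernels found for CC {major}.{minor}; "
            f"install the library on this system so that {module_name} gets built."
        ) from e
```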
How to use optimizers?
We provide a minimal working example with ResNet-18 and CIFAR-10 for the optimizers `micro-adam`, `acdc`, `sparse-mfac`, and `dense-mfac`:
OPTIMIZER=micro-adam # or any other optimizer listed above
bash run_${OPTIMIZER}.sh
MicroAdam optimizer
from ista_daslab_optimizers import MicroAdam
model = MyCustomModel()
optimizer = MicroAdam(
model.parameters(), # or some custom parameter groups
m=10, # sliding window size (number of gradients)
lr=1e-5, # change accordingly
quant_block_size=100_000, # 32 or 64 also works
k_init=0.01, # float between 0 and 1 meaning percentage: 0.01 means 1%
)
# from now on, you can use the variable `optimizer` as any other PyTorch optimizer
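For context, a training step with MicroAdam looks like a standard PyTorch loop. The sketch below assumes a `train_loader`, a loss criterion, and the `model`/`optimizer` created above; these placeholders are illustrative and not part of the library.

```python
# Illustrative training loop; `train_loader` is an assumed placeholder.
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
criterion = torch.nn.CrossEntropyLoss()

for inputs, targets in train_loader:
    inputs, targets = inputs.to(device), targets.to(device)
    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()
    optimizer.step()  # MicroAdam is used like any other torch.optim optimizer here
```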
Project details
Download files
Source Distribution: ista_daslab_optimizers-0.0.1.tar.gz
Built Distribution: ista_daslab_optimizers-0.0.1-cp39-cp39-manylinux_2_34_x86_64.whl
Hashes for ista_daslab_optimizers-0.0.1.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 18aceda9d7b39d534545a4451d97e6a32f0f8d99b471652146a344976fb950ef
MD5 | 69aef743672509b4a4d75827bf46d40f
BLAKE2b-256 | 8d19e53d83bf2c152586db491f81a41bbbaa2ead77ce0fd004333db8fe2de809
Hashes for ista_daslab_optimizers-0.0.1-cp39-cp39-manylinux_2_34_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 5f7f49a105fb4f173f1e2fdbeb8f080b390336ef978a4a763a1cd2bf71bccf9c
MD5 | 8f438a0c59d080a5a94b80232cd55c65
BLAKE2b-256 | e5cf3ffbcf4f442afc27c1ed0ac21726558da621c05e31aba821e75ef5ec9ff1
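To check that a downloaded artifact matches the published digests, you can compare its SHA-256 hash locally. The sketch below uses Python's standard `hashlib` and assumes the source distribution sits in the current directory.

```python
# Verify a downloaded distribution against the published SHA256 digest.
import hashlib

expected = "18aceda9d7b39d534545a4451d97e6a32f0f8d99b471652146a344976fb950ef"

with open("ista_daslab_optimizers-0.0.1.tar.gz", "rb") as f:
    digest = hashlib.sha256(f.read()).hexdigest()

assert digest == expected, "hash mismatch: the file may be corrupted or tampered with"
print("SHA256 matches the published digest")
```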