Deep Learning optimizers developed in the Distributed Algorithms and Systems group (DASLab) @ Institute of Science and Technology Austria (ISTA)

Project description

ISTA DAS Lab Optimization Algorithms Package

This repository contains optimization algorithms for Deep Learning developed by the Distributed Algorithms and Systems (DAS) Lab at the Institute of Science and Technology Austria (ISTA).

Project status

  • June 5th, 2024:
    • DONE: the project is locally installable via pip install .
    • NEXT:
      • working on examples for Sparse M-FAC and Dense M-FAC
  • May 27th, 2024:
    • we are working on resolving the issues with installation via pip.

Installation

We provide a script install.sh that creates a new environment, installs the requirements, and builds the optimizers project. First, clone this repository, then run the installation script:

git clone git@github.com:IST-DASLab/ISTA-DASLab-Optimizers.git
cd ISTA-DASLab-Optimizers
source install.sh

⚠️ Important Notice ⚠️

We found it useful to compile the kernels separately for each CUDA compute capability (CC). For example, for CC 8.6 the CUDA kernels for MicroAdam are installed in the package micro_adam_sm86, while for CC 9.0 they are installed in micro_adam_sm90. Please install this library on each system with a different CC so that all configurations are covered. The code automatically detects the CC version and imports the corresponding package if it is installed; otherwise it throws an error. The code that dynamically detects the CC version can be found here.
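As a rough illustration (not the package's actual code), capability-based dispatch could look like the sketch below; the helper name and the exact package naming scheme are assumptions based on the description above:

import importlib
import torch

def import_kernels_for_current_gpu(prefix='micro_adam'):
    # Query the compute capability of the current CUDA device, e.g. (8, 6) -> "sm86".
    major, minor = torch.cuda.get_device_capability()
    module_name = f'{prefix}_sm{major}{minor}'
    try:
        # Import the kernel package built for this capability, e.g. micro_adam_sm86.
        return importlib.import_module(module_name)
    except ImportError as error:
        raise RuntimeError(
            f'No kernels built for compute capability {major}.{minor}; '
            f'please reinstall the library on this system.'
        ) from error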

How to use optimizers?

We provide a minimal working example with ResNet-18 on CIFAR-10 for the optimizers micro-adam, acdc, sparse-mfac, and dense-mfac:

OPTIMIZER=micro-adam # or any other optimizer listed above
bash run_${OPTIMIZER}.sh

MicroAdam optimizer

from ista_daslab_optimizers import MicroAdam

model = MyCustomModel()

optimizer = MicroAdam(
    model.parameters(), # or some custom parameter groups
    m=10, # sliding window size (number of gradients)
    lr=1e-5, # change accordingly
    quant_block_size=100_000, # 32 or 64 also works
    k_init=0.01, # float between 0 and 1 meaning percentage: 0.01 means 1%
)

# from now on, you can use the variable `optimizer` as any other PyTorch optimizer
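For instance, a standard PyTorch training step works unchanged with the optimizer created above; the data loader and loss below are illustrative placeholders, not part of the package:

import torch.nn.functional as F

# Placeholder training loop: `train_loader` and the loss choice are illustrative,
# `model` and `optimizer` come from the snippet above.
for inputs, targets in train_loader:
    optimizer.zero_grad()
    loss = F.cross_entropy(model(inputs), targets)
    loss.backward()
    optimizer.step()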

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ista_daslab_optimizers-0.0.1.tar.gz (43.0 kB)

Uploaded: Source

Built Distribution

ista_daslab_optimizers-0.0.1-cp39-cp39-manylinux_2_34_x86_64.whl (1.0 MB)

Uploaded: CPython 3.9, manylinux: glibc 2.34+, x86-64
