
cuPyLMA: a Multi-GPU Levenberg-Marquardt Optimizer powered by cuPyNumeric


Background | Installation | Training | Examples | Performance | Change logs

cuPyLMA is a scalable multi-GPU deep learning optimizer that implements the Levenberg-Marquardt algorithm (LMA). The library is built on PyTorch and NVIDIA cuPyNumeric (a NumPy-like distributed scientific computing framework).

Background

The Levenberg-Marquardt algorithm (LMA) is a second-order optimization algorithm that utilizes the Jacobian matrix of the residuals to compute optimal parameter updates. In contrast, the widely used first-order optimizer Adam relies on the gradient of the loss function to determine these updates.

$$ \large (\mathbf{J}^T\mathbf{J}+\lambda \mathbf{I})\,\Delta\mathbf{x} = \mathbf{J}^T\mathbf{r} $$

($\mathbf{J}$: Jacobian matrix of the residuals, $\mathbf{r}$: residuals, $\Delta\mathbf{x}$: parameter update to be solved for)
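The update equation above can be illustrated with a tiny least-squares problem. The sketch below performs a single damped step with plain NumPy; it is only an illustration of the math, not cuPyLMA's internal implementation.

```python
import numpy as np

# Toy problem: fit y = a*x + b, with one Levenberg-Marquardt step.
xs = np.array([0.0, 1.0, 2.0, 3.0])
ys = 2.0 * xs + 1.0                  # ground truth: a = 2, b = 1

params = np.array([0.0, 0.0])        # initial guess for (a, b)

def residuals(p):
    return p[0] * xs + p[1] - ys     # r = f(x) - y

# Jacobian of the residuals w.r.t. the parameters (a, b)
J = np.stack([xs, np.ones_like(xs)], axis=1)

lam = 1e-3                           # damping factor lambda
r = residuals(params)
# Solve (J^T J + lambda*I) dx = J^T r for the update dx
dx = np.linalg.solve(J.T @ J + lam * np.eye(2), J.T @ r)
params = params - dx                 # apply the LM update
```

Because the model here is linear in its parameters, a single step lands almost exactly on the optimum; for nonlinear models the step is iterated and $\lambda$ is adapted.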

The LMA has the following advantages and disadvantages compared to Adam:

  • Pros
    • Faster convergence.
    • Better solutions, thanks to the second-order information.
  • Cons
    • Higher memory and computation requirements: the Jacobian matrix must be computed and the linear system solved, which becomes especially costly for models with many parameters.
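To make the memory cost concrete, here is a back-of-the-envelope calculation with illustrative numbers (1M parameters, 1024 residuals per batch); the figures are not taken from any cuPyLMA benchmark.

```python
# Rough Jacobian memory estimate: one row per residual, one column per
# parameter, stored in float32.
n_params = 1_000_000
n_residuals = 1024
bytes_fp32 = 4
jacobian_bytes = n_residuals * n_params * bytes_fp32
print(jacobian_bytes / 1e9)  # ~4.1 GB for a single batch
```

A single batch already approaches the capacity of one GPU, which is why cuPyLMA shards this matrix across the optimizer component's GPUs.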

cuPyLMA aims to resolve the memory and computation bottlenecks of the LMA by distributing the work across multiple GPUs.

Installation

To install cuPyLMA along with dependencies, please run:

pip install cupylma

Training

It is easy to migrate training code from the Adam optimizer to cuPyLMA. cuPyLMA consists of the following components, each holding a separate set of GPUs:

  • Model component stores the model parameters and computes the Jacobian matrix.
  • Optimizer component stores the Jacobian matrix and computes the optimal parameter updates.

Creating the model

The model should be placed on one of the GPUs held by the model component. The get_available_gpus() function returns the list of GPUs available to the model component.

from cupylma import get_available_gpus

devices = get_available_gpus()
model = MyModel().to(devices[0])

Configuring the optimizer

The LMA optimizer requires a residual function rather than a loss function. The devices argument specifies the GPUs for the model component.

from cupylma import LMA

residual_fn = lambda a, b: a - b  # for simple regression
lma = LMA(model, devices, residual_fn)

To find the residual function for more complex problems, please check examples/mnist.
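As a taste of what a more complex residual function can look like, the sketch below defines softmax-based residuals for classification. This is a hypothetical illustration written with NumPy arrays (the `softmax` and `classification_residual` names are ours); it is not the exact function used in examples/mnist, where the residuals operate on PyTorch tensors.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the class axis.
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def classification_residual(logits, onehot_targets):
    # One residual per (sample, class): predicted probability minus target.
    return softmax(logits) - onehot_targets
```

Each element of the returned array is one residual, so the Jacobian has one row per (sample, class) pair.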

Training

The LMA optimizer is stateless, so there is no need to reset gradients at each step. The loss return value is the average loss, and the terminated return value indicates whether training should stop.

loss, terminated = lma.step(x, y)
if terminated:
    break  # exit the training loop and save the model
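Embedded in a full loop, the control flow looks like the sketch below. To keep the snippet self-contained and runnable, `dummy_step` stands in for `lma.step`; in real code you would call `lma.step(x, y)` on batches from your data loader.

```python
# Control-flow sketch of a training loop driven by (loss, terminated).
def make_dummy_step():
    calls = {"n": 0}
    def step(x, y):
        calls["n"] += 1
        loss = 1.0 / calls["n"]          # pretend the loss shrinks each step
        return loss, loss < 0.2          # signal termination at low loss
    return step

step = make_dummy_step()
for epoch in range(100):
    loss, terminated = step(None, None)  # real code: lma.step(x, y)
    if terminated:
        break                            # exit training and save the model
```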

Running the code

The legate command is installed together with cuPyLMA. The number of GPUs for the optimizer component is specified with the --gpus option.

legate --gpus 3 train.py

Examples

Performance

TODO

References

[1] fabiodimarco/torch-levenberg-marquardt: the repository our base code is derived from.

[2] H. P. Gavin, "The Levenberg-Marquardt algorithm for nonlinear least squares curve-fitting problems," 2024: a theoretical explanation of the LMA.

Citation

J. Taylor, W. Wang, B. Bala, and T. Bednarz, "Optimizing the optimizer for data driven deep neural networks and physics informed neural networks," May 16, 2022, arXiv:2205.07430. doi: 10.48550/arXiv.2205.07430.
