Shampoo (Second-Order Optimizer for Deep Learning) Optax Optimizer
Project description
Optimization in machine learning, both theoretical and applied, is presently dominated by first-order gradient methods such as stochastic gradient descent. Second-order optimization methods, which involve second derivatives and/or second-order statistics of the data, are far less prevalent despite their strong theoretical properties, because of their prohibitive computation, memory, and communication costs.
Here we present a scalable implementation of a second-order preconditioning method (concretely, a variant of full-matrix Adagrad) that provides significant convergence and wall-clock-time improvements over conventional first-order methods on state-of-the-art deep models.
Paper preprint: https://arxiv.org/abs/2002.09018
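The core idea described above can be sketched in plain NumPy for a single 2-D parameter: accumulate left and right gradient statistics and precondition the gradient with their -1/4 matrix powers, as in the Shampoo paper linked above. This is an illustrative sketch only; the function names (`matrix_power`, `shampoo_step`) and hyperparameters are hypothetical and are not this package's API.

```python
import numpy as np

def matrix_power(mat, power, eps=1e-6):
    """mat**power for a symmetric PSD matrix, via eigendecomposition."""
    w, v = np.linalg.eigh(mat)
    w = np.maximum(w, 0.0) + eps  # clamp eigenvalues for numerical stability
    return (v * w**power) @ v.T

def shampoo_step(params, grad, L, R, lr=0.1):
    """One (illustrative) Shampoo update for a 2-D parameter matrix.

    L and R accumulate second-order statistics of the gradients; the
    gradient is preconditioned on both sides with their -1/4 powers.
    """
    L += grad @ grad.T        # left statistics  (rows)
    R += grad.T @ grad        # right statistics (columns)
    precond = matrix_power(L, -0.25) @ grad @ matrix_power(R, -0.25)
    return params - lr * precond, L, R

# Minimal demo on the quadratic loss 0.5 * ||W - T||^2.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))
T = np.zeros((4, 3))
L = 1e-3 * np.eye(4)   # small initial statistics (epsilon * I)
R = 1e-3 * np.eye(3)

loss0 = float(np.sum((W - T) ** 2))
for _ in range(50):
    grad = W - T       # gradient of the quadratic loss
    W, L, R = shampoo_step(W, grad, L, R)
loss1 = float(np.sum((W - T) ** 2))
print(loss0, "->", loss1)
```

In the full algorithm the same scheme is applied per layer, with additional machinery (blocking of large dimensions, infrequent inverse-root computation, grafting) to keep the computation and memory costs tractable.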
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Hashes for optax_shampoo-0.0.4-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | a24de229d5f8f148bf12dfdc05d6a114e41c735f448b687026468389f8cc4de0
MD5 | 339fda57c570000e509188d941433bde
BLAKE2b-256 | cd8428bb5866e38900a39b37fc7aa7c008c9736aa75c8211aeb1a607e61243d2