PyTorch Lion optimizer with updated and advanced features.
Advanced Lion Optimizer
This repository provides an enhanced implementation of the Lion optimizer, incorporating several state-of-the-art techniques to improve performance, stability, and memory efficiency. It includes two base variants: the original Lion and D-Adapt-Lion.
Features
1. Fused Backward Pass
Reduces memory overhead through the step_parameter method, which updates each parameter as soon as its gradient becomes available during the backward pass, so gradients do not need to be stored until a separate optimizer step.
2. Stochastic Rounding for BF16 Training
Achieves FP32-level performance in BF16 training by implementing stochastic rounding for the final update. This allows for faster training with lower-precision data types without sacrificing accuracy.
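A common way to implement stochastic rounding from FP32 to BF16 is to add random noise to the 16 low bits that BF16 discards and then truncate. The sketch below follows that approach; `stochastic_round_to_bf16` is a hypothetical name, and this package's actual implementation may differ.

```python
import torch


def stochastic_round_to_bf16(source: torch.Tensor) -> torch.Tensor:
    """Round an FP32 tensor to BF16, rounding up with probability
    proportional to the discarded fraction (hypothetical helper)."""
    assert source.dtype == torch.float32
    # Reinterpret the FP32 bit patterns as 32-bit integers.
    bits = source.contiguous().view(torch.int32)
    # Uniform noise over the 16 low bits that BF16 cannot represent.
    noise = torch.randint_like(bits, low=0, high=1 << 16)
    # The add carries into the kept bits with the right probability;
    # the mask then zeroes the low 16 bits. (Inf/NaN inputs are not handled.)
    rounded = (bits + noise).bitwise_and_(-65536)
    return rounded.view(torch.float32).to(torch.bfloat16)


torch.manual_seed(0)
x = torch.full((100_000,), 1.0 + 2.0 ** -10)  # not representable in BF16
nearest = x.to(torch.bfloat16)                 # round-to-nearest: always 1.0
stochastic = stochastic_round_to_bf16(x)       # mix of 1.0 and 1.0078125
```

With round-to-nearest, an update smaller than half the BF16 spacing is lost every time. With stochastic rounding it is preserved in expectation, which is why small Lion steps can still accumulate in BF16 weights.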
- References:
3. Gradient Orthogonalization
Improves model generalization and enhances numerical stability by modifying gradients to be orthogonal to the parameters.
- Reference: "Grokking at the Edge of Numerical Stability"
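As an illustration, making a gradient orthogonal to its parameter amounts to subtracting the gradient's projection onto the flattened weight vector. The helper below is a minimal sketch under that assumption; `orthogonalize_grad` and `eps` are hypothetical names, and the package may additionally rescale the result or treat some parameters differently.

```python
import torch


def orthogonalize_grad(param: torch.Tensor, grad: torch.Tensor, eps: float = 1e-30) -> torch.Tensor:
    """Return grad with its component along param removed (hypothetical helper)."""
    w = param.detach().reshape(-1).float()
    g = grad.reshape(-1).float()
    # Projection coefficient <w, g> / <w, w>; eps guards against all-zero weights.
    coeff = torch.dot(w, g) / (torch.dot(w, w) + eps)
    return (g - coeff * w).reshape(grad.shape).to(grad.dtype)


torch.manual_seed(0)
w = torch.randn(3, 5)
g = torch.randn(3, 5)
g_perp = orthogonalize_grad(w, g)
inner = (g_perp * w).sum()  # close to zero up to floating-point error
```

The removed component is the part of the gradient that would only rescale the weights, so the remaining update changes their direction rather than their norm.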
4. Variance Reduction
In theory, improves Lion's convergence rate by 33.33% and makes training more stable in noisy, small-batch settings. The main trade-off is one additional state tensor per parameter, which stores the gradient from the previous step.
5. Cautious Lion Variant
Includes the "Cautious" variant of Lion, which applies only those components of the update whose sign agrees with the current gradient.
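A minimal sketch of the cautious masking step applied to Lion's signed update is shown below. `cautious_lion_update` and its `beta1` default are assumptions for illustration; some cautious-optimizer implementations also rescale the surviving entries to compensate for the masked ones, which is omitted here.

```python
import torch


def cautious_lion_update(grad: torch.Tensor, exp_avg: torch.Tensor, beta1: float = 0.9) -> torch.Tensor:
    """Lion update direction with entries that disagree with grad zeroed out
    (hypothetical helper)."""
    # Standard Lion direction: sign of the momentum/gradient interpolation.
    update = exp_avg.mul(beta1).add(grad, alpha=1.0 - beta1).sign()
    # Keep only coordinates where the update and the gradient share a sign.
    mask = (update * grad > 0).to(update.dtype)
    return update * mask


exp_avg = torch.tensor([1.0, -1.0, 1.0])
grad = torch.tensor([1.0, 1.0, -0.05])
update = cautious_lion_update(grad, exp_avg)
# Only the first coordinate survives: momentum and gradient agree there.
```

The parameter step would then be `p.add_(update, alpha=-lr)`, exactly as in plain Lion.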
6. Per-Parameter Gradient Norm Clipping
Enhances training stability by applying gradient norm clipping on a per-parameter basis, preventing erratic updates from large gradients.
- Reference: "Lions and Muons: Optimization via Stochastic Frank-Wolfe" (The paper uses a clipping value of 4-5).
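Per-parameter clipping differs from global clipping in that each tensor's gradient is rescaled using its own norm. The sketch below assumes an L2 norm and a default threshold of 4.0 taken from the range quoted above; `clip_grad_per_param_`, `max_norm`, and `eps` are hypothetical names, not this package's API.

```python
import torch


def clip_grad_per_param_(params, max_norm: float = 4.0, eps: float = 1e-6) -> None:
    """Clip each parameter's gradient to max_norm independently, in place
    (hypothetical helper)."""
    for p in params:
        if p.grad is None:
            continue
        norm = torch.linalg.vector_norm(p.grad.float())
        # Scale is 1 when the norm is already within the limit.
        scale = torch.clamp(max_norm / (norm + eps), max=1.0)
        p.grad.mul_(scale.to(p.grad.dtype))


a = torch.nn.Parameter(torch.zeros(2))
a.grad = torch.tensor([30.0, 40.0])  # norm 50, will be scaled down to about 4
b = torch.nn.Parameter(torch.zeros(2))
b.grad = torch.tensor([0.3, 0.4])    # norm 0.5, left unchanged
clip_grad_per_param_([a, b], max_norm=4.0)
```

One large gradient in a single layer therefore does not shrink the gradients of every other layer, as it would with a single global norm.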
File details
Details for the file adv_lion-0.1.1.tar.gz.
File metadata
- Download URL: adv_lion-0.1.1.tar.gz
- Upload date:
- Size: 12.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 214e6b26c84ec502aeb396439d4bde2d908700c44c0f3cd753daaef0fe13b744 |
| MD5 | aa217388e10c9633646dfab2f8c1cac8 |
| BLAKE2b-256 | 829368d424a66151232316f6a0f4d6fccb67641676f05161c930899e52038d79 |
File details
Details for the file adv_lion-0.1.1-py3-none-any.whl.
File metadata
- Download URL: adv_lion-0.1.1-py3-none-any.whl
- Upload date:
- Size: 14.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 910aa94b15052e516aa25c82a69d1b034e1ff6652de8885e563d89070625d0cb |
| MD5 | 551ad26d657c836dcb53c322d37cdc26 |
| BLAKE2b-256 | 038ce9d7a399148114391c9960177ca52db044915d29585b330ea47f53e4e1b3 |