Project description
Advanced PyTorch Optimizer: SophiaOptim
Overview
This project introduces SophiaOptim, my take on the Sophia (Second-order Clipped Stochastic Optimization) optimizer: a custom PyTorch optimizer designed to improve the training performance of deep learning models. SophiaOptim combines Lookahead mechanics and an Exponential Moving Average (EMA) of the model weights with adaptive learning rates and Hessian-based updates for efficient, robust optimization.
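For context, a Sophia-style step (as described in the original Sophia paper, which this optimizer is modeled on; the package's exact formula may differ) clips a momentum term by a Hessian-diagonal estimate before applying it:

$$
m_t = \beta_1 m_{t-1} + (1-\beta_1)\,g_t, \qquad
\theta_{t+1} = \theta_t - \eta \cdot \mathrm{clip}\!\left(\frac{m_t}{\max(\rho\, h_t,\ \epsilon)},\ 1\right)
$$

Here $g_t$ is the stochastic gradient, $h_t$ is the Hutchinson estimate of the Hessian diagonal, and $\mathrm{clip}(\cdot, 1)$ clamps each coordinate to $[-1, 1]$; $\rho$ and $\epsilon$ correspond to the rho and eps constructor arguments shown under Usage.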
Features
Lookahead Mechanism: Implements the Lookahead optimization strategy, maintaining two sets of weights for a more stable and consistent training process.
Exponential Moving Average (EMA): Keeps an EMA of the model weights for smoother optimization; the averaged weights can be used at evaluation time for better generalization.
Adaptive Learning Rate: Scales each update step by an estimate of the Hessian diagonal, allowing for more informed steps.
Hutchinson's Hessian Estimation: Estimates the Hessian diagonal efficiently with Hutchinson's method, incorporating second-order information without the computational overhead of a full Hessian computation (a minimal sketch follows this list).
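To make the Hutchinson step concrete, here is a minimal, self-contained PyTorch sketch of the estimator. It is illustrative only, not the package's internal code, and the function name hutchinson_diag_hessian is hypothetical. The idea: draw Rademacher probe vectors v and average v * (Hv), which in expectation equals the Hessian diagonal; the Hessian-vector product comes from a second backward pass through the gradients.

import torch

def hutchinson_diag_hessian(loss, params, num_samples=1):
    # Estimate diag(H) of `loss` w.r.t. `params` via Hutchinson's method.
    # Illustrative sketch only -- SophiaOptim maintains its own estimate internally.
    params = list(params)
    grads = torch.autograd.grad(loss, params, create_graph=True)
    diag = [torch.zeros_like(p) for p in params]
    for _ in range(num_samples):
        # Rademacher probe vectors: entries are +1 or -1 with equal probability.
        vs = [torch.randint_like(p, 2) * 2.0 - 1.0 for p in params]
        # Hessian-vector products Hv via a second backward pass through the gradients.
        hvs = torch.autograd.grad(grads, params, grad_outputs=vs, retain_graph=True)
        for d, v, hv in zip(diag, vs, hvs):
            d.add_(v * hv / num_samples)  # running average of v * (Hv), whose expectation is diag(H)
    return diag

Run on a mini-batch loss, this produces the per-parameter curvature estimate that a Sophia-style optimizer scales by rho when clipping its updates.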
Installation
To use SophiaOptim in your project, ensure you have PyTorch installed. Install the package and import SophiaOptim and Lookahead into your training script:
pip install sophia-optim
from sophia_optim import SophiaOptim, Lookahead
Usage
To integrate SophiaOptim into your training loop, initialize the optimizer with your model's parameters and specify any desired configurations:
optimizer = SophiaOptim(model.parameters(), lr=1e-3, betas=(0.9, 0.999), eps=1e-8, rho=0.1, weight_decay=0.01, ema_decay=0.999)
Use the optimizer in your training loop as you would with any standard PyTorch optimizer. Remember to update the EMA weights and apply them for model evaluation.
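Below is a minimal end-to-end sketch under a few assumptions: the toy model, data, and loader are placeholders; the Lookahead wrapper is assumed to follow the common Lookahead(base_optimizer, k, alpha) convention and to expose the standard zero_grad/step interface; and the EMA update call is hypothetical, since the package's exact EMA API is not shown here.

import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from sophia_optim import SophiaOptim, Lookahead

# Toy model and data purely for illustration.
model = nn.Linear(20, 2)
criterion = nn.CrossEntropyLoss()
data = TensorDataset(torch.randn(128, 20), torch.randint(0, 2, (128,)))
loader = DataLoader(data, batch_size=32, shuffle=True)

base_opt = SophiaOptim(model.parameters(), lr=1e-3, betas=(0.9, 0.999), eps=1e-8,
                       rho=0.1, weight_decay=0.01, ema_decay=0.999)
# Assumed wrapper signature, following the usual Lookahead convention
# (k = slow-weight update interval, alpha = interpolation factor);
# check the package for its actual arguments.
optimizer = Lookahead(base_opt, k=5, alpha=0.5)

for epoch in range(3):
    for inputs, targets in loader:
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        optimizer.step()
        # Hypothetical call -- the exact EMA update/apply API is package-specific:
        # base_opt.update_ema()
# For evaluation, load the EMA weights into the model (again, the API name is package-specific).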
Hashes for sophia_optim-0.0.6-py3-none-any.whl
Algorithm | Hash digest
--- | ---
SHA256 | 43dbf70ff15b30b9149224e2015d44d7b3a36dc67b067931459305e64cc6737a
MD5 | 27e88cc30c221f03b81c04c9ddae4b4a
BLAKE2b-256 | e859e5f014acab2faa0ee72e7cb81e28e58b69de2988fa26b48c30c2a8313b5f