# SOLO

Paper | PyTorch >= 2.3 | torchao >= 0.7.0
## Installation

```bash
pip install solow
```

or

```bash
pip install git+https://github.com/MTandHJ/SOLO.git
```
## Usage

```python
from solo.adamw import AdamWQ

optimizer = AdamWQ(
    model.parameters(),
    lr=0.001,
    weight_decay=0.,
    betas=(0.8, 0.999),
    bits=(4, 2),  # (4 bits for signed states, 2 bits for unsigned states)
    quantile=0.1,
    block_sizes=(128, 128),
    quantizers=('de', 'qema'),
    # A tensor whose size is less than `min_quantizable_tensor_size`
    # will be excluded from quantization.
    # For rigorous probing, this value is set to 0 in the paper.
    # Assigning a larger value (such as the default of 4096 in torchao)
    # may yield more stable results.
    min_quantizable_tensor_size=128,
)
```
Available `quantizers`:

- `none`: the original 32-bit state.
- `bf16`: the BF16 format.
- `de`: the dynamic exponent mapping without stochastic rounding.
- `de-sr`: the dynamic exponent mapping with stochastic rounding.
- `linear`: the linear mapping without stochastic rounding.
- `linear-sr`: the linear mapping with stochastic rounding.
- `qema`: the proposed logarithmic quantization.
- `qema-unbiased`: the proposed logarithmic quantization with unbiased stochastic rounding.
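The block-wise idea behind `block_sizes` can be sketched in plain Python: each block of optimizer state gets its own scale, and values are stored as low-bit integer codes. The sketch below is an illustrative absmax linear mapping only (the function names are hypothetical and this is not SOLO's actual `de`/`qema` implementation):

```python
import math

def quantize_block(values, bits=4):
    """Quantize one block to signed `bits`-bit codes via absmax scaling.

    Illustrative sketch only: SOLO's `linear`/`de`/`qema` quantizers use
    more elaborate mappings, but block-wise scaling is the shared idea.
    """
    qmax = 2 ** (bits - 1) - 1                      # e.g. 7 for 4-bit signed
    scale = max(abs(v) for v in values) / qmax or 1.0
    codes = [round(v / scale) for v in values]      # integers in [-qmax, qmax]
    return codes, scale

def dequantize_block(codes, scale):
    """Recover approximate float values from codes and the block scale."""
    return [c * scale for c in codes]

# One 128-element block, matching block_sizes=(128, 128) above.
block = [math.sin(i / 7.0) for i in range(128)]
codes, scale = quantize_block(block, bits=4)
recovered = dequantize_block(codes, scale)
max_err = max(abs(a - b) for a, b in zip(block, recovered))
```

With absmax scaling, the reconstruction error per element is bounded by half the block scale, which is why smaller blocks (at the cost of storing more scales) quantize more accurately.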
> [!TIP]
> SOLO can be used in conjunction with the `Trainer` by specifying the `optimizer_cls_and_kwargs` parameter.
> [!NOTE]
> DeepSpeed may produce an excessively large tensor, leading to unexpected OOM errors caused by intermediate buffers during the quantization process. It is recommended to reduce `sub_group_size` to mitigate this issue.
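For reference, `sub_group_size` lives under `zero_optimization` in the DeepSpeed JSON config; a sketch lowering it from DeepSpeed's default (the value `1e8` here is an illustrative choice, not a recommendation from the SOLO authors):

```json
{
  "zero_optimization": {
    "stage": 3,
    "sub_group_size": 1e8
  }
}
```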
## Reference Code

- pytorch-optimizer: We implemented the low-bit Adafactor and AdaBelief optimizers based on this code.
## Citation

```bibtex
@article{xu2025solo,
  title={Pushing the Limits of Low-Bit Optimizers: A Focus on EMA Dynamics},
  author={Xu, Cong and Liang, Wenbin and Yu, Mo and Liu, Anan and Zhang, Ke-Yue and Ma, Lizhuang and Wang, Jianyong and Wang, Jun and Zhang, Wei},
  journal={arXiv preprint arXiv:2505.00347},
  year={2025}
}
```
## Download files
Source Distribution
Built Distribution
## File details
Details for the file solow-0.2.0.tar.gz.
File metadata
- Download URL: solow-0.2.0.tar.gz
- Upload date:
- Size: 18.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.9.21
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `63e7af44660d3a41fcef3d0e6a155b354631dcd1b41039598366cc2381f8380f` |
| MD5 | `fe5bd3f067c2685e0afb3bb48fe19db0` |
| BLAKE2b-256 | `4a84b78eb73ee8495428b576d1654d84ca58bf0534528c0b89562c2549757a15` |
## File details
Details for the file solow-0.2.0-py3-none-any.whl.
File metadata
- Download URL: solow-0.2.0-py3-none-any.whl
- Upload date:
- Size: 70.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.9.21
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `ef6ac935d43f1f2d3fe576e73422e38901da63c7974cfde346c4b168281df97a` |
| MD5 | `80b06c6d4a102915fbfd1217e02e3ff6` |
| BLAKE2b-256 | `7df22c1384a17dca80e83a20809dcad8cef2d5292f3c21c3767265f692cc2647` |