SOLO

Paper | PyTorch >= 2.3 | torchao >= 0.7.0

Installation

pip install solow

or

pip install git+https://github.com/MTandHJ/SOLO.git

Usage

from solo.adamw import AdamWQ

optimizer = AdamWQ(
    model.parameters(),
    lr = 0.001,
    weight_decay = 0.,
    betas = (0.8, 0.999),
    bits = (4, 2), # (4 bits for signed, 2 bits for unsigned)
    quantile = 0.1,
    block_sizes = (128, 128),
    quantizers = ('de', 'qema'),
    # Tensors smaller than `min_quantizable_tensor_size`
    # are excluded from quantization.
    # For rigorous probing, this value is set to 0 in the paper.
    # A larger value (such as torchao's default of 4096)
    # may yield more stable results.
    min_quantizable_tensor_size = 128
)
  • quantizers:
    • none: The original 32-bit state (no quantization).
    • bf16: The BF16 format.
    • de: The dynamic exponent mapping without stochastic rounding.
    • de-sr: The dynamic exponent mapping with stochastic rounding.
    • linear: The linear mapping without stochastic rounding.
    • linear-sr: The linear mapping with stochastic rounding.
    • qema: The proposed logarithmic quantization.
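To build intuition for what `bits`, `block_sizes`, and the `-sr` (stochastic rounding) variants control, here is a toy blockwise absmax quantizer in plain NumPy. It is an illustrative sketch only, not SOLO's actual kernels; the function name and the exact rounding scheme are ours:

```python
import numpy as np

def quantize_blockwise(x, bits=4, block_size=128, stochastic=False, rng=None):
    """Toy blockwise absmax linear quantization (signed).

    Illustration of `bits`, `block_sizes`, and the `-sr` variants;
    NOT SOLO's actual implementation.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    flat = x.reshape(-1, block_size)            # one scale per block
    qmax = 2 ** (bits - 1) - 1                  # e.g. 7 for 4-bit signed
    scale = np.abs(flat).max(axis=1, keepdims=True) / qmax
    scale[scale == 0] = 1.0                     # avoid division by zero
    y = flat / scale
    if stochastic:
        # Round up with probability equal to the fractional part,
        # so the quantized value is unbiased in expectation.
        floor = np.floor(y)
        q = floor + (rng.random(y.shape) < (y - floor))
    else:
        q = np.round(y)                         # deterministic nearest rounding
    q = np.clip(q, -qmax - 1, qmax)
    return (q * scale).reshape(x.shape)         # dequantized view

x = np.linspace(-1.0, 1.0, 256).astype(np.float32)
deq = quantize_blockwise(x, bits=4, block_size=128)
print(np.abs(deq - x).max())  # worst-case error is at most scale / 2
```

With nearest rounding the error per element is bounded by half a quantization step; with stochastic rounding the per-element error can be a full step, but it is zero-mean, which matters for EMA-style accumulators.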

[!TIP] SOLO can be used together with the Hugging Face transformers `Trainer` by specifying the `optimizer_cls_and_kwargs` parameter.
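The wiring for the tip above can be sketched as follows. This assumes a transformers version whose `Trainer` accepts `optimizer_cls_and_kwargs`; `model` and `train_dataset` are placeholders for your own objects, and the kwargs shown are illustrative:

```python
from transformers import Trainer, TrainingArguments
from solo.adamw import AdamWQ

# `model` and `train_dataset` are placeholders: substitute your own objects.
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out"),
    train_dataset=train_dataset,
    # The optimizer class is passed unconstructed, together with its kwargs.
    optimizer_cls_and_kwargs=(
        AdamWQ,
        {"lr": 1e-3, "bits": (4, 2), "quantizers": ("de", "qema")},
    ),
)
trainer.train()
```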

[!NOTE] DeepSpeed may produce an excessively large flattened tensor, leading to unexpected OOM errors from the intermediate buffers created during quantization. Reducing `sub_group_size` in the DeepSpeed config mitigates this issue.
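As a sketch of the mitigation above, the relevant knob lives under `zero_optimization` in the DeepSpeed config (here expressed as a Python dict); the surrounding keys and the chosen value are illustrative, to be adjusted for your setup:

```python
# Lowering `sub_group_size` shrinks the flattened parameter groups DeepSpeed
# processes at once, and with them the intermediate quantization buffers.
ds_config = {
    "train_micro_batch_size_per_gpu": 8,   # illustrative value
    "zero_optimization": {
        "stage": 3,
        "sub_group_size": 1e7,             # DeepSpeed's documented default is 1e9
    },
}
```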

Reference Code

  • pytorch-optimizer: We implemented the low-bit Adafactor and AdaBelief optimizers based on this code.

Citation

@article{xu2025solo,
  title={Pushing the Limits of Low-Bit Optimizers: A Focus on EMA Dynamics},
  author={Xu, Cong and Liang, Wenbin and Yu, Mo and Liu, Anan and Zhang, Ke-Yue and Ma, Lizhuang and Wang, Jianyong and Wang, Jun and Zhang, Wei},
  journal={arXiv preprint arXiv:2505.00347},
  year={2025}
}


