
AdamW Optimizer for bfloat16

Project description

AdamW optimizer for bfloat16 in PyTorch

This is a version of the AdamW optimizer for PyTorch that, in ViT training tests, matches the results of training with weights stored in float32 (with operations in float32 or in bfloat16 via autocast). Keeping the weights in bfloat16 roughly halves the memory they occupy. The optimizer achieves this using stochastic rounding and a correction term.

There is a small (~10-20%) performance hit depending on your hardware.
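To illustrate the stochastic-rounding idea, here is a minimal sketch of unbiased rounding from float32 to bfloat16. This is an assumption about the general technique, not the package's actual implementation: bfloat16 keeps the top 16 bits of a float32, so adding uniform noise to the discarded low 16 bits before truncating makes the expected value of the rounded result equal the input, instead of systematically losing the low bits.

```python
import torch

def stochastic_round_to_bf16(x: torch.Tensor) -> torch.Tensor:
    """Round a float32 tensor to bfloat16 stochastically.

    A plain cast truncates (or rounds-to-nearest) the low 16 bits of
    each float32. Here we add uniform random noise to those low bits
    before clearing them, so E[rounded] == x. Overflow at the very top
    of the float32 range is ignored in this sketch.
    """
    assert x.dtype == torch.float32
    bits = x.view(torch.int32)                  # reinterpret bits, no copy
    noise = torch.randint_like(bits, 0, 1 << 16)  # uniform over low 16 bits
    rounded = (bits + noise) & -65536           # clear low 16 bits (0xFFFF0000)
    return rounded.view(torch.float32).to(torch.bfloat16)
```

Because the float format is sign-magnitude, the same bit trick is unbiased for negative values as well. Averaging many stochastic roundings of a value that falls between two bfloat16 grid points recovers the original value, which is what lets tiny weight updates accumulate instead of being rounded away every step.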

To install:

pip install adamw-bf16

To use:

import torch
from adamw_bf16 import AdamWBF16

model = model.to(dtype=torch.bfloat16)
optimizer = AdamWBF16(model.parameters(), ...)

# Train your model
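For context, a full training step looks like any other PyTorch loop. The sketch below is runnable without the package installed: the model and layer sizes are illustrative, and torch.optim.AdamW stands in where AdamWBF16 would normally go.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in model; any nn.Module works the same way.
model = nn.Linear(16, 4).to(dtype=torch.bfloat16)

# AdamWBF16(model.parameters(), lr=1e-3) would be used here;
# torch.optim.AdamW is substituted so the sketch runs standalone.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

for _ in range(3):
    x = torch.randn(8, 16, dtype=torch.bfloat16)
    # Compute the loss in float32 for numerical stability.
    loss = model(x).float().pow(2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Note that no autocast context or GradScaler is needed: the weights, activations, and gradients are all bfloat16 end to end, which is the memory saving the package targets.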

This repository was created using code from two existing projects; it was found that combining insights from both was needed to match the performance of training with the model weights stored in float32.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

adamw_bf16-0.0.2.tar.gz (16.4 kB)

Uploaded Source

Built Distribution

adamw_bf16-0.0.2-py3-none-any.whl (16.9 kB)

Uploaded Python 3

File details

Details for the file adamw_bf16-0.0.2.tar.gz.

File metadata

  • Download URL: adamw_bf16-0.0.2.tar.gz
  • Upload date:
  • Size: 16.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.10.12

File hashes

Hashes for adamw_bf16-0.0.2.tar.gz
Algorithm Hash digest
SHA256 9f0951c6d87d92a6a80c7045c0c444710d6f05057021f8a49563d0a618185128
MD5 71f9be382097ed2e407faff1c9b1cbfe
BLAKE2b-256 49eee1a218194f04ec98abef3b3ad9e14cc7c81007d37a9570f9b2364c9c661e

See more details on using hashes here.

File details

Details for the file adamw_bf16-0.0.2-py3-none-any.whl.

File metadata

  • Download URL: adamw_bf16-0.0.2-py3-none-any.whl
  • Upload date:
  • Size: 16.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.10.12

File hashes

Hashes for adamw_bf16-0.0.2-py3-none-any.whl
Algorithm Hash digest
SHA256 7d54056eba69c15ec122a1215ba0b55b461d87d106b5a2c99ca942172402e244
MD5 db2a23978d6d97fdc64e3f9bcc9e7017
BLAKE2b-256 d351284cd3d33302bca9dc1a2a76e7acb6bb4868651989f264c3432e80affc85

See more details on using hashes here.
