
AdamW Optimizer for bfloat16

Project description

AdamW optimizer for bfloat16 in PyTorch

This is a version of the AdamW optimizer for PyTorch that, in ViT training tests, achieves the same results as keeping the weights in float32 with operations in float32 or bfloat16 (autocast). By keeping your weights in bfloat16, you use roughly half the memory the weights would otherwise occupy. The optimizer achieves this with stochastic rounding and a correction term.

There is a small (~10-20%) performance hit depending on your hardware.
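The package's own kernels are not reproduced here, but the core idea of stochastic rounding can be sketched in a few lines of PyTorch. The helper below is illustrative only (it is not part of the adamw-bf16 API) and ignores edge cases such as infinities:

import torch

def stochastic_round_to_bf16(x: torch.Tensor) -> torch.Tensor:
    # bfloat16 is float32 with the low 16 mantissa bits dropped. Adding a
    # uniform random 16-bit integer to the raw bits before truncating rounds
    # to the larger-magnitude neighbour with probability proportional to the
    # discarded fraction, so the rounding error averages out over many steps.
    assert x.dtype == torch.float32
    bits = x.view(torch.int32)                    # reinterpret the bit pattern
    noise = torch.randint_like(bits, 0, 1 << 16)  # uniform in [0, 2**16)
    rounded = (bits + noise) & -(1 << 16)         # keep only the high 16 bits
    return rounded.view(torch.float32).to(torch.bfloat16)

Plain round-to-nearest would silently drop any update much smaller than a weight's bfloat16 precision; stochastic rounding preserves those small updates in expectation, which is why bfloat16 weights can track a float32 run.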

To install:

pip install adamw-bf16

To use:

import torch

from adamw_bf16 import AdamWBF16

# Cast the model's weights to bfloat16 before constructing the optimizer.
model = model.to(dtype=torch.bfloat16)
optimizer = AdamWBF16(model.parameters(), ...)

# Train your model
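A fuller, runnable sketch is below. The model, data, and learning rate are placeholders, and the lr argument is assumed to follow the usual AdamW signature; only AdamWBF16 comes from this package:

import torch
import torch.nn as nn
import torch.nn.functional as F

from adamw_bf16 import AdamWBF16

device = "cuda" if torch.cuda.is_available() else "cpu"

# Placeholder model; any nn.Module works once its parameters are bfloat16.
model = nn.Linear(128, 10).to(device=device, dtype=torch.bfloat16)
optimizer = AdamWBF16(model.parameters(), lr=1e-4)  # lr assumed: standard AdamW-style argument

for step in range(100):
    x = torch.randn(32, 128, device=device, dtype=torch.bfloat16)
    target = torch.randint(0, 10, (32,), device=device)

    loss = F.cross_entropy(model(x), target)
    optimizer.zero_grad()
    loss.backward()   # gradients are bfloat16, like the weights
    optimizer.step()  # stochastic rounding is applied inside the update

Note that no GradScaler is involved: unlike float16, bfloat16 keeps the float32 exponent range, so loss scaling is not needed.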

This repository was built from code in two earlier projects; insights from both were combined to match the performance of training with the model weights stored in float32.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • adamw_bf16-0.0.3.tar.gz (16.4 kB, Source)

Built Distribution

  • adamw_bf16-0.0.3-py3-none-any.whl (16.8 kB, Python 3)

File details

Details for the file adamw_bf16-0.0.3.tar.gz.

File metadata

  • Download URL: adamw_bf16-0.0.3.tar.gz
  • Upload date:
  • Size: 16.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.10.12

File hashes

Hashes for adamw_bf16-0.0.3.tar.gz

  • SHA256: 826b046e23ebd34ab1fd0e8700ea4e3f419d0d66e6211b814cf120440796cc0a
  • MD5: 6514cf699d93d433eedb7a03434c9a7a
  • BLAKE2b-256: 2d2f5eb18ee927565eb420583a7b5c95779e196b3e14b958e98a1ee44bfe2b18

See more details on using hashes here.
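As a quick illustration (not part of this package), the downloaded sdist can be checked against the SHA256 digest above with Python's standard hashlib:

import hashlib

expected = "826b046e23ebd34ab1fd0e8700ea4e3f419d0d66e6211b814cf120440796cc0a"

# Path assumed to be the sdist as downloaded into the current directory.
with open("adamw_bf16-0.0.3.tar.gz", "rb") as f:
    digest = hashlib.sha256(f.read()).hexdigest()

assert digest == expected, "SHA256 mismatch: do not install this file"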

File details

Details for the file adamw_bf16-0.0.3-py3-none-any.whl.

File metadata

  • Download URL: adamw_bf16-0.0.3-py3-none-any.whl
  • Upload date:
  • Size: 16.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.10.12

File hashes

Hashes for adamw_bf16-0.0.3-py3-none-any.whl

  • SHA256: 6795187b69b92cc962d0b02471173b997e6fb15b9ffc36825e9de3a207bebeec
  • MD5: 8978d444b327d2c8600a6a452a5d5ea5
  • BLAKE2b-256: 7c4c195a2835e02f862b6fe4d7067142ce66cde47a45274840c4e1b4b81fcc18

See more details on using hashes here.
