
A library for unit scaling in PyTorch, based on the paper 'u-muP: The Unit-Scaled Maximal Update Parametrization.'


Unit-Scaled Maximal Update Parameterization (u-μP)


A library for unit scaling in PyTorch, based on the paper u-μP: The Unit-Scaled Maximal Update Parametrization and previous work Unit Scaling: Out-of-the-Box Low-Precision Training.

Documentation can be found at https://graphcore-research.github.io/unit-scaling and an example notebook at examples/demo.ipynb.

Note: The library is currently in its beta release. Some features have yet to be implemented and occasional bugs may be present. We're keen to help users with any problems they encounter.

Installation

To install the unit-scaling library, run:

pip install unit-scaling

or for a local editable install (i.e. one which uses the files in this repo), run:

pip install -e .
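To confirm that either install succeeded, the installed version can be queried from Python. This is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and is not part of unit-scaling.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "unit-scaling" is the distribution name used in the pip command above.
    version = installed_version("unit-scaling")
    if version is None:
        print("unit-scaling is not installed in this environment")
    else:
        print(f"unit-scaling {version} is installed")
```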

Development

For development in this repository, we recommend using the provided Docker container. The image can be built and entered interactively using:

docker build -t unit-scaling-dev:latest .
docker run -it --rm --user developer:developer -v "$(pwd)":/home/developer/unit-scaling unit-scaling-dev:latest
# To use git within the container, add `-v ~/.ssh:/home/developer/.ssh:ro -v ~/.gitconfig:/home/developer/.gitconfig:ro`.

For VS Code users, this repo also contains a .devcontainer.json file, which enables the container to be used as a full-featured IDE (see the Dev Container docs for details on how to use this feature).

Key development functionality is contained within the ./dev script. This includes running unit tests, linting, formatting, documentation generation and more. Run ./dev --help for the available options. Running ./dev without arguments is equivalent to using the --ci option, which runs all of the available dev checks. These are the same checks that GitHub CI runs.

We encourage pull requests from the community. Please reach out to us with any questions about contributing.

What is u-μP?

u-μP inserts scaling factors into the model to make activations, gradients and weights unit-scaled (RMS ≈ 1) at initialisation, and into optimiser learning rates to keep updates stable as models are scaled in width and depth. This results in hyperparameter transfer from small to large models and easy support for low-precision training.
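The effect of a scaling factor on activation scale can be illustrated without the library. The following is a minimal sketch in plain Python, not the library's API: it assumes unit-variance Gaussian inputs and weights, covers only the forward pass of a single linear layer, and uses a 1/sqrt(fan_in) factor as the illustrative scale. The names `rms`, `linear` and `demo` are hypothetical.

```python
import math
import random


def rms(values):
    """Root-mean-square of a flat list of floats."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def linear(x, weight, scale=1.0):
    """Compute scale * (weight @ x) for a vector x and a list-of-rows weight."""
    return [scale * sum(w * xi for w, xi in zip(row, x)) for row in weight]


def demo(fan_in=256, fan_out=256, seed=0):
    """Return the RMS of the input, the unscaled output and the scaled output."""
    rng = random.Random(seed)
    # Assumption: input and weights are drawn with unit variance (RMS close to 1).
    x = [rng.gauss(0.0, 1.0) for _ in range(fan_in)]
    weight = [[rng.gauss(0.0, 1.0) for _ in range(fan_in)] for _ in range(fan_out)]

    # Without a scaling factor, the output RMS grows roughly like sqrt(fan_in).
    unscaled = linear(x, weight)
    # With a 1/sqrt(fan_in) factor, the output RMS stays close to 1.
    scaled = linear(x, weight, scale=1.0 / math.sqrt(fan_in))
    return rms(x), rms(unscaled), rms(scaled)


if __name__ == "__main__":
    rms_in, rms_unscaled, rms_scaled = demo()
    print(f"input RMS:           {rms_in:.2f}")
    print(f"unscaled output RMS: {rms_unscaled:.2f}")
    print(f"scaled output RMS:   {rms_scaled:.2f}")
```

Increasing `fan_in` makes the unscaled RMS grow while the scaled RMS stays near 1, which is the property that keeps values within the range of low-precision formats. The library's handling of gradients and learning rates is not modelled here; see the paper and documentation for those rules.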

For a quick intro, see examples/demo.ipynb; for more depth, see the paper and the library documentation.

What is unit scaling?

For a demonstration of the library and an overview of how it works, see Out-of-the-Box FP8 Training (a notebook showing how to unit-scale the nanoGPT model).

For a more in-depth explanation, consult our paper Unit Scaling: Out-of-the-Box Low-Precision Training.

And for a practical introduction to using the library, see our User Guide.

License

Copyright (c) 2023 Graphcore Ltd. Licensed under the Apache 2.0 License.

See NOTICE.md for further details.
