Project description

Mechanic: black-box tuning of optimizers



Based on the paper "Mechanic: A Learning Rate Tuner": https://arxiv.org/abs/2306.00144

Be aware that all experiments reported in the paper were run using the JAX version of mechanic, which is available in optax via optax.contrib.mechanize.
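
For reference, a minimal sketch of the optax usage might look like the following (the SGD base optimizer, toy parameters, and quadratic loss are illustrative choices, not part of mechanic itself):

import jax
import jax.numpy as jnp
import optax

# Wrap any optax optimizer; mechanic then learns the learning-rate scale.
opt = optax.contrib.mechanize(optax.sgd(learning_rate=1.0))

params = {"w": jnp.ones(3)}
opt_state = opt.init(params)

# One illustrative step on a toy quadratic loss.
grads = jax.grad(lambda p: jnp.sum(p["w"] ** 2))(params)
updates, opt_state = opt.update(grads, opt_state, params)
params = optax.apply_updates(params, updates)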

Mechanic aims to remove the need for tuning a learning-rate scalar (i.e., the maximum learning rate in a schedule). You can use it with any PyTorch optimizer and schedule. Simply replace:

optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

with:

from mechanic_pytorch import mechanize
optimizer = mechanize(torch.optim.SGD)(model.parameters(), lr=1.0)
# you can set the lr to anything here, but excessively small values may cause numerical precision issues.

That's it! The new optimizer should no longer require tuning the learning rate scale! That is, the optimizer should now be very robust to heavily mis-specified values of lr.
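
To make the workflow concrete, here is a minimal end-to-end sketch; the model, random data, and cosine schedule are illustrative stand-ins, since mechanic is meant to work with any optimizer/schedule combination:

import torch
from mechanic_pytorch import mechanize

model = torch.nn.Linear(10, 1)

# Wrap the optimizer class, then construct it exactly as usual.
optimizer = mechanize(torch.optim.SGD)(model.parameters(), lr=1.0)

# Any schedule can sit on top; mechanic handles the overall scale.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=100)

for step in range(100):
    x, y = torch.randn(32, 10), torch.randn(32, 1)
    loss = torch.nn.functional.mse_loss(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()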

Installation

pip install mechanic-pytorch

Note that the package name is mechanic-pytorch, but you should import mechanic_pytorch (dash replaced with underscore).

Options

It is possible to play with the configuration of mechanic, although this should be unnecessary:

optimizer = mechanize(torch.optim.SGD, s_decay=0.0, betas=(0.999,0.999999), store_delta=False)(model.parameters(), lr=0.01)
  • The option store_delta=False is set to minimize memory usage. At minimum, we currently keep one extra "slot" of memory (i.e., an extra copy of the weights). If you are OK with keeping one more copy, you can set store_delta=True. This makes the update slightly more accurate in the first few iterations, and usually has a negligible effect.
  • The option s_decay acts a bit like a weight-decay term and is empirically helpful on smaller datasets. We used the default of 0.01 in all our experiments. For larger datasets, smaller values (even 0.0) often worked just as well.
  • The option betas is a list of exponential weighting factors used internally by mechanic. They are NOT related to the beta values found in Adam. In theory, it should be safe to provide a large list of possibilities here. The default setting of (0.9, 0.99, 0.999, 0.9999, 0.99999, 0.999999) seems to work well on a range of tasks.
  • s_init is the initial value for the mechanic learning rate (see the snippet after this list). It should be an underestimate of the correct learning rate, and it can safely be set to a very small value (default 1e-8), although it cannot be set to zero. In particular, the theoretical analysis of mechanic includes a log(1/s_init) term: this is very robust to small values, but will eventually blow up if you make s_init absurdly small.
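
As an illustration of the s_init option referenced above (the AdamW base optimizer and the particular value are arbitrary choices for the example):

import torch
from mechanic_pytorch import mechanize

model = torch.nn.Linear(10, 1)

# A small underestimate of the true learning rate; mechanic grows it on its own.
# Do not set it to zero: the analysis contains a log(1/s_init) term.
optimizer = mechanize(torch.optim.AdamW, s_init=1e-6)(model.parameters(), lr=1.0)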

License

mechanic is distributed under the terms of the Apache-2.0 license.

Download files

Download the file for your platform.

Source Distribution

mechanic_pytorch-0.0.1.tar.gz (17.1 MB)

Built Distribution

mechanic_pytorch-0.0.1-py3-none-any.whl (10.6 kB)

File details

Details for the file mechanic_pytorch-0.0.1.tar.gz.

File metadata

  • Download URL: mechanic_pytorch-0.0.1.tar.gz
  • Size: 17.1 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.12

File hashes

Hashes for mechanic_pytorch-0.0.1.tar.gz
  • SHA256: 10e16464c4764ce4e4ade92dd24eed985d5fc5e5cc664858b719ed9fe002a25d
  • MD5: 2f110839be85df84e4f28752c87621e5
  • BLAKE2b-256: 6dc30b2fea755817314598e57eff009abb1a30680dd9c4636f5e13be46b7c776


File details

Details for the file mechanic_pytorch-0.0.1-py3-none-any.whl.

File hashes

Hashes for mechanic_pytorch-0.0.1-py3-none-any.whl
  • SHA256: e1fd7bf2953c1ba6c7ef3bee371740159647015113eaf033847456774052f299
  • MD5: 46b698080846250ca6e7f96637cee19b
  • BLAKE2b-256: 14ac9d33a19cb6b99b7b8c10c62b37f3944377cd0a147848e288dd26ee9416df

