Efficient asynchronous optimization of expensive black-box functions on top of Apache Spark

Project description

Maggy is a framework for efficient asynchronous optimization of expensive black-box functions on top of Apache Spark. Compared to existing frameworks, maggy is not bound to stage-based optimization algorithms and can therefore make extensive use of early stopping to achieve efficient resource utilization.

Right now, maggy supports asynchronous hyperparameter tuning of machine learning and deep learning models, and ablation studies on neural network layers as well as input features.

Moreover, it provides a developer API that enables advanced users to implement custom optimization algorithms and early-stopping criteria.

To accommodate asynchronous algorithms, maggy adds support for communication between the Driver and Executors via RPCs. The Optimizer that guides the hyperparameter search runs on the Driver and assigns trials to the Executors. Executors periodically report the current performance of their trial back to the Driver, and the Optimizer can decide to early-stop any ongoing trial and send the Executor a new trial instead.
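
The sketch below is a toy, single-process illustration of this flow, not maggy's actual RPC machinery; every name in it is made up for the example. It mimics an optimizer suggesting trials, an executor reporting a running metric after each step, and a trial being stopped early once it can no longer beat the best result seen so far.

>>> import random
>>> # Hypothetical stand-in for the driver-side optimizer suggesting a trial:
>>> def suggest_trial():
>>>     return {'dropout': random.uniform(0.01, 0.99)}
>>> def run_trial(trial, best_so_far, steps=10):
>>>     metric = 0.0
>>>     for step in range(steps):
>>>         metric += random.uniform(0.0, trial['dropout'])  # fake training progress
>>>         # "Heartbeat": the executor reports its running metric; the driver
>>>         # early-stops a trial whose best possible outcome trails the leader.
>>>         if metric + (steps - step - 1) * trial['dropout'] < best_so_far:
>>>             return metric  # early-stopped, the executor gets a new trial
>>>     return metric
>>> best = 0.0
>>> for _ in range(15):  # trials run sequentially here for clarity
>>>     best = max(best, run_trial(suggest_trial(), best))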

Quick Start

To install:

>>> pip install maggy

The programming model consists of wrapping the code containing your model training inside a function. Inside that wrapper function, provide all imports and everything else that makes up your experiment.

There are three requirements for this wrapper function:

  1. The function should take the hyperparameters as arguments, plus one additional parameter, reporter, which is needed for reporting the current metric to the experiment driver.

  2. The function should return the metric that you want to optimize for. This should coincide with the metric being reported in the Keras callback (see next point).

  3. In order to leverage the early stopping capabilities of maggy, you need to make use of the maggy reporter API. By including the reporter in your training loop, you tell maggy which metric to report back to the experiment driver for optimization and for checking the global stopping criterion. It is as easy as adding reporter.broadcast(metric=YOUR_METRIC), for example at the end of your epoch or batch training step, and adding a reporter argument to your function signature. If you are not writing your own training loop, you can use the pre-written Keras callbacks in the maggy.callbacks module (see the sketch after this list).
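
For instance, with a Keras model the callback can report the training accuracy at the end of every batch. The snippet below is a sketch: the KerasBatchEnd name and its metric argument are assumed from maggy's examples (check the maggy.callbacks module of your version for the exact API), and build_model, x_train, y_train, x_test and y_test are hypothetical placeholders.

>>> from maggy.callbacks import KerasBatchEnd
>>> def mnist(kernel, pool, dropout, reporter):
>>>     model = build_model(kernel, pool, dropout)  # hypothetical helper
>>>     # The callback invokes the reporter for you after every batch:
>>>     model.fit(x_train, y_train, epochs=5,
>>>               callbacks=[KerasBatchEnd(reporter, metric='acc')])
>>>     # Return the final value of the metric being optimized:
>>>     return model.evaluate(x_test, y_test)[1]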

Sample usage:

>>> # Define Searchspace
>>> from maggy import Searchspace
>>> # The searchspace can be instantiated with parameters
>>> sp = Searchspace(kernel=('INTEGER', [2, 8]), pool=('INTEGER', [2, 8]))
>>> # Or additional parameters can be added one by one
>>> sp.add('dropout', ('DOUBLE', [0.01, 0.99]))
>>> # Define training wrapper function:
>>> def mnist(kernel, pool, dropout, reporter):
>>>     # This is your training iteration loop
>>>     for i in range(number_iterations):
>>>         ...
>>>         # add the maggy reporter to report the metric to be optimized
>>>         reporter.broadcast(metric=accuracy)
>>>         ...
>>>     # Return the final value of the reported metric
>>>     return accuracy
>>> # Launch maggy experiment
>>> from maggy import experiment
>>> result = experiment.lagom(map_fun=mnist,
>>>                           searchspace=sp,
>>>                           optimizer='randomsearch',
>>>                           direction='max',
>>>                           num_trials=15,
>>>                           name='MNIST')

lagom is a Swedish word meaning “just the right amount”. This is how maggy uses your resources.

MNIST Example

For a full MNIST example with random search using Keras, see the Jupyter Notebook in the examples folder.

Documentation

Read our blog post for more details.

API documentation is available here.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

maggy-0.3.2.tar.gz (35.3 kB)

Uploaded Source

Built Distribution

maggy-0.3.2-py3-none-any.whl (56.8 kB)

Uploaded Python 3

File details

Details for the file maggy-0.3.2.tar.gz.

File metadata

  • Download URL: maggy-0.3.2.tar.gz
  • Upload date:
  • Size: 35.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.6.8

File hashes

Hashes for maggy-0.3.2.tar.gz

  • SHA256: 8ca3f1e96cd5fa9aaf216983ef652ccb75614e1fc691356a6ab55a607e039082
  • MD5: 29742922bb98874a7f65bb20703feff0
  • BLAKE2b-256: 05687914d74fbce4ad4e48fd0d8292a2b9c94c0f70048d5e97862997473be552

See more details on using hashes here.
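
To verify a download yourself, a minimal check of the file against the SHA256 digest listed above (assuming the file sits in the current directory):

>>> import hashlib
>>> with open('maggy-0.3.2.tar.gz', 'rb') as f:
>>>     digest = hashlib.sha256(f.read()).hexdigest()
>>> assert digest == '8ca3f1e96cd5fa9aaf216983ef652ccb75614e1fc691356a6ab55a607e039082'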

File details

Details for the file maggy-0.3.2-py3-none-any.whl.

File metadata

  • Download URL: maggy-0.3.2-py3-none-any.whl
  • Upload date:
  • Size: 56.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.6.8

File hashes

Hashes for maggy-0.3.2-py3-none-any.whl

  • SHA256: a9219aa55d38c5b694b350c31ef797fa2d0e9e138e571caaecea9cd216fa3315
  • MD5: bda334f3df7a2a8d9c914779140e7df5
  • BLAKE2b-256: 5492fb853348295eb09e6688ff05a46121238dd1c7dd3a0677ab1f0302e96e7e

See more details on using hashes here.
