labs - a dask-distributed experiments manager

Introduction

Labs helps you define (via a config file), execute (scaled with Dask) and save (artifacts, results, metadata) experiments. Its main purpose is running ML experiments, but it can be used for other use cases as well.

Labs uses Dask's lazy delayed API for distributed computation. scikit-learn is also used in the [Searcher] module.
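As a rough illustration of the lazy pattern mentioned above, the sketch below wraps a function with Dask's delayed and only executes it when compute is called. It is not taken from the Labs code base: run_experiment, the experiments list and the scoring formula are hypothetical names invented for this example, and the synchronous scheduler is chosen only to keep the example simple.

```python
from dask import compute, delayed


def run_experiment(params):
    """Hypothetical experiment design: score one hyperparameter combination."""
    return -((params["x"] - 3) ** 2)


# Hypothetical experiments: one dict of hyperparameters per experiment.
experiments = [{"x": x} for x in range(6)]

# Wrapping the call with delayed() only builds a task graph; nothing runs yet.
tasks = [delayed(run_experiment)(params) for params in experiments]

# compute() triggers execution and returns a tuple of results in task order.
# The synchronous scheduler runs everything in the current thread.
scores = compute(*tasks, scheduler="synchronous")

# Pick the experiment with the highest score.
best_score, best_params = max(zip(scores, experiments), key=lambda pair: pair[0])
print(best_params, best_score)
```

The same task list could be handed to a different Dask scheduler without changing run_experiment, which is the main appeal of the delayed style.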

Disclaimer: Labs is currently experimental and for my own personal use.

Key concepts:

  • [Experiment Design] - a user-defined experiment. An [Experiment Design] is expressed as a function, which is executed by one or more [Experimenters].

  • [Experiment] - a single combination of hyperparameters to be tested when running an [Experiment Design].

  • [Experiment Run] - the execution of numerous [Experiments], driven by an [Experiment Configuration] and an [Experiment Design]. An [Experiment Run] outputs the best [Experiment] (the best hyperparameter combination).

  • [Experiment Configuration] - a set of configurations that defines the [Experiments] to be executed in an [Experiment Run].

  • [LabManager] - runs all the [Experiment Configurations] defined in a config file. A single [LabManager] can handle multiple [Experiment Configurations] and [Experiment Designs].

  • [Experimenter] - an entity which performs the tuning/experimenting process.

  • [Searcher] - an entity used by an [Experimenter] to create the [Experiments] in an [Experiment Run]. Example Searchers: Grid Search, Random Sampling, Bayesian Search (with the great skopt package). The [Searcher] uses the search space defined in the [Experiment Configuration].
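To make the relationship between these concepts more concrete, here is a minimal standard-library sketch. It does not use or reproduce the Labs API: grid_search, random_sampling, experiment_design, experiment_run and the space dictionary are all hypothetical names chosen for illustration, and the objective is a toy formula rather than a real model.

```python
import itertools
import random


def grid_search(space):
    """Hypothetical Searcher: yield every combination in the search space."""
    names = list(space)
    for values in itertools.product(*(space[name] for name in names)):
        yield dict(zip(names, values))


def random_sampling(space, n_samples, seed=0):
    """Hypothetical Searcher: yield n_samples randomly drawn combinations."""
    rng = random.Random(seed)
    for _ in range(n_samples):
        yield {name: rng.choice(values) for name, values in space.items()}


def experiment_design(params):
    """Hypothetical Experiment Design: a toy loss where lower is better."""
    return (params["lr"] - 0.1) ** 2 + (params["depth"] - 4) ** 2


def experiment_run(design, experiments):
    """Run every Experiment and return (loss, params) of the best one."""
    results = [(design(params), params) for params in experiments]
    return min(results, key=lambda result: result[0])


# Stand-in for the search space an Experiment Configuration would define.
space = {"lr": [0.01, 0.1, 1.0], "depth": [2, 4, 8]}

best_loss, best_params = experiment_run(experiment_design, grid_search(space))
print(best_params, best_loss)

# Random sampling tries only a subset of the space.
sampled = list(random_sampling(space, n_samples=5))
print(len(sampled))
```

Swapping grid_search for random_sampling changes which [Experiments] are generated without touching the design function, which mirrors the separation between [Searcher] and [Experiment Design] described in the list above.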

1. Installation process

pip install labs

2. Docs

(The documentation is not complete yet.)

  1. Quick Start
  2. Experimenters
  3. Searchers
  4. LabManager
  5. Live Reporting
  6. Configs
  7. Suggested Steps

3. Future

The project is still very new and incomplete.

It needs more development to support additional distributed computation options. The plan is to build on Dask's rich, mature ecosystem so that these options can be developed simply and quickly.

Future developments:

  • pytest testing.
  • Flow options - checkpoint saving, time caps, delta improvement and more.
  • Docker support.
  • Kubernetes support.
  • Options for saving experiment artifacts to cloud storage.
  • MLflow interaction.

