Tracking, queueing and distributed execution of ML/DL experiments. Helping define and semi-automatically explore hyper-parameter spaces.

Hyperspace explorer

Collection of tools meant to enable faster progress and reproducibility of results in ML/DL projects and competitions, without assuming too much about the projects themselves or the tools used.

Introduction

What is it useful for?

Meant to support a process where we have:

  • a dataset, possibly pre-processed to some degree, that can be loaded deterministically (if multiple versions of the data exist, the loading code should take an argument specifying which version to load),
  • an objective - we only consider supervised ML problems here, and we need some quality metric (be it accuracy or AUROC for classification, RMSE for regression, or something else). The metric can be changed later, and multiple metrics are supported.

A common, useful approach is then to quickly develop a simple version of the full pipeline (see the sketch after this list):

  1. loading the data
  2. pre-processing, data augmentation, etc.
  3. the model logic (if not a ready-made solution - perhaps a custom neural architecture?)
  4. model training procedure
  5. scoring the model on the validation set
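
As an illustration - independent of this package - such a first iteration can fit in a dozen lines; the sketch below uses scikit-learn and a toy dataset purely as an example:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)                   # 1. load the data
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)
scaler = StandardScaler().fit(X_tr)                 # 2. pre-processing
X_tr, X_val = scaler.transform(X_tr), scaler.transform(X_val)
model = LogisticRegression(max_iter=200)            # 3. model logic
model.fit(X_tr, y_tr)                               # 4. training procedure
print(accuracy_score(y_val, model.predict(X_val)))  # 5. score on the validation set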

Then the real work begins - multiple iterations of:

  1. developing and testing new versions of parts of the pipeline - adding dimensions to the hyper-parameter space (e.g. different neural architecture, different kind of model altogether, different pre-trained network, different data pre-processing logic),
  2. optimizing hyper-parameters,
  3. inspecting how/when the model fails, and how it works - to guide further development,
  4. (in some cases) periodically pushing new versions to production.

This package is meant to provide common utilities supporting workflows like these, regardless of the frameworks/libraries in use or the structure of the problem being solved. It was born from the common parts of multiple projects, often ones with a "weird" structure - e.g.:

  • fitting multiple models with identical hyper-parameters to time-series from corresponding machines,
  • periodic re-training of the model (also for time-series),
  • testing the same approach on multiple subsets of the training set.

The strength of the package lies in its modularity and its minimal assumptions about what you need to do and how you need to do it. It is meant to get out of your way - your code should still work without it, run the way you need - be it from the CLI, a notebook, etc.

Goals

It should be possible to:

  • store and analyze results of all past experiments,
  • reproduce past results (-> store results together with hyper-parameters, data, versions of code, ...),
  • easily re-run past experiments with some hyper-parameters changed,
  • develop or inspect parts of the pipeline in a notebook, easily run all of it from there (TODO: instructions),
  • extend the code's functionality and the hyper-parameter space; new hyper-params should always be added with default values matching the previous behaviour, and replacing pieces of logic with alternatives that take different hyper-parameters should be handled as well,
  • queue runs from a notebook (e.g. generate a grid of HPs in some dimensions) and have them executed by worker processes (possibly distributed among multiple machines). Then close the notebook, or add some more runs, never again waiting for computations to finish.
  • install the core parts of the project's package and use them in production code (if the code has grown ugly with too many options, write a new version of it and compare results with the old one, all in the same environment),
  • use automatic hyper-parameter tuning algorithms, informed by all past experiments during development.

Glossary

  • Run - single execution of a Task with a specific config (set of parameters), concluded by calculation of a quality metric.
    Example: training a classifier with the hyper-parameters provided in the config, calculating accuracy on the validation set.
    Each run gets its own entry in the selected MongoDB database, in the runs collection. It includes all parameters, code versions, result metrics, captured output, and possibly custom diagnostic information.
  • Task - fully specified problem to solve, and then compare different solutions to.
    Consists of a chosen Scenario class and parameter values for its constructor. Each task should be specified as a .json file and include the keys: Scenario (= a dictionary of its settings, starting with className) and optionally seed - see the example after this list.
  • Scenario - template of a Task without its parameters. When parameterized by Task parameters and a Run config it can be executed, and should return a quality metric. Each Scenario is a class, inheriting from scenario_base.Scenario. Its .single_run() method typically contains dataset construction, model construction, model training and evaluation.
  • Study - a project in which we define one or more Scenarios to study solutions for, with a shared codebase, database, and virtual environment. Corresponds to a Python package, which has a dependency on hyperspace_explorer and includes a scenarios module.
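
To make the Task format concrete: a task file, e.g. tasks/imdb_1k_sample_single.json (the task name used in the queue example further below), might look like the following - the className and the settings inside Scenario here are purely illustrative:

{
    "Scenario": {
        "className": "ExampleScenario",
        "dataset_version": "small"
    },
    "seed": 42
}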

Setup

Installation

Install with pip, as a normal Python package:
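
pip install hyperspace_explorer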

Besides making all the modules available for import, installation will also make hyperspace_worker.py available on your system's PATH (or within the specific virtual environment).

To use most of this package's functionality, a running instance of MongoDB is needed.

Project structure

TODO

Adding new parameters, default values

As we add new parameters to our code (our classes), we must provide default values that ensure behaviour consistent with the earlier versions of the code.

To achieve that, default values (they do not have to be optimal/recommended ones!) live inside the classes. They must not be set in function signatures - instead, they should be provided by the .get_default_values() method of each Configurable. If using dataclasses, that method is provided automatically.
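
A rough sketch of this pattern, with a plain dataclass standing in for a real Configurable (the class name and fields below are hypothetical, and the hand-rolled get_default_values() merely imitates what the package derives automatically for dataclasses):

from dataclasses import dataclass, asdict

@dataclass
class TrainingSchedule:  # in a real Study this would inherit from Configurable
    lr: float = 1e-3        # existing hyper-parameter
    warmup_epochs: int = 0  # newly added - default 0 reproduces the old behaviour

    @classmethod
    def get_default_values(cls) -> dict:
        # stand-in for the method the package provides for dataclasses
        return asdict(cls())

print(TrainingSchedule.get_default_values())  # {'lr': 0.001, 'warmup_epochs': 0}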

When a worker process starts to process a Run, it fills in all the default values, and they end up stored in the database. This is redundant, but it makes result analysis much easier - analysis tools do not have to be aware of the Study-specific codebase.

When expanding a Scenario to fit more Tasks and adding parameters to it, default values should similarly be provided for all of them.

A tool for back-filling new default values into past runs (while keeping the original config in a different field as a backup?) would be useful - future development.

Usage

Running a worker

Run the command hyperspace_worker.py [path to tasks dir] [mongo db name] - an example invocation follows the argument list below.

Important: it has to be run from a directory containing a scenarios.py module, which defines the experiment scenarios allowed to be run within the given project. The hyperspace_worker.py file itself should not be present in that folder.

Arguments:

  • path to tasks dir - path to a directory containing .json files, describing each allowed task - a parameterization of a scenario
  • mongo db name - name of the MongoDB database to store results in
  • optional params: MongoDB URI (if not localhost, or if a password is required), interval to query for new tasks
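
For example (the directory and database name are illustrative, matching the queue example below):

hyperspace_worker.py ./tasks ulmfit_attention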

Browsing experiment results

This project (ab)uses Sacred to collect and store information about each run.

One of the benefits: we can use the many ready-made dashboards built for Sacred, e.g. Omniboard - highly recommended; it works out of the box and has many impressive features.

Possible access points, usage modes

CLI

TODO

Run queue + workers

Start workers on one or more nodes and set them up to use the same database (which also serves as the task queue). Workers are specific to one Study (project) - they will only process tasks for the Study they were started for.

Example code, usually run from a notebook, to submit one task. From here it is easy to e.g. submit a grid of hyper-parameters for the workers to test.

from hyperspace_explorer.queue import RunQueue
from pathlib import Path

tasks_dir = Path.cwd().resolve().parent / 'tasks'  # just an example - relative to the notebook
db_name = 'ulmfit_attention'
mongo_uri = 'localhost:27017'

q = RunQueue(mongo_uri, db_name, tasks_dir)
task_name = 'imdb_1k_sample_single'

conf = {
    'aggregation': {  # different additional parameters are available depending on `className`
        'className': 'BranchingAttentionAggregation',
        'agg_layers': [50, 10]
    },
    'classifier': { # this dict is passed to a specific function within a scenario, but polymorphism is not needed
        'lin_ftrs': [],
        'drop_mult': 0.5,
    },
    'training_schedule': { # even if we do not want to change any default parameters, className is required
        'className': 'DefaultSchedule',
    }
}
q.submit(task_name, conf)

The code above works with the project https://github.com/tpietruszka/ulmfit_attention. In this case, workers should be run from within the inner ulmfit_attention directory.
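
Submitting a small grid is then just a loop - a sketch reusing q, conf and task_name from the example above, with illustrative values:

from copy import deepcopy

for layers in ([50, 10], [100, 20], [200, 50, 10]):  # illustrative grid of agg_layers
    c = deepcopy(conf)
    c['aggregation']['agg_layers'] = layers
    q.submit(task_name, c)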

Interactive prototyping in Jupyter

TODO

Running tests

TODO
