Combining papermill and wandb sweeps for frictionless experiments with Notebooks

# Papersweep

> Combining papermill and wandb sweeps for frictionless experiments with notebooks.

I often find myself coding a machine learning experiment in a Jupyter Notebook, using [wandb](https://www.wandb.com/) to visualize and track the results of the runs. When the experiment is drafted, I always have questions such as: How will the performance be affected by the parameter a? What if I change the number of items of the dataset, or change the dataset completely?

[wandb sweeps](https://docs.wandb.com/sweeps) are a great tool for answering these questions. However, a sweep requires a dedicated training function, which I find redundant, especially when the training code already lives in the Jupyter Notebook. Furthermore, whenever I change the original notebook, I have to make sure I change the sweep function too.

This library provides a single command, `papersweep`, which uses [papermill](https://github.com/nteract/papermill) to execute a notebook as the function of a wandb sweep. The only thing that has to change in the notebook is the way the config parameters are declared in `wandb.config`.

As an example, if `a` is a parameter in your notebook declared as:

```python
wandb.config.a = 3
```

Just changing that line to:

```python
wandb.config.a = ifnone(wandb.config.get('a'), 3)
```

will use the default value 3 when the notebook is executed as a standalone run (i.e., without a sweep), and the value injected from the sweep configuration when it is executed as a sweep function. This provides a frictionless way of using your Jupyter Notebooks both for single runs and as sweep functions.
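The fallback logic can be tried without wandb installed. The sketch below is only an illustration: a plain `dict` stands in for `wandb.config`, `ifnone` is a local stand-in written here (assumed to behave like the helper used in the notebook, returning the default only when the value is `None`), and `resolve_param` is a hypothetical name.

```python
def ifnone(value, default):
    """Return `default` only when `value` is None; otherwise return `value`."""
    return default if value is None else value


def resolve_param(config, name, default):
    """Look up `name` in a dict-like config, falling back to `default`."""
    return ifnone(config.get(name), default)


# Standalone run: nothing was injected, so the notebook default is used.
standalone_config = {}
a_standalone = resolve_param(standalone_config, "a", 3)

# Sweep run: the sweep injected a value for `a`, so it wins over the default.
sweep_config = {"a": 7}
a_sweep = resolve_param(sweep_config, "a", 3)

# A falsy but valid value such as 0 is kept, because only None triggers the default.
a_zero = resolve_param({"a": 0}, "a", 3)

print(a_standalone, a_sweep, a_zero)
```

Checking against `None` rather than writing `config.get('a') or 3` matters when a sweep legitimately tries values such as `0` or `False`.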

## Install

pip install papersweep

## How to use

`--help` provides command help.


`input_nb` is the path to the notebook with the experiment you want to use as the function of the sweep.

`sweep_config` is the path to a YAML file with the configuration of the sweep. An example is given in `examples/sweep_config.yaml`. More information about sweep configurations can be found in the [official docs](https://docs.wandb.com/sweeps/configuration).

`pm_params` is a YAML file with extra configuration for the notebook execution, aside from the sweep parameters. Those parameters are injected into the notebook by papermill, so they need to be declared in one cell tagged as `parameters` (see the papermill documentation for how to tag a cell in a Jupyter Notebook).
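For reference, this is roughly how papermill itself injects parameters through its Python API. It is a minimal sketch, not papersweep's implementation: `run_notebook`, the parameter names, and the notebook file names are hypothetical, while `papermill.execute_notebook(input_path, output_path, parameters=...)` is papermill's documented entry point.

```python
def run_notebook(input_nb, output_nb, pm_params):
    """Execute `input_nb` with papermill, overriding its tagged parameters."""
    # Imported lazily so this module can be loaded without papermill installed.
    import papermill as pm

    # papermill adds a new cell right after the one tagged `parameters`,
    # assigning each key of `pm_params` and so overriding the defaults there.
    return pm.execute_notebook(input_nb, output_nb, parameters=pm_params)


# Hypothetical extra parameters, as they might be loaded from a pm_params YAML file.
example_pm_params = {"n_epochs": 5, "verbose": True}
```

A call such as `run_notebook("experiment.ipynb", "experiment_out.ipynb", example_pm_params)` would write the executed copy to the second path; names that do not appear in the tagged cell are still injected, but papermill may warn about them.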

`sweep_id` lets you reuse an existing sweep instead of creating a new one.

## Example with time series classification

The notebook `_example_tsai.ipynb` trains a time series classifier using deep learning with the library [tsai](https://github.com/timeseriesAI/tsai). The dataset (`dsid`) and the deep learning architecture (`arch`) are part of the `wandb.config` configuration parameters.

The file `examples/sweep_config.yaml` defines a grid-like experiment in which multiple datasets and architectures are tried, looking for the combination that achieves the best accuracy.
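To get a feel for what a grid over `dsid` and `arch` means, the sketch below writes a sweep configuration as a Python dict and enumerates the combinations a grid search would visit. It is not the content of `examples/sweep_config.yaml`: the dataset and architecture names and the metric name are placeholders, and `grid_combinations` is a hypothetical helper. Only the top-level keys (`method`, `metric`, `parameters`) follow the wandb sweep configuration format.

```python
from itertools import product

# Placeholder sweep configuration; the values are illustrative only.
sweep_config = {
    "method": "grid",
    "metric": {"name": "accuracy", "goal": "maximize"},
    "parameters": {
        "dsid": {"values": ["dataset_a", "dataset_b"]},
        "arch": {"values": ["arch_x", "arch_y", "arch_z"]},
    },
}


def grid_combinations(config):
    """List every parameter combination a grid search over `config` would try."""
    params = config["parameters"]
    names = list(params)
    value_lists = [params[name]["values"] for name in names]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


combos = grid_combinations(sweep_config)
for combo in combos:
    print(combo)
```

With two datasets and three architectures the grid has six runs, so the notebook would be executed six times.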

Running the following command in a terminal:

papersweep --input_nb ./_example_tsai.ipynb --sweep_config ./examples/sweep_config.yaml --entity vrodriguezf --project papersweep

will run the notebook `./_example_tsai.ipynb` once for every iteration of the sweep, and log the results [in a dashboard](https://wandb.ai/vrodriguezf/papersweep/sweeps/qh09r37b?workspace=user-vrodriguezf) that you can explore interactively.
