
NBX: Notebook Experiments for OpenMind

This module lets you quickly convert your Jupyter notebook into a bundle of files that can be run on BCS' OpenMind.

Getting started

Prerequisites

Install the package:

  • pip install nbx

Get a Singularity image

You'll need an image that has the package installed (there are ways around that, but I am keeping it simple at the moment). Here's how you can build an image:

module load openmind/singularity/3.2.0
export SINGULARITY_CACHEDIR="/om2/user/{your_user_name}/.singularity"
singularity build pytorch.simg docker://mklukas/pytorch

Environment variables

For the modules to work you have to set the environment variables om, omx, omsimg, and omid:

  • om: your login to OpenMind.
    • You need to enable logging into OpenMind using public key authentication. That means the command ssh $om should log you in without asking for a password. (Searching for "ssh public key authentication" will turn up a recipe.)
  • omx: path to the folder where nbx bundles are stored. This path will automatically be added to your Python path. Any modules that are neither part of your bundle's src/ folder nor included in your Singularity container should go here.
  • omsimg: path to the folder containing your singularity images
  • omid: your OpenMind username

Mac users can adapt the following lines and copy them to their .bash_profile file:

export om={your_user_name}@openmind7.mit.edu
export omid={your_user_name}
export omx=/om2/user/{your_user_name}/nbx-experiments
export omsimg=/om2/user/{your_user_name}/simg
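Before creating a bundle it can help to confirm that all four variables are visible to Python. The following is a minimal sketch, not part of nbx; the helper name missing_nbx_vars is made up for illustration.

```python
import os

# Variable names taken from the list above.
REQUIRED_VARS = ("om", "omx", "omsimg", "omid")


def missing_nbx_vars(env=None):
    """Return the required variable names that are unset or empty in `env`.

    Hypothetical helper for illustration; nbx itself may check this differently.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_nbx_vars()
if missing:
    print("Please set:", ", ".join(missing))
else:
    print("All nbx environment variables are set.")
```

If the check reports missing variables right after editing .bash_profile, open a new terminal (or source the file) and restart the notebook server so the new values are picked up.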

Usage

  • Put your python scripts that you wrote for this experiment in a folder called src. The folder will be copied to the bundle so the scripts are available on OpenMind as well.
  • #nbx: Each cell that contains a #nbx tag in its first line will be considered part of the experiment.
  • #xarg: Putting #xarg above a variable declaration makes this variable explicit: it becomes an argument of the experiment function. Any iterable to the right of the variable declaration, separated from it by a semicolon, is treated as the domain that is swept during the parameter sweep.
  • Each nbx experiment has to declare the variables task_id and results_dir. Both are set by the wrapper script: task_id enumerates the configurations of the parameter space, and results_dir is replaced by the folder that is automatically created for a specific parameter configuration.
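To make the #xarg convention concrete, here is a rough sketch of how a line such as x=0; range(5) could be split into a name, a default value, and a sweep domain. This is an assumption about the convention, not nbx's actual parser, and parse_xarg is a hypothetical name.

```python
def parse_xarg(line):
    """Split 'name = default; domain' into (name, default_src, domain_src).

    All three parts are returned as source strings; domain_src is None when
    nothing follows the semicolon. Illustrative only: a default value that
    itself contains ';' is not handled.
    """
    declaration, _, domain = line.partition(";")
    name, sep, default = declaration.partition("=")
    if not sep:
        raise ValueError(f"not a variable declaration: {line!r}")
    return name.strip(), default.strip(), domain.strip() or None


print(parse_xarg("x=0; range(5)"))
print(parse_xarg('results_dir = "."'))
```

Under this reading, a variable with a domain is swept, while a variable without one (like task_id or results_dir) simply becomes an argument with a default.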

Example

Experiment

In every experiment we need to indicate which cells are part of it (using the #nbx flag), and need to specify these two arguments:

  • task_id
  • results_dir

%matplotlib inline
%load_ext autoreload
%autoreload 2

#nbx
#xarg
task_id = 0
#xarg
results_dir = "."

This cell will be part of the experiment

#nbx

#xarg
x=0; range(5)

#xarg
y=0; [0,1,2,4]

z=0;

# ...

This cell will also be part of the experiment. The output will be written to a log file in the io folder that will automatically be created.

#nbx
print("my results:", x, y, z)
my results: 0 0 0

Note how we use the variable results_dir below. It will be replaced by "results/task_id/", and a corresponding folder will automatically be created. It is really just a hook so we can manipulate it behind the scenes.

#nbx
with open(f"{results_dir}/your_file.txt", "w") as f:
    f.write("I will be written to: example_nbx_bundle/results/task_id/your_file.txt")
    f.write(f"\n{task_id}")
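The substitution described above can be pictured as follows. This is a minimal sketch assuming the layout results/task_id/ inside the bundle folder; make_results_dir is a hypothetical helper, and the real wrapper script may build the path differently.

```python
from pathlib import Path


def make_results_dir(bundle_dir, task_id):
    """Create and return <bundle_dir>/results/<task_id>/.

    Hypothetical helper: it mirrors the folder layout described in the text,
    not the actual code in wrapper.py.
    """
    path = Path(bundle_dir) / "results" / str(task_id)
    path.mkdir(parents=True, exist_ok=True)  # safe to call repeatedly
    return path
```

For example, make_results_dir("example_bundle_nbx", 3) would create example_bundle_nbx/results/3/ and return its path, which is what the experiment would then see as results_dir for that task.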

Creating and running an NBX bundle

To run the experiment on OpenMind (OM) we have to create a bundle that we can interact with:

from nbx.om import NbxBundle

bundle = NbxBundle(nbname="index.ipynb", # the name of the notebook to use as exp
          name="example_bundle",         # name of the bundle
          linting=False,                 # enable basic linting
          time=[0,20],                   # comp time [hours, minutes]
          ntasks=4,                      # requested comp nodes
          step=50,                       # parallel jobs (compare bundle/run.sh)
          max_arr=10,                    # maximum number of queued jobs on OM is 1000
          mail_user="me@somewhere.com",  # notification email
          simg="pytorch.simg")           # Singularity image on OM in $omsimg
** nbx bundle created **
Path:
    example_bundle_nbx

Source nb:
    index.ipynb

Parameters (#configs 20):
    * x = range(5)
    * y = [0,1,2,4]
      task_id = 0
      results_dir = "."

Instructions:
    Copy to remote, run the bash script, and pull the results
    - `bundle.push()` or `scp -r example_bundle_nbx $om:$omx`
    - `bundle.run()` or `ssh $om sbatch -D $omx/example_bundle_nbx $omx/example_bundle_nbx/run.sh`
    - `bundle.pull_results()` or `scp -r $om:$omx/example_bundle_nbx/results ./results`

!ls example_bundle_nbx/
__init__.py   experiment.py job.sh        wrapper.py
__pycache__   io            run.sh

from example_bundle_nbx.experiment import sweep_params as sweep

print(len(sweep))
print(sweep[0])
print(sweep[1])
print(sweep[4])
20
{'x': 0, 'y': 0}
{'x': 0, 'y': 1}
{'x': 1, 'y': 0}
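The entries printed above are consistent with a plain Cartesian product of the declared domains, with the last-declared parameter varying fastest. The sketch below reproduces that ordering with itertools; it is an illustration of the idea, not the code that nbx generates in experiment.py.

```python
from itertools import product

# Domains copied from the #xarg declarations in the example notebook.
domains = {"x": range(5), "y": [0, 1, 2, 4]}

# One dict per configuration; the rightmost domain ("y") changes fastest.
sweep = [dict(zip(domains, values)) for values in product(*domains.values())]

print(len(sweep))  # 5 * 4 configurations
print(sweep[0])
print(sweep[1])
print(sweep[4])
```

With 5 values for x and 4 values for y this gives 20 configurations, matching the "#configs 20" reported in the bundle summary, and index 4 is the first entry where x advances to 1.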
bundle.push()
bundle.run()
bundle.status()
bundle.pull_results()

The results are now in the local folder:

!ls example_bundle_nbx
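If you prefer to run the shell equivalents listed in the bundle's instructions instead of bundle.push(), bundle.run(), and bundle.pull_results(), the small sketch below assembles them for a given bundle folder. The function manual_commands is hypothetical; it only formats the three command strings shown in the instructions and does not execute anything.

```python
def manual_commands(bundle_dir, om="$om", omx="$omx"):
    """Return the push/run/pull shell commands for a bundle as strings.

    The command templates are copied from the bundle's printed instructions;
    `om` and `omx` default to the shell variables so the strings can be
    pasted into a terminal as-is.
    """
    remote = f"{omx}/{bundle_dir}"
    return {
        "push": f"scp -r {bundle_dir} {om}:{omx}",
        "run": f"ssh {om} sbatch -D {remote} {remote}/run.sh",
        "pull": f"scp -r {om}:{remote}/results ./results",
    }


for step, command in manual_commands("example_bundle_nbx").items():
    print(f"{step}: {command}")
```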

