NBX: Notebook Experiments for OpenMind

This module enables you to quickly convert your Jupyter notebook into a bundle of files that can be run on BCS' OpenMind.

Getting started

Prerequisites

Install the package:

  • pip install nbx

Get a Singularity image

You'll need an image that has the package installed (there are ways around that, but I am keeping it simple at the moment). Here's how you can build an image:

module load openmind/singularity/3.2.0
export SINGULARITY_CACHEDIR="/om2/user/{your_user_name}/.singularity"
singularity build pytorch.simg docker://mklukas/pytorch

Environment variables

For the modules to work you have to set the environment variables om, omx, omsimg, and omid:

  • om: your login to OpenMind.
    • You need to enable logging into OpenMind using public key authentication. That means the command ssh $om should log you in without asking for a password. (Searching for "ssh public key authentication" will turn up a recipe.)
  • omx: path to the folder where nbx bundles are stored. This path will automatically be added to your Python path. Any modules that are neither part of your bundle's src/ folder nor included in your Singularity container should go here.
  • omsimg: path to the folder containing your Singularity images
  • omid: your OpenMind username

Mac users can adapt and copy the following lines to their .bash_profile file:

export om={your_user_name}@openmind7.mit.edu
export omid={your_user_name}
export omx=/om2/user/{your_user_name}/nbx-experiments
export omsimg=/om2/user/{your_user_name}/simg
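
Before creating a bundle, it can help to confirm that all four variables are visible to Python. The following is a minimal sketch and not part of nbx; the helper name `missing_nbx_env` is hypothetical.

```python
import os

# The four variables named in the list above.
REQUIRED_VARS = ("om", "omx", "omsimg", "omid")


def missing_nbx_env(env=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; nbx itself may check differently.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_nbx_env()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All nbx environment variables are set.")
```

If the list is non-empty in a notebook but empty in a terminal, the notebook server was probably started before the variables were exported.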

Usage

  • Put the Python scripts that you wrote for this experiment in a folder called src. The folder will be copied to the bundle so the scripts are available on OpenMind as well.
  • #nbx: Each cell that contains a #nbx tag in its first line will be considered part of the experiment.
  • #xarg: Putting #xarg above a variable declaration makes this variable explicit: it becomes an argument of the experiment function. Any iterable to the right of the variable declaration, separated from it by a semicolon, is treated as the domain to sweep over during the parameter sweep.
  • Each nbx experiment has to declare the variables task_id and results_dir. The wrapper script sets task_id, which enumerates the configurations of the parameter space. It also sets results_dir, replacing it with the folder that is automatically created for a specific parameter configuration.
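
To make the #xarg convention concrete, here is a rough sketch of how a line such as `x=0; range(5)` could be split into a name, a default value, and a sweep domain. This is an illustration only, not nbx's actual parser; `parse_xarg_line` is a hypothetical name, and the sketch calls `eval` on the domain for brevity, which is only acceptable for notebook code you trust.

```python
def parse_xarg_line(line):
    """Split 'name = default; domain' into (name, default_src, domain).

    Hypothetical helper; nbx's real parser may behave differently.
    """
    declaration, _, domain_src = line.partition(";")
    name, _, default_src = declaration.partition("=")
    domain_src = domain_src.strip()
    # No iterable after the semicolon means the argument is not swept.
    domain = list(eval(domain_src)) if domain_src else None
    return name.strip(), default_src.strip(), domain


print(parse_xarg_line("x=0; range(5)"))
print(parse_xarg_line("y=0; [0,1,2,4]"))
print(parse_xarg_line('results_dir = "."'))
```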

Example

Experiment

In every experiment we need to indicate which cells are part of it (using the #nbx flag), and need to specify these two arguments:

  • task_id
  • results_dir

%matplotlib inline
%load_ext autoreload
%autoreload 2

#nbx
#xarg
task_id = 0
#xarg
results_dir = "."

This cell will be part of the experiment

#nbx

#xarg
x=0; range(5)

#xarg
y=0; [0,1,2,4]

z=0;

# ...
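
The two swept variables in the cell above define a grid of 5 × 4 = 20 configurations, and task_id selects one of them. A minimal sketch of such an enumeration follows, assuming a plain Cartesian product in which the last variable varies fastest; the ordering nbx actually uses is an assumption here.

```python
from itertools import product

# Domains copied from the #xarg lines above.
domains = {"x": range(5), "y": [0, 1, 2, 4]}

# One dict per configuration; itertools.product varies the last key fastest.
sweep = [dict(zip(domains, values)) for values in product(*domains.values())]

task_id = 4  # hypothetical value, as if set by the wrapper script
print(len(sweep))      # 20
print(sweep[task_id])  # {'x': 1, 'y': 0}
```

The variable z is not preceded by #xarg, so it stays fixed and does not enlarge the grid.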

This cell will also be part of the experiment. The output will be written to a log file in the io folder that will automatically be created.

#nbx
print("my results:", x, y, z)
my results: 0 0 0

Note how the next cell uses the variable results_dir. It will be replaced by "results/task_id/", and the corresponding folder is created automatically. It is really just a hook so we can manipulate it behind the scenes.

#nbx
with open(f"{results_dir}/your_file.txt", "w") as f:
    f.write("I will be written to: example_nbx_bundle/results/task_id/your_file.txt")
    f.write(f"\n{task_id}")
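
Behind the scenes, the replacement described above amounts to building a per-task path and creating the folder before the experiment code runs. The following is a minimal sketch, assuming the layout `<bundle>/results/<task_id>/` from the example; `make_results_dir` is a hypothetical name, not an nbx function.

```python
import tempfile
from pathlib import Path


def make_results_dir(bundle_dir, task_id):
    """Create and return <bundle_dir>/results/<task_id>/ (hypothetical helper)."""
    results_dir = Path(bundle_dir) / "results" / str(task_id)
    results_dir.mkdir(parents=True, exist_ok=True)
    return results_dir


# Demonstrate in a throwaway directory instead of a real bundle.
with tempfile.TemporaryDirectory() as tmp:
    results_dir = make_results_dir(Path(tmp) / "example_bundle_nbx", task_id=3)
    (results_dir / "your_file.txt").write_text("task 3\n")
    print(results_dir.relative_to(tmp))
```

Because every task writes into its own folder, parallel jobs in a sweep cannot overwrite each other's files as long as they only write below results_dir.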

Creating and running an NBX bundle

To run the experiment on OpenMind (OM), we have to create a bundle that we can interact with:

from nbx.om import NbxBundle

bundle = NbxBundle(nbname="index.ipynb", # the name of the notebook to use as exp
          name="example_bundle",         # name of the bundle
          linting=False,                 # enable basic linting
          time=[0,20],                   # comp time [hours, minutes]
          ntasks=4,                      # requested comp nodes
          step=50,                       # parallel jobs (compare bundle/run.sh)
          max_arr=10,                    # maximum number of queued jobs on OM is 1000
          mail_user="me@somewhere.com",  # notification email
          simg="pytorch.simg")           # Singularity image on OM in $omsimg
** nbx bundle created **
Path:
    example_bundle_nbx

Source nb:
    index.ipynb

Parameters (#configs 20):
    * x = range(5)
    * y = [0,1,2,4]
      task_id = 0
      results_dir = "."

Instructions:
    Copy to remote, run the bash script, and pull the results
    - `bundle.push()` or `scp -r example_bundle_nbx $om:$omx`
    - `bundle.run()` or `ssh $om sbatch -D $omx/example_bundle_nbx $omx/example_bundle_nbx/run.sh`
    - `bundle.pull_results()` or `scp -r $om:$omx/example_bundle_nbx/results ./results`

!ls example_bundle_nbx/
__init__.py   experiment.py job.sh        wrapper.py
__pycache__   io            run.sh

from example_bundle_nbx.experiment import sweep_params as sweep

print(len(sweep))
print(sweep[0])
print(sweep[1])
print(sweep[4])
20
{'x': 0, 'y': 0}
{'x': 0, 'y': 1}
{'x': 1, 'y': 0}

bundle.push()
bundle.run()
bundle.status()
bundle.pull_results()

The results are now in the local folder:

!ls example_bundle_nbx
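
For reference, the manual alternatives to `bundle.push()`, `bundle.run()`, and `bundle.pull_results()` listed in the bundle instructions can be assembled from the environment variables. The sketch below only builds the command lists shown in that output; it is not how nbx implements these methods, and `manual_commands` is a hypothetical name.

```python
import os


def manual_commands(bundle_path, env=None):
    """Build the scp/ssh commands from the bundle instructions (illustrative)."""
    env = os.environ if env is None else env
    om, omx = env["om"], env["omx"]
    remote = f"{omx}/{bundle_path}"
    return {
        "push": ["scp", "-r", bundle_path, f"{om}:{omx}"],
        "run": ["ssh", om, "sbatch", "-D", remote, f"{remote}/run.sh"],
        "pull_results": ["scp", "-r", f"{om}:{remote}/results", "./results"],
    }


# Placeholder values; substitute your own login and paths.
example_env = {"om": "user@openmind7.mit.edu", "omx": "/om2/user/user/nbx-experiments"}
for step, cmd in manual_commands("example_bundle_nbx", example_env).items():
    print(step, "->", " ".join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once passwordless SSH login is working.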
