A proactive kernel for Jupyter

proactive-jupyter-kernel

The ActiveEon Jupyter Kernel adds a kernel backend to Jupyter that interfaces directly with the ProActive scheduler, letting users construct tasks and workflows and execute them on the fly.

1. Requirements:

Python 2 or 3

2. Installation:

2.1 Using PyPi

  • open a terminal

  • install the proactive jupyter kernel

$ pip install proactive proactive-jupyter-kernel --upgrade
$ python -m proactive-jupyter-kernel.install

2.2 Using source code

  • open a terminal

  • clone the repository on your local machine:

$ git clone git@github.com:ow2-proactive/proactive-jupyter-kernel.git

  • install the proactive jupyter kernel:

$ pip install proactive-jupyter-kernel/
$ python -m proactive-jupyter-kernel.install

3. Platform

You can use any Jupyter platform, but we recommend JupyterLab. To launch it from your terminal after installation:

$ jupyter lab

or in daemon mode:

$ nohup jupyter lab &>/dev/null &

Once JupyterLab is open, click the ProActive icon to open a notebook backed by the proactive kernel.

4. Connect:

4.1 Using connect()

If you are trying ProActive for the first time, please sign up on the try platform. Once you receive your login and password, connect using the #%connect() pragma:

#%connect(login=YOUR_LOGIN, password=YOUR_PASSWORD)

To connect to another host, use the pragma as follows:

#%connect(host=YOUR_HOST, port=YOUR_PORT, login=YOUR_LOGIN, password=YOUR_PASSWORD)

4.2 Using config file:

For automatic sign-in, create a file named proactive_config.ini in the directory of your notebook.

Fill your configuration file according to the format:

[proactive_server]
host=YOUR_HOST
port=YOUR_PORT
[user]
login=YOUR_LOGIN
password=YOUR_PASSWORD

Save your file changes and restart the proactive kernel.

You can also force the current kernel to connect using any .ini config file through the #%connect() pragma:

#%connect(path=PATH_TO/YOUR_CONFIG_FILE.ini)

(for more information about this format, see the Python configparser documentation)
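Such a file can be read with Python's standard configparser module; a minimal sketch of how a config in this format is parsed (illustrative values only, not the kernel's actual code):

```python
import configparser

# Sample config in the documented format (hypothetical values)
sample = """
[proactive_server]
host=try.activeeon.com
port=8080

[user]
login=my_login
password=my_password
"""

config = configparser.ConfigParser()
config.read_string(sample)

# The kernel would use these values to connect automatically
host = config["proactive_server"]["host"]
port = config["proactive_server"]["port"]
login = config["user"]["login"]
print(host, port, login)
```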

5. Usage

5.1 Creating a Python task

To create a task, please use the #%task() pragma followed by the task implementation, written in the same notebook code cell. At minimum, a task name has to be provided. Example:

#%task(name=myTask)
print('Hello world')

General usage:

#%task(name=TASK_NAME, [language=SCRIPT_LANGUAGE], [dep=[TASK_NAME1,TASK_NAME2,...]], [generic_info=[(KEY1,VAL1), (KEY2,VAL2),...]], [export=[VAR_NAME1,VAR_NAME2,...]], [import=[VAR_NAME1,VAR_NAME2,...]], [path=IMPLEMENTATION_FILE_PATH])

As shown in the general usage, users can provide additional information about the task through the pragma's optional parameters:

5.1.1 Language

The language parameter is required when the task script is not written in Python, the default language. The supported programming languages are:

  • Linux_Bash
  • Windows_Cmd
  • DockerCompose
  • Scalaw
  • Groovy
  • Javascript
  • Jython
  • Python
  • Ruby
  • Perl
  • PowerShell
  • R

Example of usage when the task is written in Linux_Bash:

#%task(name=myTask, language=Linux_Bash)
echo 'Hello, World!'

5.1.2 Dependencies

One of the most important notions in workflows is the dependencies between tasks. To specify this information, please use the dep parameter. The value should be a list of all the tasks on which the new task depends. Example:

#%task(name=myTask,dep=[parentTask1,parentTask2])
print('Hello world')

5.1.3 Generic information

To set the advanced ProActive "generic_information" variables, the generic_info parameter is provided. The value should be a list of (key,value) tuples containing the names and values of the ProActive parameters. Example:

#%task(name=myTask, generic_info=[(var1,value1),(var2,value2)])
print('Hello world')

5.1.4 Export/import variables

The export and import parameters enable variable propagation between the tasks of a workflow. If the myTask1 variables var1 and var2 are needed in myTask2, the myTask1 pragma should include an export with a list of these variable names, and the myTask2 pragma an import with a list of the same names. Example:

The myTask1 implementation block would be:

#%task(name=myTask1, export=[var1,var2])
var1 = "Hello"
var2 = "ActiveEon!"

and the myTask2 implementation block would be:

#%task(name=myTask2, dep=[myTask1], import=[var1,var2])
print(var1 + " from " + var2)
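Since each task runs as an independent process (see section 5.2), exported variables cannot simply be shared in memory; conceptually, they are serialized when one task ends and injected into the scope of the dependent task. A rough sketch of the idea in plain Python (this illustrates the concept only, not the kernel's actual mechanism):

```python
import json

def my_task1():
    # body of myTask1; the variables named in export=[var1,var2]
    # are collected and serialized when the task ends
    var1 = "Hello"
    var2 = "ActiveEon!"
    return json.dumps({"var1": var1, "var2": var2})

def my_task2(serialized):
    # body of myTask2; the variables named in import=[var1,var2]
    # are deserialized and made available to the task
    scope = json.loads(serialized)
    return scope["var1"] + " from " + scope["var2"]

result = my_task2(my_task1())
print(result)  # prints "Hello from ActiveEon!"
```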

5.1.5 Implementation file

It is possible to use an external file as the task implementation. To do so, use the path option. Example:

#%task(name=myTask,path=PATH_TO/IMPLEMENTATION_FILE.py)

5.2 Importing libraries

The main difference between the ProActive kernel and a native language one resides in memory access during the execution of the different blocks. While in a common native language kernel the whole script (all the notebook blocks) is executed locally in the same shared memory zone, in the ProActive kernel each created task is executed as an independent process. To ease the transition from native language kernels to the ProActive one, we included the #%import() pragma. It allows the user to add, just once in the notebook, libraries that are common to all created tasks implemented in the same native script language.

The pragma is used in this general manner: #%import([language=SCRIPT_LANGUAGE]). Example:

#%import(language=Python)
import os
import pandas

Note that if the language is not specified, Python is assumed by default.

5.3 Adding a fork environment

To configure a fork environment for a task, please use the #%fork_env() pragma. A first way to do this is by providing the name of the corresponding task, followed by the fork environment implementation:

#%fork_env(name=TASK_NAME)
containerName = 'activeeon/dlm3'
dockerRunCommand =  'docker run '
dockerParameters = '--rm '
paHomeHost = variables.get("PA_SCHEDULER_HOME")
paHomeContainer = variables.get("PA_SCHEDULER_HOME")
proActiveHomeVolume = '-v '+paHomeHost +':'+paHomeContainer+' '
workspaceHost = localspace
workspaceContainer = localspace
workspaceVolume = '-v '+localspace +':'+localspace+' '
containerWorkingDirectory = '-w '+workspaceContainer+' '
preJavaHomeCmd = dockerRunCommand + dockerParameters + proActiveHomeVolume + workspaceVolume + containerWorkingDirectory + containerName

A second way is by providing the name of the task, and the path of a _.py_ file containing the fork environment code:

#%fork_env(name=TASK_NAME, path=PATH_TO/FORK_ENV_FILE.py)

5.4 Adding a selection script

To add a selection script to a task, please use the #%selection_script() pragma. A first way to do this is by providing the name of the corresponding task, followed by the selection code implementation:

#%selection_script(name=TASK_NAME)
selected = True

A second way is by providing the name of the task, and the path of a .py file containing the selection code:

#%selection_script(name=TASK_NAME, path=PATH_TO/SELECTION_CODE_FILE.py)

5.5 Adding job fork environment and/or selection script

If the selection script and/or the fork environment is the same for all the tasks of a job, it can be added just once using the job_selection_script and/or job_fork_env pragma.

Usage:

For a job selection script please use:

#%job_selection_script([language=SCRIPT_LANGUAGE], [path=./SELECTION_CODE_FILE.py], [force=on/off])

For a job fork environment please use:

#%job_fork_env([language=SCRIPT_LANGUAGE], [path=./FORK_ENV_FILE.py], [force=on/off])

The force parameter determines whether the pragma should overwrite any selection scripts or fork environments already set on individual tasks.

5.6 Adding pre and/or post scripts

Sometimes a script has to be executed before and/or after a particular task. For this, the kernel provides the pre_script and post_script pragmas.

To add a pre-script to a task, please use:

#%pre_script(name=TASK_NAME, language=SCRIPT_LANGUAGE, [path=./PRE_SCRIPT_FILE.py])

To add a post-script to a task, please use:

#%post_script(name=TASK_NAME, language=SCRIPT_LANGUAGE, [path=./POST_SCRIPT_FILE.py])

5.7 Create a job

To create a job, please use the #%job() pragma:

#%job(name=JOB_NAME)

If the job has already been created, calling this pragma simply renames it with the newly provided name.

Note that it is not necessary to create and name the job explicitly. If the user does not, this step is performed implicitly when the job is submitted (check section 5.10 for more information).

5.8 Plot job

To verify the created workflow, please use the #%draw_job() pragma to plot it into a separate window:

#%draw_job()

Two optional parameters can be used to configure the way the kernel plots the workflow.

inline plotting:

If this parameter is set to 'off', the workflow is plotted in an external Matplotlib window. The default value is 'on'.

#%draw_job(inline=off)

saving into hard disk:

To make sure the workflow is saved as a _.png_ file, this option needs to be set to 'on'. The default value is 'off'.

#%draw_job(save=on)

Note that the plotted job is named using the first available of the following: the name provided via the 'name' parameter, the name of the job if already created, the name of the notebook if reachable, or "Unnamed_job" as a last resort.

#%draw_job([name=JOB_NAME], [inline=off], [save=on])

5.9 Save workflow in dot format

To save the created workflow into a GraphViz _.dot_ format, please use the #%write_dot() pragma:

#%write_dot(name=FILE_NAME)
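The GraphViz .dot format is plain text. Assuming a workflow where myTask2 depends on myTask1, the generated file might look roughly like the string below (a hypothetical illustration of the format, not the kernel's exact output):

```python
# Hypothetical .dot content for a two-task workflow (myTask1 -> myTask2)
dot_content = """digraph JOB_NAME {
    myTask1;
    myTask2;
    myTask1 -> myTask2;
}
"""

# Writing it to disk yields a file viewable with any GraphViz tool
with open("workflow.dot", "w") as f:
    f.write(dot_content)
```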

5.10 Submit your job to the scheduler

To submit the job to the proactive scheduler, the user has to use the #%submit_job() pragma:

#%submit_job()

If the job has not been created, or is not up to date, #%submit_job() starts by creating a new job with the same name as the old one. To provide a new name, use the same pragma and pass the name as a parameter:

#%submit_job(name=JOB_NAME)

If the kernel has never received a job name during its execution, it uses the current notebook name if possible, or generates a random one.

5.11 List all submitted jobs

To get the IDs and names of all submitted jobs, please use the #%list_submitted_jobs() pragma:

#%list_submitted_jobs()

5.12 Printing results

To get the job result(s), use the #%get_result() pragma, providing the job name:

#%get_result(name=JOB_NAME)

or by the job id:

#%get_result(id=JOB_ID)

The returned values of your final tasks will be automatically printed.

5.13 Showing ActiveEon portals

Finally, to get access to more parameters and features, the user should use the ActiveEon Studio portals. The main ones are the _Resource Manager_, the _Scheduling Portal_ and _Workflow Automation_.

To show the resource manager portal related to the host you are connected to, just run:

#%show_resource_manager([height=HEIGHT_VALUE, width=WIDTH_VALUE])

for the related scheduling portal:

#%show_scheduling_portal([height=HEIGHT_VALUE, width=WIDTH_VALUE])

and for the related workflow automation:

#%show_workflow_automation([height=HEIGHT_VALUE, width=WIDTH_VALUE])

NOTE: The parameters height and width allow the user to adjust the size of the window inside the notebook.

Current status

Features:

  • help: prints all different pragmas/features of the kernel

  • connect: connects to an ActiveEon server (OPTION: connection using a configuration file)

  • import: imports specified libraries to all tasks of the same script language

  • task: creates a task

  • pre_script: sets the pre-script of a task

  • post_script: sets the post-script of a task

  • selection_script: sets the selection script of a task

  • job_selection_script: sets the default selection script of a job

  • fork_env: sets the fork environment script

  • job_fork_env: sets the default fork environment of a job

  • job: creates/renames the job

  • draw_job: plots the workflow

  • write_dot: writes the workflow in .dot format

  • submit_job: submits the job to the scheduler

  • get_result: gets and prints the job results

  • list_submitted_jobs: gets and prints the ids and names of the submitted jobs

  • show_resource_manager: opens the ActiveEon resource manager portal

  • show_scheduling_portal: opens the ActiveEon scheduling portal

TODO

Feature improvements
  • execute a pragma-free block locally
  • add options import_as_json/export_as_json
  • add draw(on/off), print_result(on/off) options in the submit job pragma
  • handle multiple pragmas in a single block
  • apply selection_script and fork_env to a list of task names
  • add auto-complete
Documentation
  • add some examples pictures
