This is the Score-P Python Kernel that enables you to execute Python code in Jupyter Notebooks with Score-P. The kernel is based on the Score-P Python bindings.
The Score-P Python Jupyter Kernel
Table of Contents
- Installation
- Usage
- Presentation of Performance Data
- Limitations
- Future Work
- Citing
- Contact
- Acknowledgments
Installation
To use the kernel, you need a proper Score-P installation.
From the Score-P Python bindings:
You need at least Score-P 5.0, built with --enable-shared and the gcc compiler plugin. Please make sure that scorep-config is in your PATH variable. For Ubuntu LTS systems there is a non-official PPA of Score-P available: https://launchpad.net/~andreasgocht/+archive/ubuntu/scorep .
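Before installing the kernel, it can help to confirm that scorep-config is actually reachable. The following is a minimal sketch using only the Python standard library; the helper name find_scorep_config is hypothetical and not part of the kernel or the bindings.

```python
import shutil


def find_scorep_config(tool="scorep-config"):
    """Return the absolute path of the tool if it is on PATH, else None."""
    return shutil.which(tool)


if __name__ == "__main__":
    path = find_scorep_config()
    if path is None:
        print("scorep-config not found; add the Score-P bin directory to PATH")
    else:
        print(f"scorep-config found at {path}")
```

If the check reports that the tool is missing, extend PATH with the bin directory of your Score-P installation before running pip.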
To install the kernel and all dependencies, including the Python bindings, use:
pip install scorep-jupyter
python -m scorep_jupyter.install
You can also build the kernel from source via:
pip install .
The kernel will then be installed in your active Python environment.
You can select the kernel in Jupyter as scorep-python.
Usage
Configuring Score-P in Jupyter
%%scorep_env
Sets up your Score-P environment. For documentation of the Score-P environment variables, see: Score-P Measurement Configuration.
%%scorep_python_binding_arguments
Sets the Score-P Python bindings arguments. For documentation of the arguments, see Score-P Python bindings.
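As an illustration, a configuration cell might hold one KEY=VALUE assignment per line below the magic command. That cell layout is an assumption made for this sketch, so check it against the kernel's own examples. SCOREP_ENABLE_TRACING, SCOREP_ENABLE_PROFILING and SCOREP_EXPERIMENT_DIRECTORY are standard Score-P measurement variables; the helper render_scorep_env and the directory name are hypothetical.

```python
def render_scorep_env(settings):
    """Render a dict of Score-P variables as KEY=VALUE lines under %%scorep_env.

    The one-assignment-per-line cell layout is an assumption of this sketch.
    """
    lines = ["%%scorep_env"]
    lines += [f"{name}={value}" for name, value in settings.items()]
    return "\n".join(lines)


example = render_scorep_env({
    "SCOREP_ENABLE_TRACING": "true",
    "SCOREP_ENABLE_PROFILING": "false",
    "SCOREP_EXPERIMENT_DIRECTORY": "scorep-notebook-run",  # hypothetical name
})
print(example)
```

The printed text is what would be pasted into a notebook cell of its own, ahead of the cells that are to be measured.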
Executing Cells
Without Score-P
You can execute cells without Score-P as usual.
With Score-P
%%execute_with_scorep
Executes a cell with Score-P, i.e. it calls python -m scorep <cell code>
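To make the python -m scorep call more concrete, the sketch below builds such a command line for a cell's code. It is not the kernel's implementation: passing the code through a temporary file, the file name cell.py, and both helper names are assumptions. The --noinstrumenter flag is an argument of the Score-P Python bindings and is used here only as an example.

```python
import sys
import tempfile
from pathlib import Path


def build_scorep_command(script_path, binding_args=()):
    """Build the argument list for running a script under the Score-P bindings."""
    return [sys.executable, "-m", "scorep", *binding_args, str(script_path)]


def write_cell_to_script(cell_code, directory):
    """Write cell code to a file so it can be handed to `python -m scorep`."""
    script = Path(directory) / "cell.py"  # hypothetical file name
    script.write_text(cell_code, encoding="utf-8")
    return script


with tempfile.TemporaryDirectory() as tmp:
    script = write_cell_to_script("print('hello from a measured cell')\n", tmp)
    command = build_scorep_command(script, ["--noinstrumenter"])
    print(command)
    # With Score-P installed, the command could be started like this:
    # subprocess.run(command, check=True)
```

The subprocess call is left commented out so that the sketch also runs on machines without Score-P.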
Multi-Cell Mode
You can also treat multiple cells as one single cell by using the multi-cell mode. To do so, mark the cells in the order in which you wish to execute them.
%%enable_multicellmode
Enables the multi-cell mode and starts the marking process. Subsequently, "running" cells will not execute them but mark them for execution after %%finalize_multicellmode.
%%finalize_multicellmode
Stops the marking process and executes all the marked cells. All the marked cells will be executed with Score-P.
%%abort_multicellmode
Stops the marking process, without executing the cells.
Hints:
- The %%execute_with_scorep command has no effect in the multi-cell mode.
- There is no "unmark" command available, but you can abort the multi-cell mode with the %%abort_multicellmode command. Start your marking process again if you have marked your cells in the wrong order.
- The %%enable_multicellmode, %%finalize_multicellmode and %%abort_multicellmode commands should be run in an exclusive cell. Additional code in the cell will be ignored.
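The marking process described above can be pictured as a small buffer that collects cell code and only hands it over on finalize. The class below is a toy model of that behavior, not the kernel's code; the class name and all method names are hypothetical.

```python
class MultiCellBuffer:
    """Toy model of the marking process: collect cells, then run them as one."""

    def __init__(self):
        self.active = False
        self.cells = []

    def enable(self):
        """Start marking with an empty list of cells."""
        self.active = True
        self.cells = []

    def mark(self, code):
        """Record a cell instead of executing it."""
        if not self.active:
            raise RuntimeError("multi-cell mode is not enabled")
        self.cells.append(code)

    def abort(self):
        """Stop marking and forget all marked cells."""
        self.active = False
        self.cells = []

    def finalize(self):
        """Stop marking and return the marked cells joined in marking order."""
        combined = "\n".join(self.cells)
        self.active = False
        self.cells = []
        return combined


buffer = MultiCellBuffer()
buffer.enable()
buffer.mark("x = 21")
buffer.mark("y = x * 2")
combined = buffer.finalize()
print(combined)
```

Because the cells are joined in the order they were marked, marking them in the wrong order changes the resulting program, which is why aborting and starting over is the way to correct a mistake.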
Write Mode
Analogous to the %%writefile command in IPykernel, you can convert a set of cells into a Python script that is to be executed with the Score-P Python bindings (with the settings and environment described in an auxiliary bash script).
%%start_writefile [scriptname]
Enables the write mode and starts the marking process. Subsequently, "running" cells will not execute them but mark them for writing into a Python file after %%end_writefile. scriptname is jupyter_to_script.py by default.
%%end_writefile
Stops the marking process and writes the marked cells to a Python script. Additionally, a bash script will be created that sets the Score-P environment variables and Python bindings arguments and executes the Python script.
Hints:
- Recording a cell containing %%scorep_env or %%scorep_python_binding_arguments will add the environment variables or Score-P Python bindings arguments to the bash script.
- Code of a cell that is not to be executed with Score-P (not inside the multi-cell mode and without %%execute_with_scorep) will be wrapped in a with scorep.instrumenter.disable() block in the Python script to prevent instrumentation.
- Other cells will be recorded without any changes, except for dropping all magic commands.
- %%abort_multicellmode will be ignored in the write mode and will not unmark previous cells from instrumentation.
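The framing of non-instrumented cells can be sketched as a plain text transformation: the cell's code is indented and placed under the with statement. This is an illustration of the idea under that assumption, not the kernel's implementation; the helper name frame_uninstrumented is hypothetical, and the generated script would additionally need to import scorep.

```python
import textwrap


def frame_uninstrumented(cell_code):
    """Indent cell code under `with scorep.instrumenter.disable():`."""
    if not cell_code.strip():
        body = "    pass"  # keep the block syntactically valid for empty cells
    else:
        body = textwrap.indent(cell_code.rstrip("\n"), "    ")
    return "with scorep.instrumenter.disable():\n" + body + "\n"


framed = frame_uninstrumented("data = load()\nprint(len(data))\n")
print(framed)
```

Here load() is only placeholder cell content; the function treats the cell as text and never executes it.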
Presentation of Performance Data
To inspect the collected performance data, use tools such as Vampir (traces) or Cube (profiles).
Limitations
Serialization Type Support
For the execution of a cell, the kernel uses the default IPython kernel. For a cell with Score-P, it starts a new Python process. Before starting this process, the state of the previously executed cells is persisted using the dill library (https://github.com/uqfoundation/dill). However:
dill cannot yet pickle these standard types: frame, generator, traceback
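In practice this matters most for generators left in the notebook namespace. The sketch below uses the standard-library pickle module as a stand-in for dill, since the generator restriction applies to both; the helper name can_persist is hypothetical. Materializing the generator into a list before the Score-P cell is one possible workaround.

```python
import pickle


def can_persist(obj):
    """Return True if obj can be serialized with pickle, else False."""
    try:
        pickle.dumps(obj)
    except Exception:
        return False
    return True


squares = (n * n for n in range(5))  # a generator cannot be serialized
print(can_persist(squares))

squares_list = list(squares)         # materialize it before the Score-P cell
print(can_persist(squares_list))
```

The list holds the same values as the generator would have produced, at the cost of keeping all of them in memory at once.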
Overhead
When dealing with large data structures, there may be considerable runtime overhead at the beginning and the end of a Score-P cell. This is due to the additional saving and loading of data for persistence in the background. However, this does not affect the actual user code or the Score-P measurements.
Future Work
The kernel is still under development. The following is on the agenda:
- Check alternative Python implementations (Stackless/PyPy) for better serialization support
- Performance data visualizations
- Overhead reduction
PRs are welcome.
Citing
If you publish work that uses the kernel, we would appreciate it if you cited the following paper:
Werner, E., Manjunath, L., Frenzel, J., & Torge, S. (2021, October).
Bridging between Data Science and Performance Analysis: Tracing of Jupyter Notebooks.
In The First International Conference on AI-ML-Systems (pp. 1-7).
https://dl.acm.org/doi/abs/10.1145/3486001.3486249
Additionally, please refer to the Score-P Python bindings, published here:
Gocht A., Schöne R., Frenzel J. (2021)
Advanced Python Performance Monitoring with Score-P.
In: Mix H., Niethammer C., Zhou H., Nagel W.E., Resch M.M. (eds) Tools for High Performance Computing 2018 / 2019. Springer, Cham.
https://doi.org/10.1007/978-3-030-66057-4_14
or
Gocht-Zech A., Grund A. and Schöne R. (2021)
Controlling the Runtime Overhead of Python Monitoring with Selective Instrumentation
In: 2021 IEEE/ACM International Workshop on Programming and Performance Visualization Tools (ProTools)
https://doi.org/10.1109/ProTools54808.2021.00008
Contact
Acknowledgments
This work was supported by the German Federal Ministry of Education and Research (BMBF, SCADS22B) and the Saxon State Ministry for Science, Culture and Tourism (SMWK) by funding the competence center for Big Data and AI "ScaDS.AI Dresden/Leipzig".