A collection of tools for using inside Jupyter Notebooks
Project description
NBTools
Collection of tools for monitoring running Jupyter Notebooks and interacting with them.
The installation should be as easy as:
pip install py-nbtools
NBstat
The main tool of this package is nbstat / nbwatch command line utility. It is added at installation and shows the detailed resource utilization for each process of each running Jupyter Notebook. A gif is worth a thousand words:
For more information, check out the full user documentation: explanation of different table views, command line options and ready-to-use snippets.
Troubleshooting: PID namespaces, user permissions and zombie processes
A known problem of NVIDIA drivers is that nvidia-smi reports PIDs of processes on devices in the global namespace, not in the container namespace, which does not allow to match PIDs of container processes to their device PIDs. There are a few workarounds:
- [recommended] pass
--pid=host
flag todocker run
. - patch NVIDIA driver to handle PID namespaces correctly.
- [Linux only] fallback on manually inspecting /proc/PID/ files to identify the host PID for each process inside of the container.
While nbstat
provides several fallbacks for Linux
containers (and intend to provide support for more environments over time), the bullet-proof way is to use --pid=host
option for docker run
. Adding it resolves most of the issues immediately.
One more thing that sometimes happens to NVIDIA devices is zombie processes: by incorrectly terminating a GPU-using process you can end up in a situation where device memory is held by not-existing process. As far as I know, there are no ways of killing them without rebooting, and nbstat
just marks such processes with red color.
In order to inspect certain properties of processes, we rely on having all necessary permissions already provided at command run. nbstat
has some fallbacks for some attributes, and I currently work on improving error handling in cases of denied access to files.
Contribute
If you are interested to contribute, check out the developer/contributor page. It contains detailed description about inner workings of the library, my design choices and motivation behind them, as well as discussion of complexities along the way.
Library
Other than nbstat / nbwatch
monitoring utilities, this library provides a few useful tools for working with notebooks and GPUs.
pylint_notebook
Shamelessly taken from pylint page:
Function that checks for errors in Jupyter Notebooks with Python code, tries to enforce a coding standard and looks for code smells. It can also look for certain type errors, it can recommend suggestions about how particular blocks can be refactored and can offer you details about the code's complexity.
Using it as easy as:
from nbtools import pylint_notebook
pylint_notebook(path_to_ipynb, # If not provided, use path to the current notebook
disable='invalid-name', # Disable specified Pylint checks. Can be a list.
enable='import-error') # Enable specified Pylint checks. Can be a list.
Under the hood, it converts .ipynb
notebook to .py
script, creates a custom .pylintrc
configuration, runs the pylint
and removes all temporary files. Learn more about its usage in the tutorial.
exec_notebook
Provides a eval
-like interface for running Jupyter Notebooks programmatically. We use it for running interactive tests, that are easy to work with: in case of any failures, one can jump right into fixing it with an already set-up environment.
from nbtools import exec_notebook
exec_notebook(path_to_ipynb, # Which notebook to run
out_path_ipynb, # Where to save result
inputs={'learning_rate': 0.05,}, # Pass variables to notebook
outputs=['accuracy']) # Extract variables from notebook
set_gpus, free_gpus
Select free device(s) and set CUDA_VISIBLE_DEVICES
environment variable so that the current process sees only them.
Eliminates an enormous amount of bugs and unexpected behaviors related to GPU usage.
from nbtools import set_gpus, free_gpus
used_gpus = set_gpus(n=2, # Number of devices to set.
min_free_memory=0.7,# Minimum amount of free memory on device to consider free.
max_processes=3) # Maximum amount of processes on device to consider free.
free_gpus(used_gpus) # Kill all processes on selected GPUs. Useful at teardown.
Other functions
from nbtools import (in_notebook, # Return True if executed inside of Jupyter Notebook
get_notebook_path, # If executed in Jupyter Notebook, return its absolute path
get_notebook_name, # If executed in Jupyter Notebook, return its name
notebook_to_script) # Convert Jupyter Notebook to an executable Python script.
# Works well with magics and command line executions.
Goals
This library started as a container of tools, that I came across / developed in my years as an ML researcher. As some of the functions survived multiple refactoring iterations, I decided to share the library so it is easier to perfect them and test in different environments.
Another goal of the project is to show how to communicate with Jupyter API on real world examples: instead of reading through a number of stackoverflow threads, you can find the same information collected in one place and get a rough understanding of what is possible with it and what is not.
Acknowledgements
The nbstat module builds on gpustat package. Using the gpustat for years gave me an idea about possible improvements, which are implemented in this library. While the implementation is different, reading through the code of gpustat was essential for development.
Animated GIFs are created by using Terminalizer: aside from the usual problems with installation, the tool itself is amazing.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file py_nbtools-0.9.14.tar.gz
.
File metadata
- Download URL: py_nbtools-0.9.14.tar.gz
- Upload date:
- Size: 48.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.10.12
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 513c71484ef96a6dbdcd2694e935cd15eec95cca34c6b8746bf6655673b8b8e4 |
|
MD5 | 8a7c51fe9692fe9153eeb94d4514b1fe |
|
BLAKE2b-256 | e2a8276f0fc23eb42b4c371eae23639f2de8033ba11e80885bc4fd1d983a2ac3 |
File details
Details for the file py_nbtools-0.9.14-py3-none-any.whl
.
File metadata
- Download URL: py_nbtools-0.9.14-py3-none-any.whl
- Upload date:
- Size: 51.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.10.12
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 86e91316d72a32ed02281066846bacf2fbf457250563bd3eec77ec8a53f58384 |
|
MD5 | bdb0a587c1d0d775bec2a07b3e38b68f |
|
BLAKE2b-256 | 315ff3a001651061c341924d9d3287d9c12494e67542b8b973d2027690453b52 |