Map Reduce for Notebooks

Papermill is a tool for parameterizing, executing, and analyzing Jupyter Notebooks.

The goals for Papermill are:

  • Parameterizing notebooks
  • Executing and collecting metrics across the notebooks
  • Summarizing collections of notebooks


Installation

pip install papermill


Parameterizing a Notebook

To parameterize your notebook, designate a cell with the tag parameters. Papermill looks for the parameters cell and replaces the values it defines with the parameters passed in at execution time.
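As a rough illustration of this mechanism, the sketch below models a notebook as a plain dictionary, locates the cell tagged parameters, and rewrites its source with the supplied values. The helper names find_parameters_cell and inject_parameters are hypothetical, and this is not Papermill's actual implementation, only a standard-library model of the behavior described above.

```python
def find_parameters_cell(notebook):
    """Return the index of the first cell tagged 'parameters', or None."""
    for index, cell in enumerate(notebook["cells"]):
        tags = cell.get("metadata", {}).get("tags", [])
        if "parameters" in tags:
            return index
    return None


def inject_parameters(notebook, parameters):
    """Hypothetical helper: overwrite the tagged cell with new assignments."""
    index = find_parameters_cell(notebook)
    if index is None:
        raise ValueError("no cell tagged 'parameters' found")
    # One "name = value" assignment per parameter, in sorted order.
    lines = ["{} = {!r}".format(name, value)
             for name, value in sorted(parameters.items())]
    notebook["cells"][index]["source"] = "\n".join(lines)
    return notebook


# A minimal two-cell notebook; the first cell carries the 'parameters' tag.
notebook = {
    "cells": [
        {"cell_type": "code", "metadata": {"tags": ["parameters"]},
         "source": "alpha = 0.1\nratio = 0.5"},
        {"cell_type": "code", "metadata": {},
         "source": "print(alpha * ratio)"},
    ],
    "metadata": {}, "nbformat": 4, "nbformat_minor": 2,
}

updated = inject_parameters(notebook, {"alpha": 0.6, "ratio": 0.1})
print(updated["cells"][0]["source"])
```

Only the tagged cell changes; every other cell keeps its source, so the rest of the notebook picks up the new values when it runs.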


Executing a Notebook

The two ways to execute the notebook with parameters are through the Python API and through the command line interface.

Executing a Notebook via Python API

import papermill as pm

pm.execute_notebook(
   'path/to/input.ipynb', 'path/to/output.ipynb',  # placeholder paths
   parameters=dict(alpha=0.6, ratio=0.1)
)

Executing a Notebook via CLI

$ papermill local/input.ipynb s3://bkt/output.ipynb -p alpha 0.6 -p l1_ratio 0.1
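When the command line is driven from a script, the same invocation can be assembled from a dictionary of parameters. The sketch below is one possible way to do that; build_papermill_command is a hypothetical helper, not part of Papermill, and it uses only the -p flag shown in the command above.

```python
import shlex


def build_papermill_command(input_path, output_path, parameters):
    """Hypothetical helper: assemble the argument list for the papermill CLI."""
    command = ["papermill", input_path, output_path]
    for name, value in parameters.items():
        # Each parameter becomes one "-p NAME VALUE" triple.
        command.extend(["-p", name, str(value)])
    return command


command = build_papermill_command(
    "local/input.ipynb",
    "s3://bkt/output.ipynb",
    {"alpha": 0.6, "l1_ratio": 0.1},
)
# Print a shell-safe rendering of the argument list.
print(" ".join(shlex.quote(part) for part in command))
```

The resulting list can be handed to subprocess.run(command) as is, which avoids shell quoting problems with parameter values.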

Recording Values to the Notebook

Users can save values to the notebook document to be consumed by other notebooks.

Record values to be saved with the notebook:

### notebook.ipynb
import papermill as pm

pm.record("hello", "world")
pm.record("number", 123)
pm.record("some_list", [1,3,5])
pm.record("some_dict", {"a":1, "b":2})

Users can recover those values as a pandas DataFrame via the read_notebook function.

### summary.ipynb
import papermill as pm

nb = pm.read_notebook('notebook.ipynb')
# Recorded values as a pandas DataFrame
nb.dataframe

Displaying Plots and Images Saved by Other Notebooks

Display a matplotlib histogram with the key name “matplotlib_hist”.

### notebook.ipynb
# Import plt and turn off interactive plotting to avoid double plotting.
import papermill as pm
import matplotlib.pyplot as plt; plt.ioff()
from ggplot import mpg

f = plt.figure()
plt.hist('cty', bins=12, data=mpg)
pm.display('matplotlib_hist', f)

Read in the notebook above and display the plot saved at “matplotlib_hist”.

### summary.ipynb
import papermill as pm

nb = pm.read_notebook('notebook.ipynb')
nb.display_output('matplotlib_hist')

Analyzing a Collection of Notebooks

Papermill can read in a directory of notebooks and provides the NotebookCollection interface for operating on them.

### summary.ipynb
import papermill as pm

nbs = pm.read_notebooks('/path/to/results/')

# Show named plot from 'train_1.ipynb'
# Accepts a key or list of keys to plot in order.
nbs.display_output('train_1.ipynb', 'matplotlib_hist')

# Dataframe for all notebooks in collection
nbs.dataframe.head(10)
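
To make the idea of reading a directory of notebooks concrete, here is a minimal standard-library sketch that maps each .ipynb filename in a directory to its parsed JSON. The load_notebooks helper and the temporary directory are assumptions for illustration; this is not how NotebookCollection is implemented.

```python
import json
import tempfile
from pathlib import Path


def load_notebooks(directory):
    """Hypothetical helper: map each .ipynb filename to its parsed JSON."""
    notebooks = {}
    for path in sorted(Path(directory).glob("*.ipynb")):
        with path.open(encoding="utf-8") as handle:
            notebooks[path.name] = json.load(handle)
    return notebooks


# Demonstrate on a throwaway directory holding two empty notebooks.
with tempfile.TemporaryDirectory() as results_dir:
    for name in ("train_1.ipynb", "train_2.ipynb"):
        empty = {"cells": [], "metadata": {}, "nbformat": 4, "nbformat_minor": 2}
        (Path(results_dir) / name).write_text(json.dumps(empty), encoding="utf-8")
    collection = load_notebooks(results_dir)

print(sorted(collection))
```

Keying the result by filename mirrors the way the collection above is addressed, for example by 'train_1.ipynb'.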

Download files

Filename                          Size     File type  Python version  Upload date
papermill-0.6.1-py2-none-any.whl  11.0 kB  Wheel      py2             Jul 28, 2017
papermill-0.6.1.tar.gz            23.9 kB  Source     None            Jul 28, 2017
