Map Reduce for Notebooks

Project description

Papermill is a tool for parameterizing, executing, and analyzing Jupyter Notebooks.

The goals for Papermill are:

  • Parameterizing notebooks

  • Executing and collecting metrics across the notebooks

  • Summarizing collections of notebooks

Installation

pip install papermill

In-Notebook bindings

Usage

Parameterizing a Notebook

To parameterize your notebook, designate a cell with the tag parameters. At execution time, Papermill looks for the parameters cell and replaces the values defined there with the parameters passed in.
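As a rough illustration of what "a cell with the tag parameters" means on disk: Jupyter stores cell tags in each cell's metadata under a tags list. The sketch below builds a minimal tagged cell and locates it by tag. The variable names, default values, and the find_parameters_cell helper are hypothetical, and this is not Papermill's own lookup code.

```python
import json

# A minimal nbformat-4 style code cell carrying the "parameters" tag.
# The variable names and default values are illustrative assumptions.
parameters_cell = {
    "cell_type": "code",
    "execution_count": None,
    "metadata": {"tags": ["parameters"]},
    "outputs": [],
    "source": ["alpha = 0.1\n", "ratio = 0.5\n"],
}


def find_parameters_cell(cells):
    """Return the index of the first cell tagged "parameters", or None."""
    for index, cell in enumerate(cells):
        tags = cell.get("metadata", {}).get("tags", [])
        if "parameters" in tags:
            return index
    return None


# A two-cell notebook body: a markdown title followed by the tagged cell.
cells = [
    {"cell_type": "markdown", "metadata": {}, "source": ["# Demo\n"]},
    parameters_cell,
]

print(json.dumps(parameters_cell["metadata"]))
print(find_parameters_cell(cells))
```

The defaults in the tagged cell keep the notebook runnable on its own when no parameters are passed in.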

Executing a Notebook

The two ways to execute the notebook with parameters are through the Python API and through the command line interface.

Executing a Notebook via Python API

import papermill as pm

pm.execute_notebook(
   notebook_path='path/to/input.ipynb',
   output_path='path/to/output.ipynb',
   parameters=dict(alpha=0.6, ratio=0.1)
)
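Since execute_notebook takes the input path, output path, and parameters as shown above, one way to run the same notebook over several parameter combinations is to plan the calls first and execute them in a loop. This is a minimal sketch: build_runs, run_all, the output file naming scheme, and the results directory are assumptions made for illustration, not part of Papermill.

```python
from itertools import product


def build_runs(input_path, output_dir, alphas, ratios):
    """Build one execute_notebook keyword-argument dict per combination."""
    runs = []
    for alpha, ratio in product(alphas, ratios):
        # Hypothetical naming scheme: encode the parameters in the file name.
        output_path = "{}/alpha_{}_ratio_{}.ipynb".format(output_dir, alpha, ratio)
        runs.append(dict(
            notebook_path=input_path,
            output_path=output_path,
            parameters=dict(alpha=alpha, ratio=ratio),
        ))
    return runs


def run_all(runs):
    """Execute every planned run; needs Papermill installed."""
    import papermill as pm  # imported lazily so planning works without it

    for kwargs in runs:
        pm.execute_notebook(**kwargs)


runs = build_runs("path/to/input.ipynb", "path/to/results", [0.2, 0.6], [0.1])
for run in runs:
    print(run["output_path"])
```

Calling run_all(runs) would then write one executed notebook per combination into the results directory.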

Executing a Notebook via CLI

$ papermill local/input.ipynb s3://bkt/output.ipynb -p alpha 0.6 -p l1_ratio 0.1
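When the command line is driven from a script, the repeated -p name value pairs can be generated from a dictionary. The helper below is a hypothetical sketch that uses only the -p flag shown above; it builds the argument list without running anything.

```python
import shlex


def papermill_command(input_path, output_path, parameters):
    """Build the papermill argument list with one -p pair per parameter."""
    args = ["papermill", input_path, output_path]
    # Sort by name so the generated command is deterministic.
    for name, value in sorted(parameters.items()):
        args += ["-p", name, str(value)]
    return args


cmd = papermill_command(
    "local/input.ipynb",
    "s3://bkt/output.ipynb",
    {"alpha": 0.6, "l1_ratio": 0.1},
)

# Print a shell-safe rendering of the command.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The resulting list can be passed to subprocess.check_call(cmd) on a machine where the papermill executable is on the PATH.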

Recording Values to the Notebook

Users can save values to the notebook document to be consumed by other notebooks.

Record the values to be saved with the notebook:

### notebook.ipynb
import papermill as pm

pm.record("hello", "world")
pm.record("number", 123)
pm.record("some_list", [1,3,5])
pm.record("some_dict", {"a":1, "b":2})

Users can recover those values as a pandas DataFrame via the read_notebook function.

### summary.ipynb
import papermill as pm

nb = pm.read_notebook('notebook.ipynb')
nb.dataframe

Displaying Plots and Images Saved by Other Notebooks

Display a matplotlib histogram with the key name “matplotlib_hist”.

### notebook.ipynb
# Import plt and turn off interactive plotting to avoid double plotting.
import papermill as pm
import matplotlib.pyplot as plt; plt.ioff()
from ggplot import mpg

f = plt.figure()
plt.hist('cty', bins=12, data=mpg)
pm.display('matplotlib_hist', f)

Read in the notebook above and display the plot saved under the key “matplotlib_hist”.

### summary.ipynb
import papermill as pm

nb = pm.read_notebook('notebook.ipynb')
nb.display_output('matplotlib_hist')

Analyzing a Collection of Notebooks

Papermill can read in a directory of notebooks and provides the NotebookCollection interface for operating on them.

### summary.ipynb
import papermill as pm

nbs = pm.read_notebooks('/path/to/results/')

# Show named plot from 'train_1.ipynb'
# Accepts a key or list of keys to plot in order.
nbs.display_output('train_1.ipynb', 'matplotlib_hist')

# Dataframe for all notebooks in collection
nbs.dataframe.head(10)

Project details


This version

0.8.5
