Map Reduce for Notebooks

Project description

Papermill is a tool for parameterizing, executing, and analyzing Jupyter Notebooks.

The goals for Papermill are:

  • Parameterizing notebooks

  • Executing and collecting metrics across the notebooks

  • Summarizing collections of notebooks

Installation

pip install papermill

Usage

Parameterizing a notebook.

### template.ipynb
# This cell has a "parameters" tag. These values will be overwritten by Papermill.
alpha = 0.5
ratio = 0.1
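
A cell tag lives in the notebook file itself, in the cell's `metadata.tags` list. The following is a minimal sketch, using only the standard library, for checking which cell of a notebook carries a given tag before handing the notebook to Papermill. The helper name `find_tagged_cells`, the inline `TEMPLATE_JSON` notebook, and the `TAG` value are hypothetical and only serve the illustration; this is not how Papermill itself locates the cell.

```python
import json


def find_tagged_cells(notebook, tag):
    """Return the indices of cells whose metadata.tags list contains `tag`."""
    return [
        index
        for index, cell in enumerate(notebook.get("cells", []))
        if tag in cell.get("metadata", {}).get("tags", [])
    ]


# Hypothetical stand-in for the JSON content of template.ipynb.
TEMPLATE_JSON = """
{
  "nbformat": 4,
  "nbformat_minor": 2,
  "metadata": {},
  "cells": [
    {"cell_type": "markdown", "metadata": {}, "source": ["# Template"]},
    {"cell_type": "code", "metadata": {"tags": ["parameters"]},
     "execution_count": null, "outputs": [],
     "source": ["alpha = 0.5\\n", "ratio = 0.1"]}
  ]
}
"""

# Assumption: replace TAG with the tag applied to the cell in your template.
TAG = "parameters"

notebook = json.loads(TEMPLATE_JSON)
tagged = find_tagged_cells(notebook, TAG)
print("Cells tagged {!r}: {}".format(TAG, tagged))
```

For a notebook on disk, `json.load(open("template.ipynb"))` would replace the inline string. An empty result means no cell carries the tag, so there is nothing for the supplied values to override.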

Recording values to be saved with the notebook.

### template.ipynb
import random
import papermill as pm

rand_value = random.randint(1, 10)
pm.record("random_value", rand_value)
pm.record("foo", "bar")

Displaying outputs to be saved with the notebook.

### template.ipynb
# Import plt and turn off interactive plotting to avoid double plotting.
import papermill as pm
import matplotlib.pyplot as plt
plt.ioff()
from ggplot import mpg

f = plt.figure()
plt.hist('cty', bins=12, data=mpg)
pm.display('matplotlib_hist', f)

Executing a parameterized Jupyter notebook

import papermill as pm

pm.execute_notebook(
    notebook="template.ipynb",
    output="output.ipynb",
    params=dict(alpha=0.1, ratio=0.001)
)
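
To run the same template over several parameter combinations (the "map" step), the single call above can be wrapped in a loop. The sketch below is one possible way to do that: `build_jobs`, `run_jobs`, the `results` directory, and the `outputN.ipynb` naming are hypothetical choices, and the keyword names `notebook`, `output`, and `params` simply mirror the call shown above, so they should be checked against the installed Papermill version.

```python
import itertools


def build_jobs(grid, template="template.ipynb", output_dir="results"):
    """Expand a dict of parameter lists into one job per combination."""
    names = sorted(grid)
    combinations = itertools.product(*(grid[name] for name in names))
    jobs = []
    for index, values in enumerate(combinations, start=1):
        jobs.append({
            "notebook": template,
            "output": "{}/output{}.ipynb".format(output_dir, index),
            "params": dict(zip(names, values)),
        })
    return jobs


def run_jobs(jobs):
    """Execute every job; requires Papermill and the template notebook."""
    import papermill as pm  # imported lazily so build_jobs works without it

    for job in jobs:
        # Same keyword arguments as the execute_notebook call shown above.
        pm.execute_notebook(**job)


jobs = build_jobs({"alpha": [0.1, 0.5], "ratio": [0.001, 0.1]})
for job in jobs:
    print(job["output"], job["params"])

# run_jobs(jobs)  # uncomment once template.ipynb and results/ exist
```

Keeping job construction separate from execution makes it possible to inspect the planned runs before any notebook is executed.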

Analyzing a single notebook

### summary.ipynb
import papermill as pm

nb = pm.read_notebook('output.ipynb')
nb.dataframe.head()

# Show named plot from 'output.ipynb'
nb.display_output('matplotlib_hist')

Analyzing a collection of notebooks

### summary.ipynb
import papermill as pm

nbs = pm.read_notebooks('/path/to/results/')

# Show named plot from 'output1.ipynb'
nbs.display_output('output1.ipynb', 'matplotlib_hist')

# Dataframe for all notebooks in collection
df = nbs.dataframe
df.head()

# Show histograms from notebooks with the highest random value.
pivoted_df = df.pivot(index='key', columns='name', values='value').sort_values(by='random_value', ascending=False)
pivoted_df.head()

nbs.display_output(pivoted_df[:3], 'matplotlib_hist')
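
The pivot step turns one row per recorded value into one row per notebook, which can then be ranked by a recorded metric. The sketch below reproduces that idea in plain Python so it can be followed without pandas. It assumes the collection dataframe has the columns `key` (notebook file name), `name` (recorded name), and `value`, as implied by the pivot call above; the sample rows and the helpers `pivot_records` and `top_notebooks` are hypothetical.

```python
from collections import defaultdict


def pivot_records(rows):
    """Group long-format rows into {notebook key: {record name: value}}."""
    table = defaultdict(dict)
    for row in rows:
        table[row["key"]][row["name"]] = row["value"]
    return dict(table)


def top_notebooks(rows, metric, n=3):
    """Return the n notebook keys with the highest value for `metric`."""
    table = pivot_records(rows)
    ranked = sorted(
        (key for key in table if metric in table[key]),
        key=lambda key: table[key][metric],
        reverse=True,
    )
    return ranked[:n]


# Hypothetical rows, shaped like the assumed key/name/value dataframe.
rows = [
    {"key": "output1.ipynb", "name": "random_value", "value": 4},
    {"key": "output1.ipynb", "name": "foo", "value": "bar"},
    {"key": "output2.ipynb", "name": "random_value", "value": 9},
    {"key": "output2.ipynb", "name": "foo", "value": "bar"},
    {"key": "output3.ipynb", "name": "random_value", "value": 7},
    {"key": "output4.ipynb", "name": "random_value", "value": 1},
]

best = top_notebooks(rows, "random_value", n=3)
print(best)
```

The resulting list of notebook names plays the role of the top rows of the pivoted dataframe: the notebooks whose histograms are worth displaying first.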
