Skip to main content

Map Reduce for Notebooks

Project description

Papermill is a tool for parameterizing, executing, and analyzing Jupyter Notebooks.

The goals for Papermill are:

  • Parametrizing notebooks

  • Executing and collecting metrics across the notebooks

  • Summarizing collections of notebooks

Installation

pip install papermill

Usage

Parameterizing a notebook.

### template.ipynb
# This cell has a "parameters" tag. These values will be overwritten by Papermill.
alpha = 0.5
ratio = 0.1

Recording values to be saved with the notebook.

### template.ipynb
import random
import papermill as pm

rand_value = random.randint(1, 10)
pm.record("random_value", rand_value)
pm.record("foo", "bar")

Displaying outputs to be saved with the notebook.

### template.ipynb
# Import plt and turn off interactive plotting to avoid double plotting.
import papermill as pm
import matplotlib.pyplot as plt; plt.ioff()
from ggplot import mpg

f = plt.figure()
plt.hist('cty', bins=12, data=mpg)
pm.display('matplotlib_hist', f)

Executing a parameterized Jupyter notebook

import papermill as pm

pm.execute_notebook(
    notebook="template.ipynb",
    output="output.ipynb",
    params=dict(alpha=0.1, ratio=0.001)
)

Analyzing a single notebook

### summary.ipynb
import papermill as pm

nb = pm.read_notebook('output.ipynb')
nb.dataframe.head()

# Show named plot from 'output.ipynb'
nb.display_output('matplotlib_hist')

Analyzing a collection of notebooks

### summary.ipynb
import papermill as pm

nbs = pm.read_notebooks('/path/to/results/')

# Show named plot from 'output1.ipynb'
nbs.display_output('output1.ipynb', 'matplotlib_hist')

# Dataframe for all notebooks in collection
df = nbs.dataframe
df.head()

# Show histograms from notebooks with the highest random value.
pivoted_df = df.pivot('key', 'name', 'value').sort_values(by='name')
pivoted_df.head()

nbs.display_output(pivoted_df[:3], 'matplotlib_hist')

Project details


Release history Release notifications | RSS feed

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

papermill-0.6.tar.gz (23.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

papermill-0.6-py2-none-any.whl (10.3 kB view details)

Uploaded Python 2

File details

Details for the file papermill-0.6.tar.gz.

File metadata

  • Download URL: papermill-0.6.tar.gz
  • Upload date:
  • Size: 23.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for papermill-0.6.tar.gz
Algorithm Hash digest
SHA256 971e5e701d71ff2296423cb66f54b5c10d9effda75cf27e651c4121491de0780
MD5 da5a6679ba0aae2d476808afe96f54a2
BLAKE2b-256 b000ca74d7b2060783ab17fd204e709c44787a953917bdbc6b390d2f45abdc4b

See more details on using hashes here.

File details

Details for the file papermill-0.6-py2-none-any.whl.

File metadata

File hashes

Hashes for papermill-0.6-py2-none-any.whl
Algorithm Hash digest
SHA256 47a943d641dd6c5bee8aa53a9d1ff5260f6893a9acb5d5e97b8b1d5c7a3b0b33
MD5 24318808a113d9f0402cd15b9930c7a7
BLAKE2b-256 e0e8540ed5c0f4f68dbcc01b0f35d38854af44934e675db7fb2515299cc23094

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page