Map Reduce for Notebooks

Project description

Papermill is a tool for parameterizing, executing, and analyzing Jupyter Notebooks.

The goals for Papermill are:

  • Parameterizing notebooks

  • Executing and collecting metrics across the notebooks

  • Summarizing collections of notebooks

Installation

pip install papermill

Usage

Parameterizing a notebook.

### template.ipynb
# This cell has a "parameters" tag. These values will be overwritten by Papermill.
alpha = 0.5
ratio = 0.1
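
To make the tagging idea concrete, the sketch below works directly on a notebook's JSON structure: it finds the first code cell carrying a given tag (here `parameters`) and places a new cell with the overriding assignments right after it. This is an illustration under stated assumptions, not Papermill's actual implementation; the helper name `inject_parameters` and the minimal notebook dictionary are hypothetical.

```python
import copy
import json


def inject_parameters(notebook, params, tag="parameters"):
    """Return a copy of `notebook` with an override cell after the tagged cell.

    Hypothetical helper for illustration; not part of Papermill.
    """
    nb = copy.deepcopy(notebook)
    for index, cell in enumerate(nb["cells"]):
        tags = cell.get("metadata", {}).get("tags", [])
        if cell.get("cell_type") == "code" and tag in tags:
            break
    else:
        raise ValueError("no code cell tagged %r" % tag)

    # One assignment per parameter, sorted by name for a stable result.
    source = "\n".join(
        "%s = %r" % (name, value) for name, value in sorted(params.items())
    )
    override = {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": source,
    }
    nb["cells"].insert(index + 1, override)
    return nb


# Minimal stand-in for template.ipynb (assumed structure, notebook format v4).
template = {
    "nbformat": 4,
    "nbformat_minor": 2,
    "metadata": {},
    "cells": [
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {"tags": ["parameters"]},
            "outputs": [],
            "source": "alpha = 0.5\nratio = 0.1",
        }
    ],
}

result = inject_parameters(template, {"alpha": 0.1, "ratio": 0.001})
print(json.dumps(result["cells"], indent=2))
```

Because the override cell runs after the tagged cell, the defaults remain visible in the notebook while the new values take effect for every later cell.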

Recording values to be saved with the notebook.

### template.ipynb
import random
import papermill as pm

rand_value = random.randint(1, 10)
pm.record("random_value", rand_value)
pm.record("foo", "bar")

Displaying outputs to be saved with the notebook.

### template.ipynb
# Import plt and turn off interactive plotting to avoid double plotting.
import papermill as pm
import matplotlib.pyplot as plt; plt.ioff()
from ggplot import mpg

f = plt.figure()
plt.hist('cty', bins=12, data=mpg)
pm.display('matplotlib_hist', f)

Executing a parameterized Jupyter notebook

import papermill as pm

pm.execute_notebook(
    notebook="template.ipynb",
    output="output.ipynb",
    params=dict(alpha=0.1, ratio=0.001)
)
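
Running one template over many parameter sets is the "map" half of the workflow. The sketch below only builds the list of jobs for a small parameter grid; it does not call Papermill itself. The helper `build_jobs`, the grid values, and the `output{index}.ipynb` naming pattern are assumptions made for illustration, while the keyword names `notebook`, `output`, and `params` mirror the call shown above.

```python
import itertools


def build_jobs(grid, template="template.ipynb", output_pattern="output{index}.ipynb"):
    """Expand a dict of parameter lists into one job per combination.

    Hypothetical helper for illustration; not part of Papermill.
    """
    names = sorted(grid)
    combos = itertools.product(*(grid[name] for name in names))
    jobs = []
    for index, values in enumerate(combos, start=1):
        jobs.append({
            "notebook": template,
            "output": output_pattern.format(index=index),
            "params": dict(zip(names, values)),
        })
    return jobs


jobs = build_jobs({"alpha": [0.1, 0.5], "ratio": [0.001, 0.1]})
for job in jobs:
    # With Papermill installed, each job could be handed to the call above,
    # for example: pm.execute_notebook(**job)
    print(job["output"], job["params"])
```

Writing every run to its own output file keeps the results separate, so they can later be gathered from a single directory.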

Analyzing a single notebook

### summary.ipynb
from papermill import Notebook

nb = Notebook.read('output.ipynb')
nb.dataframe.head()

# Show named plot from 'output.ipynb'
nb.display_output('matplotlib_hist')

Analyzing a collection of notebooks

### summary.ipynb
from papermill import NotebookCollection

nbs = NotebookCollection.from_directory('/path/to/results/')

# Show named plot from 'output1.ipynb'
nbs.display_output('output1.ipynb', 'matplotlib_hist')

# Dataframe for all notebooks in collection
df = nbs.dataframe
df.head()

# Show histograms from notebooks with the highest random value.
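# Pivot to one row per notebook and one column per recorded name, highest random_value first.
pivoted_df = df.pivot(index='key', columns='name', values='value').sort_values(by='random_value', ascending=False)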
pivoted_df.head()

nbs.display_output(pivoted_df[:3], 'matplotlib_hist')
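
The pivot step can be hard to picture without data. The sketch below reproduces the same reshaping in plain Python: long-form rows with `key`, `name`, and `value` fields (the column names used in the pivot call above) become one entry per notebook, and the notebooks are then ranked by a recorded value. The sample records and the helpers `pivot` and `top_keys` are made up for illustration and are not part of Papermill.

```python
from collections import defaultdict

# Made-up long-form records: one row per (notebook, recorded name).
records = [
    {"key": "output1.ipynb", "name": "random_value", "value": 3},
    {"key": "output1.ipynb", "name": "foo", "value": "bar"},
    {"key": "output2.ipynb", "name": "random_value", "value": 9},
    {"key": "output2.ipynb", "name": "foo", "value": "bar"},
    {"key": "output3.ipynb", "name": "random_value", "value": 6},
    {"key": "output3.ipynb", "name": "foo", "value": "bar"},
]


def pivot(rows):
    """Group rows by notebook key, mapping each recorded name to its value."""
    table = defaultdict(dict)
    for row in rows:
        table[row["key"]][row["name"]] = row["value"]
    return dict(table)


def top_keys(table, column, count):
    """Return the `count` notebook keys with the largest value in `column`."""
    ranked = sorted(table, key=lambda key: table[key][column], reverse=True)
    return ranked[:count]


table = pivot(records)
best = top_keys(table, "random_value", 2)
print(best)
```

The resulting list of notebook names plays the role of the top rows of the pivoted dataframe: it identifies which notebooks' outputs to display.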

Download files

Source Distribution

papermill-0.5.tar.gz (23.5 kB view details)

Uploaded Source

Built Distribution

papermill-0.5-py2-none-any.whl (10.3 kB view details)

Uploaded Python 2

File details

Details for the file papermill-0.5.tar.gz.

File metadata

  • Download URL: papermill-0.5.tar.gz
  • Upload date:
  • Size: 23.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for papermill-0.5.tar.gz
Algorithm Hash digest
SHA256 5629e18ff9fc719d6378f18fc23fcbb48631facea852b2f897db3122dd399fa6
MD5 ec138b0cb21e4fa98d6471fe4714a160
BLAKE2b-256 6c3c1e455534dc8346f79b2e54ff4197a8e8608f1b95e48f765612025d53b97d

File details

Details for the file papermill-0.5-py2-none-any.whl.

File hashes

Hashes for papermill-0.5-py2-none-any.whl
Algorithm Hash digest
SHA256 89ceb73ebbc170cf6ff52af77e94509061e33af38d87fa3b80aff2a4c2d67aa7
MD5 159355482b623899fc82be5eb9594421
BLAKE2b-256 13e13de1a2a49457bea309bc629c8ade54115858ccbecf2f66f31b41fe43b1fe
