A set of tools for nb handling

nb_helpers

A simple tool to clean, test and fix notebooks for your repo

Install

You can install from PyPI:

pip install nb_helpers

or install the latest version in editable mode from a local clone of the repo:

pip install -e .

Usage

This little library gives you command line tools to clean, test and check your Jupyter notebooks.

  • Clean: nb_helpers.clean_nbs strips the metadata from your notebooks, which helps prevent git conflicts. Pass the --clear_outs flag to also remove cell outputs.
$ nb_helpers.clean_nbs --help
usage: nb_helpers.clean_nbs [-h] [--path PATH] [--clear_outs] [--verbose]

Clean notebooks on `path` from useless metadata

options:
  -h, --help    show this help message and exit
  --path PATH   The path to notebooks (default: .)
  --clear_outs  Remove cell outputs (default: False)
  --verbose     Run on verbose mode (default: False)

You can run this command on this repo:

$ nb_helpers.clean_nbs
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
┃ Notebook Path                                   ┃ Status ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
│ tests/data/dummy_folder/fail_nb.ipynb           │ Ok✔    │
│ tests/data/dummy_folder/test_nb2.ipynb          │ Ok✔    │
│ tests/data/dummy_folder/test_nb_all_slow.ipynb  │ Ok✔    │
│ tests/data/dummy_folder/test_nb_some_slow.ipynb │ Ok✔    │
│ tests/data/features_nb.ipynb                    │ Ok✔    │
│ tests/data/test_nb.ipynb                        │ Ok✔    │
└─────────────────────────────────────────────────┴────────┘
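
To make the idea of stripping metadata concrete, here is a minimal sketch that works on the raw notebook JSON with the standard library only. It is not the code nb_helpers runs: the function name strip_notebook, and the choice of which fields count as useless (everything except kernelspec, plus per-cell metadata and execution counts), are assumptions made for illustration.

```python
import json


def strip_notebook(nb: dict, clear_outs: bool = False) -> dict:
    """Return a cleaned copy of a notebook dict (hypothetical helper)."""
    # A JSON round trip is a cheap deep copy for JSON-only data.
    cleaned = json.loads(json.dumps(nb))

    # Assumption: keep only the kernel spec at notebook level.
    meta = cleaned.get("metadata", {})
    cleaned["metadata"] = {k: v for k, v in meta.items() if k == "kernelspec"}

    for cell in cleaned.get("cells", []):
        # Per-cell metadata (ids, collapsed state, ...) changes on every save.
        cell["metadata"] = {}
        if cell.get("cell_type") == "code":
            # Execution counters are a common source of noisy diffs.
            cell["execution_count"] = None
            if clear_outs:
                cell["outputs"] = []
    return cleaned
```

A notebook file could be loaded with json.load, passed through this function, and written back with json.dump. For real repos, prefer the nb_helpers.clean_nbs command itself.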
  • Run: nb_helpers.run_nbs runs the notebooks in path and reports the status and run time of each one.
$ nb_helpers.run_nbs --help
usage: nb_helpers.run_nbs [-h] [--verbose] [--lib_name LIB_NAME] [--no_run] [--pip_install] [--github_issue] [--repo REPO] [--owner OWNER] [path]

positional arguments:
  path                 A path to nb files (default: /Users/tcapelle/wandb/nb_helpers)

options:
  -h, --help           show this help message and exit
  --verbose            Print errors along the way (default: False)
  --lib_name LIB_NAME  Python lib names to filter, eg: tensorflow
  --no_run             Do not run any notebook (default: False)
  --pip_install        Run cells with !pip install (default: False)
  --github_issue       Create a github issue if notebook fails (default: False)
  --repo REPO          Github repo to create issue in (default: nb_helpers)
  --owner OWNER        Github owner to create issue in (default: wandb)

When a notebook fails, you can have a GitHub issue created automatically with --github_issue. The issue does not have to land in the repo that holds the notebooks: pass --owner and --repo to target another one (for example, --owner wandb --repo other_cool_repo).

You get the following output inside this repo:

$ nb_helpers.run_nbs
CONSOLE.is_terminal(): True
Writing output to run.csv
Notebook Path                                    Status   Run Time  colab
dev_nbs/search.ipynb                             Fail     1 s       open
tests/data/dummy_folder/fail_nb.ipynb            Fail     1 s       open
tests/data/dummy_folder/test_nb2.ipynb           Ok       0 s       open
tests/data/dummy_folder/test_nb_all_slow.ipynb   Skipped  0 s       open
tests/data/dummy_folder/test_nb_some_slow.ipynb  Ok       0 s       open
tests/data/features_nb.ipynb                     Ok       0 s       open
tests/data/test_nb.ipynb                         Ok       0 s       open
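
If you call the command from a script (for example in CI), it can help to build the argument list in one place. The sketch below only uses flags listed in the help output above; the helper name build_run_cmd is hypothetical and not part of nb_helpers.

```python
from typing import List, Optional


def build_run_cmd(
    path: str = ".",
    verbose: bool = False,
    github_issue: bool = False,
    owner: Optional[str] = None,
    repo: Optional[str] = None,
) -> List[str]:
    """Assemble an nb_helpers.run_nbs command line (hypothetical helper)."""
    cmd = ["nb_helpers.run_nbs", path]
    if verbose:
        cmd.append("--verbose")
    if github_issue:
        cmd.append("--github_issue")
        # --owner and --repo only matter when an issue is going to be created.
        if owner:
            cmd += ["--owner", owner]
        if repo:
            cmd += ["--repo", repo]
    return cmd


# Example: report failures to wandb/other_cool_repo.
example_cmd = build_run_cmd(
    "examples", github_issue=True, owner="wandb", repo="other_cool_repo"
)
```

The resulting list can be passed to subprocess.run on a machine where nb_helpers is installed.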
  • Summary: You can get a summary of the notebooks in your project with the nb_helpers.summary_nbs command.
$ nb_helpers.summary_nbs
CONSOLE.is_terminal(): True
Writing output to /Users/tcapelle/wandb/nb_helpers/logs/summary.csv
Reading 6 notebooks
┌───┬─────────────────────────────────────────────────┬────────────┬────────────────┬────────────────────────────────────────────────┬────────────┬───────┐
│ # │ nb name                                         │ tracker    │ wandb features │ python libs                                    │ colab_cell │ colab │
├───┼─────────────────────────────────────────────────┼────────────┼────────────────┼────────────────────────────────────────────────┼────────────┼───────┤
│ 1 │ tests/data/dummy_folder/fail_nb.ipynb           │            │                │                                                │            │ open  │
│ 2 │ tests/data/dummy_folder/test_nb2.ipynb          │            │                │                                                │            │ open  │
│ 3 │ tests/data/dummy_folder/test_nb_all_slow.ipynb  │            │                │ time                                           │            │ open  │
│ 4 │ tests/data/dummy_folder/test_nb_some_slow.ipynb │            │                │ time                                           │            │ open  │
│ 5 │ tests/data/features_nb.ipynb                    │            │                │ typing, itertools                              │            │ open  │
│ 6 │ tests/data/test_nb.ipynb                        │ 0: tracker │                │ os, sys, logging, pathlib, fastcore, itertools │ 1          │ open  │
└───┴─────────────────────────────────────────────────┴────────────┴────────────────┴────────────────────────────────────────────────┴────────────┴───────┘
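
The python libs column lists the libraries each notebook imports. As a rough illustration of how such a list can be collected, the sketch below walks the code cells of a notebook dict with the standard ast module. This is an assumption about the general technique, not the implementation nb_helpers uses, and the name imported_libs is hypothetical.

```python
import ast
from typing import Dict, List


def imported_libs(nb: Dict) -> List[str]:
    """List top-level packages imported by a notebook dict (hypothetical helper)."""
    libs: List[str] = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):  # .ipynb files store source as a list of lines
            source = "".join(source)
        # Drop shell escapes and IPython magics, which are not valid Python.
        code = "\n".join(
            line
            for line in source.splitlines()
            if not line.lstrip().startswith(("!", "%"))
        )
        try:
            tree = ast.parse(code)
        except SyntaxError:
            continue  # skip cells that still cannot be parsed
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                names = [alias.name for alias in node.names]
            elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
                names = [node.module]
            else:
                continue
            for name in names:
                top = name.split(".")[0]  # "os.path" counts as "os"
                if top not in libs:
                    libs.append(top)
    return libs
```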

Python Lib

All these functions can also be used from Python:

from pathlib import Path
from nb_helpers.run import run_nbs

examples_path = Path("examples/colabs")

errors = run_nbs(path=examples_path, verbose=True, timeout=600)

The library also has many small functions to make your life easier inside the repo you are orchestrating:

from pathlib import Path
from nb_helpers.utils import *
from nb_helpers.colab import *

examples_path = Path("examples/colabs")

# get all nbs in the folder recursively, filtering out hidden and non-notebook files
nb_files = find_nbs(examples_path)

one_nb_path = nb_files[0]
notebook = read_nb(one_nb_path)

# get all libs imported
libs = detect_imported_libs(notebook)

# get remote github repo
github_repo = git_origin_repo(one_nb_path)

# detect whether the default branch is called main or master
master_name = git_main_name(one_nb_path)

# get colab link for that branch
colab_url = get_colab_url(one_nb_path, master_name)
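
For reference, Colab opens notebooks hosted on GitHub through a URL made of the owner, repo, branch and notebook path. The sketch below builds such a link by hand; the function name build_colab_url is hypothetical, it is not how get_colab_url is implemented, and it assumes the notebook path is given relative to the repo root.

```python
from pathlib import PurePosixPath
from urllib.parse import quote


def build_colab_url(owner: str, repo: str, branch: str, nb_path: str) -> str:
    """Build an "Open in Colab" link for a notebook on GitHub (hypothetical helper)."""
    # Assumption: nb_path is relative to the repository root.
    rel_path = PurePosixPath(nb_path).as_posix()
    # quote() escapes spaces and similar characters but keeps "/" separators.
    return (
        "https://colab.research.google.com/github/"
        f"{owner}/{repo}/blob/{branch}/{quote(rel_path)}"
    )


example_url = build_colab_url("wandb", "nb_helpers", "main", "tests/data/test_nb.ipynb")
```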
