Clean Jupyter notebooks for versioning

nb-clean cleans Jupyter notebooks of cell execution counts, metadata, outputs, and (optionally) empty cells, preparing them for committing to version control. It provides a Git filter to automatically clean notebooks before they're staged, and can also be used with other version control systems, as a command line tool, and as a Python library. It can also check whether a notebook is clean, which is useful as a check in continuous integration pipelines.

Warning: nb-clean 2.0.0 introduced a new command line interface to make cleaning notebooks in place easier. If you upgrade from a previous release, you'll need to migrate to the new interface as described under Migrating to nb-clean 2.

Installation

To install the latest release from PyPI, use pip:

python3 -m pip install nb-clean

Alternatively, in Python projects using Poetry or Pipenv for dependency management, add nb-clean as a development dependency with poetry add --dev nb-clean or pipenv install --dev nb-clean. nb-clean requires Python 3.7 or later.

Usage

Cleaning

To add a filter to an existing Git repository to automatically clean notebooks when they're staged, run the following from the working tree:

nb-clean add-filter

This will configure a filter to remove cell execution counts, metadata, and outputs. To also remove empty cells, use:

nb-clean add-filter --remove-empty-cells

To preserve cell metadata, which tools such as papermill rely on, use:

nb-clean add-filter --preserve-cell-metadata

To preserve cell outputs, use:

nb-clean add-filter --preserve-cell-outputs

nb-clean will configure a filter in the Git repository in which it is run, and won't mutate your global or system Git configuration. To remove the filter, run:

nb-clean remove-filter
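
To confirm that a filter is in place, you can inspect the repository's Git attributes. The sketch below assumes the filter is registered as a line of the form `*.ipynb filter=nb-clean` in `.git/info/attributes`. That location and line format are assumptions, not something stated above, and the helper name `notebook_filter_configured` is hypothetical.

```python
def notebook_filter_configured(attributes_text, filter_name="nb-clean"):
    """Return True if the attributes text maps *.ipynb to the given filter.

    Assumption: the filter is registered with a line such as
    '*.ipynb filter=nb-clean'. Adjust the pattern if your setup differs.
    """
    for line in attributes_text.splitlines():
        fields = line.split()
        # Skip blank lines and comments.
        if not fields or fields[0].startswith("#"):
            continue
        pattern, attributes = fields[0], fields[1:]
        if pattern == "*.ipynb" and f"filter={filter_name}" in attributes:
            return True
    return False
```

For example, pass it the contents of the attributes file read with pathlib.Path(".git/info/attributes").read_text(), if that file exists in your repository.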

Besides running as a Git filter, nb-clean can also clean a Jupyter notebook directly:

nb-clean clean notebook.ipynb

This cleans the notebook in place. You can also pass the notebook content on standard input, in which case the cleaned notebook is written to standard output:

nb-clean clean < original.ipynb > cleaned.ipynb

To also remove empty cells, add the -e/--remove-empty-cells flag. To preserve cell metadata, add the -m/--preserve-cell-metadata flag. To preserve cell outputs, add the -o/--preserve-cell-outputs flag.
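
To make the effect of these flags concrete, here is a minimal sketch of the transformation on a notebook loaded as a plain dictionary (for example with json.load). It is an illustration written against the nbformat 4 layout, not nb-clean's actual implementation, and the function name `clean_notebook_dict` is hypothetical.

```python
import copy


def clean_notebook_dict(
    notebook,
    remove_empty_cells=False,
    preserve_cell_metadata=False,
    preserve_cell_outputs=False,
):
    """Return a cleaned copy of a notebook dict; the input is not modified."""
    cleaned = copy.deepcopy(notebook)
    cells = cleaned.get("cells", [])

    if remove_empty_cells:
        # A cell's source may be a string or a list of strings.
        cells = [c for c in cells if "".join(c.get("source", "")).strip()]

    for cell in cells:
        if not preserve_cell_metadata:
            cell["metadata"] = {}
        if cell.get("cell_type") == "code":
            # Execution counts are always cleared.
            cell["execution_count"] = None
            if preserve_cell_outputs:
                # Keep outputs, but clear any execution count they carry.
                for output in cell.get("outputs", []):
                    if "execution_count" in output:
                        output["execution_count"] = None
            else:
                cell["outputs"] = []

    cleaned["cells"] = cells
    return cleaned
```

The three keyword arguments mirror the -e, -m, and -o flags described above.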

Checking

You can check if a notebook is clean with:

nb-clean check notebook.ipynb

or by passing the notebook contents on standard input:

nb-clean check < notebook.ipynb

To also check for empty cells, add the -e/--remove-empty-cells flag. To ignore cell metadata, add the -m/--preserve-cell-metadata flag. To ignore cell outputs, add the -o/--preserve-cell-outputs flag.

nb-clean will exit with status code 0 if the notebook is clean, and status code 1 if it is not. nb-clean will also print details of cell execution counts, metadata, outputs, and empty cells it finds.
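
In a continuous integration script, the exit status can be consumed from Python. The following is a sketch that shells out to the nb-clean check command shown above. The helper name `notebook_is_clean` and the injectable `run` parameter are illustrative choices, and treating any status other than 0 or 1 as an error is an assumption.

```python
import subprocess


def notebook_is_clean(path, run=subprocess.run):
    """Run `nb-clean check` on one notebook.

    Returns (is_clean, details), where details is whatever the command
    printed. `run` is injectable so the function can be tested without
    nb-clean installed.
    """
    result = run(
        ["nb-clean", "check", str(path)],
        capture_output=True,
        text=True,
    )
    details = (result.stdout or "") + (result.stderr or "")
    if result.returncode == 0:
        return True, details
    if result.returncode == 1:
        return False, details
    # Assumption: other statuses indicate a usage or runtime error.
    raise RuntimeError(f"nb-clean check failed ({result.returncode}): {details}")
```

A pipeline step could call notebook_is_clean on each notebook and fail the build if any call returns False.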

Migrating to nb-clean 2

The following table maps from the command line interface of nb-clean 1.6.0 to that of nb-clean 2.0.0.

| Description | nb-clean 1.6.0 | nb-clean 2.0.0 |
| --- | --- | --- |
| Clean notebook | `nb-clean clean -i/--input notebook.ipynb \| sponge notebook.ipynb` | `nb-clean clean notebook.ipynb` |
| Clean notebook (remove empty cells) | `nb-clean clean -i/--input notebook.ipynb -e/--remove-empty` | `nb-clean clean -e/--remove-empty-cells notebook.ipynb` |
| Clean notebook (preserve cell metadata) | `nb-clean clean -i/--input notebook.ipynb -m/--preserve-metadata` | `nb-clean clean -m/--preserve-cell-metadata notebook.ipynb` |
| Check notebook | `nb-clean check -i/--input notebook.ipynb` | `nb-clean check notebook.ipynb` |
| Check notebook (remove empty cells) | `nb-clean check -i/--input notebook.ipynb -e/--remove-empty` | `nb-clean check -e/--remove-empty-cells notebook.ipynb` |
| Check notebook (preserve cell metadata) | `nb-clean check -i/--input notebook.ipynb -m/--preserve-metadata` | `nb-clean check -m/--preserve-cell-metadata notebook.ipynb` |
| Add Git filter to clean notebooks | `nb-clean configure-git` | `nb-clean add-filter` |
| Remove Git filter | `nb-clean unconfigure-git` | `nb-clean remove-filter` |

Copyright

Copyright © 2017-2022 Scott Stevenson.

nb-clean is distributed under the terms of the ISC licence.
