Clean Jupyter notebooks for versioning
nb-clean cleans Jupyter notebooks of cell execution counts, metadata, outputs,
and (optionally) empty cells, preparing them for committing to version control.
It provides both a Git filter and pre-commit hook to automatically clean
notebooks before they're staged, and can also be used with other version control
systems, as a command line tool, and as a Python library. It can determine if a
notebook is clean or not, which can be used as a check in your continuous
integration pipelines.
nb-clean 2.0.0 introduced a new command line interface to make
cleaning notebooks in place easier. If you upgrade from a previous release,
you'll need to migrate to the new interface as described in the migration table below.
To install the latest release from PyPI, use pip:
python3 -m pip install nb-clean
nb-clean can also be installed with Conda:
conda install -c conda-forge nb-clean
In Python projects using Poetry or Pipenv for dependency management, add
nb-clean as a development dependency with
poetry add --dev nb-clean or
pipenv install --dev nb-clean.
nb-clean requires Python 3.7 or later.
You can check if a notebook is clean with:
nb-clean check notebook.ipynb
or by passing the notebook contents on standard input:
nb-clean check < notebook.ipynb
To also check for empty cells, add the
--remove-empty-cells flag. To
ignore cell metadata, add the
--preserve-cell-metadata flag. To ignore
cell outputs, add the
--preserve-cell-outputs flag.
nb-clean will exit with status code 0 if the notebook is clean, and status
code 1 if it is not.
nb-clean will also print details of cell execution
counts, metadata, outputs, and empty cells it finds.
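Because the result is reported through the exit status, the check is easy to script, for example in a continuous integration job. The following is a minimal sketch rather than part of nb-clean itself: the function name find_dirty_notebooks and the injectable command parameter are assumptions made for illustration, and the code relies only on the behaviour described above (exit status 0 for a clean notebook, 1 otherwise).

```python
import subprocess
from pathlib import Path
from typing import Iterable, List, Sequence


def find_dirty_notebooks(
    paths: Iterable[Path],
    command: Sequence[str] = ("nb-clean", "check"),
) -> List[Path]:
    """Return the notebooks for which the check command exits non-zero.

    `command` is a hypothetical parameter added so the check command can be
    swapped out (for example in tests); by default it runs
    `nb-clean check <path>` once per notebook.
    """
    dirty = []
    for path in paths:
        # Exit status 0 means clean; any other status is treated as not clean.
        result = subprocess.run([*command, str(path)], capture_output=True)
        if result.returncode != 0:
            dirty.append(path)
    return dirty
```

A CI script could call find_dirty_notebooks(sorted(Path(".").rglob("*.ipynb"))) and fail the job if the returned list is not empty.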
You can clean a Jupyter notebook with:
nb-clean clean notebook.ipynb
This cleans the notebook in place. You can also pass the notebook content on standard input, in which case the cleaned notebook is written to standard output:
nb-clean clean < original.ipynb > cleaned.ipynb
To also remove empty cells, add the
--remove-empty-cells flag. To
preserve cell metadata, add the
--preserve-cell-metadata flag. To
preserve cell outputs, add the
--preserve-cell-outputs flag.
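To make the effect of cleaning concrete, here is a rough sketch of what it means for a notebook stored as nbformat 4 JSON. This is not nb-clean's implementation: the function name clean_notebook_sketch is hypothetical, its keyword arguments simply mirror the command line flags, and treating a cell with no source text as empty is an assumption of this sketch.

```python
import copy
from typing import Any, Dict


def clean_notebook_sketch(
    notebook: Dict[str, Any],
    remove_empty_cells: bool = False,
    preserve_cell_metadata: bool = False,
    preserve_cell_outputs: bool = False,
) -> Dict[str, Any]:
    """Return a cleaned copy of a notebook parsed from nbformat 4 JSON."""
    cleaned = copy.deepcopy(notebook)  # leave the caller's notebook untouched
    if remove_empty_cells:
        # Assumption: a cell is empty when its source contains no text.
        # The source may be a string or a list of strings; join handles both.
        cleaned["cells"] = [
            cell for cell in cleaned["cells"] if "".join(cell["source"])
        ]
    for cell in cleaned["cells"]:
        if not preserve_cell_metadata:
            cell["metadata"] = {}
        if cell["cell_type"] == "code" and not preserve_cell_outputs:
            # Only code cells carry execution counts and outputs.
            cell["execution_count"] = None
            cell["outputs"] = []
    return cleaned
```

A notebook can be read with json.load, passed through this function, and written back with json.dump. For real use, prefer nb-clean clean, which is the supported tool.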
Cleaning (Git filter)
To add a filter to an existing Git repository to automatically clean notebooks when they're staged, run the following from the working tree:
nb-clean add-filter
This will configure a filter to remove cell execution counts, metadata, and outputs. To also remove empty cells, use:
nb-clean add-filter --remove-empty-cells
To preserve cell metadata, for example the metadata required by tools such as papermill, use:
nb-clean add-filter --preserve-cell-metadata
To preserve cell outputs, use:
nb-clean add-filter --preserve-cell-outputs
nb-clean will configure a filter in the Git repository in which it is run, and
won't mutate your global or system Git configuration. To remove the filter, run:
nb-clean remove-filter
Cleaning (pre-commit hook)
nb-clean can also be used as a pre-commit hook. You may prefer this to the
Git filter if your project already uses the pre-commit framework.
Note that the Git filter and pre-commit hook work differently, with different effects on your working directory. The pre-commit hook operates on the notebook on disk, cleaning the copy in your working directory. The Git filter cleans notebooks as they are added to the index, leaving the copy in your working directory dirty. This means cell outputs are still visible to you in your local Jupyter instance when using the Git filter, but not when using the pre-commit hook.
After installing pre-commit, add the
nb-clean hook by adding the following to
.pre-commit-config.yaml in the root of your repository:
repos:
  - repo: https://github.com/srstevenson/nb-clean
    rev: "2.3.0"
    hooks:
      - id: nb-clean
You can pass additional arguments to
nb-clean, such as
--remove-empty-cells, with an
args array as follows:
repos:
  - repo: https://github.com/srstevenson/nb-clean
    rev: "2.3.0"
    hooks:
      - id: nb-clean
        args:
          - --remove-empty-cells
Run
pre-commit install to ensure the hook is installed, and
pre-commit autoupdate to update the hook to the latest release of
nb-clean.
The following table maps from the command line interface of
nb-clean 1.6.0 to
that of nb-clean 2.0.0 and later.
|Clean notebook (remove empty cells)||
|Clean notebook (preserve cell metadata)||
|Clean notebook (preserve cell outputs)||
|Check notebook (remove empty cells)||
|Check notebook (preserve cell metadata)||
|Check notebook (preserve cell outputs)||
|Add Git filter to clean notebooks||
|Remove Git filter||
Copyright © 2017-2022 Scott Stevenson.
nb-clean is distributed under the terms of the ISC licence.