Clean Jupyter notebooks. Remove metadata and execution counts.

nbmetaclean

A collection of Python scripts for checking and cleaning Jupyter Notebook metadata, execution_count and, optionally, outputs. It can be used as a command line tool or as a pre-commit hook.

Pure Python, no dependencies.

nbmetaclean

Clean Jupyter Notebooks metadata, execution_count and optionally output.

nbcheck

Check Jupyter Notebooks for errors and/or warnings in outputs.

Basic usage

Pre-commit hook

Nbmetaclean can be used as a pre-commit hook with pre-commit. You do not need to install nbmetaclean; it will be installed automatically. Add the following to .pre-commit-config.yaml:

repos:
    - repo: https://github.com/ayasyrev/nbmetaclean
      rev: 0.1.1
      hooks:
        - id: nbmetaclean
        - id: nbcheck
          args: [ --ec, --err, --warn ]

Command line tool

Without install:

If you use the uv package manager, you can run nbmetaclean without installing it. To clean notebooks:

uvx nbmetaclean

To check notebooks:

uvx --from nbmetaclean nbcheck --ec --err --warn

Install:

pip install nbmetaclean

Usage: run the nbmetaclean or nbcheck command with a path to a notebook or to a folder with notebooks. If no path is provided, the current directory is used.

The nbclean command can be used instead of nbmetaclean. nbmetaclean is the default command name because it is convenient to use with uvx.

nbmetaclean

nbcheck should be run with flags:

  • --ec to check execution_count
  • --err to check for errors in outputs
  • --warn to check for warnings in outputs

nbcheck --ec --err --warn

Nbmetaclean

Default settings

By default, the following settings are used:

  • Clean notebook metadata, except authors and language_info / name.
  • Clear execution_count in cells.
  • Preserve cell metadata.
  • Preserve cell outputs.
  • After a notebook is cleaned, the file timestamps are restored to their previous values.
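
To make these defaults concrete, here is a minimal Python sketch of the same steps applied to a notebook loaded as plain JSON. It is an illustration only, not nbmetaclean's actual implementation: the names clean_notebook and clean_file are hypothetical, and resetting execution_count inside execute_result outputs is an assumption about what the cleaning covers.

```python
import json
import os


def clean_notebook(nb: dict) -> dict:
    """Return a cleaned copy of a notebook dict (hypothetical helper)."""
    old_meta = nb.get("metadata", {})
    new_meta = {}
    # Keep only `authors` and `language_info.name` from the notebook metadata.
    if "authors" in old_meta:
        new_meta["authors"] = old_meta["authors"]
    name = old_meta.get("language_info", {}).get("name")
    if name is not None:
        new_meta["language_info"] = {"name": name}

    cells = []
    for cell in nb.get("cells", []):
        cell = dict(cell)  # shallow copy; cell metadata is left as-is
        if cell.get("cell_type") == "code":
            cell["execution_count"] = None
            # Outputs are kept; only their execution_count is reset (assumption).
            cell["outputs"] = [
                {**out, "execution_count": None} if "execution_count" in out else out
                for out in cell.get("outputs", [])
            ]
        cells.append(cell)
    return {**nb, "metadata": new_meta, "cells": cells}


def clean_file(path: str) -> None:
    """Clean a notebook file in place and restore its previous timestamps."""
    stat = os.stat(path)
    with open(path, encoding="utf-8") as f:
        nb = json.load(f)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(clean_notebook(nb), f, indent=1)
    # Put back the access and modification times recorded before writing.
    os.utime(path, ns=(stat.st_atime_ns, stat.st_mtime_ns))
```

Restoring the timestamps means a cleaned notebook does not look newly modified to tools that rely on file modification time.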

Arguments

Check available arguments:

nbmetaclean -h

usage: nbmetaclean [-h] [-s] [--not_ec] [--not-pt] [--dont_clear_nb_metadata] [--clear_cell_metadata] [--clear_outputs]
[--nb_metadata_preserve_mask NB_METADATA_PRESERVE_MASK [NB_METADATA_PRESERVE_MASK ...]]
[--cell_metadata_preserve_mask CELL_METADATA_PRESERVE_MASK [CELL_METADATA_PRESERVE_MASK ...]] [--dont_merge_masks] [--clean_hidden_nbs] [-D] [-V]
[path ...]

Clean metadata and execution_count from Jupyter notebooks.

positional arguments:
  path                  Path for nb or folder with notebooks.

options:
  -h, --help            show this help message and exit
  -s, --silent          Silent mode.
  --not_ec              Do not clear execution_count.
  --not-pt              Do not preserve timestamp.
  --dont_clear_nb_metadata
                        Do not clear notebook metadata.
  --clear_cell_metadata
                        Clear cell metadata.
  --clear_outputs       Clear outputs.
  --nb_metadata_preserve_mask NB_METADATA_PRESERVE_MASK [NB_METADATA_PRESERVE_MASK ...]
                        Preserve mask for notebook metadata.
  --cell_metadata_preserve_mask CELL_METADATA_PRESERVE_MASK [CELL_METADATA_PRESERVE_MASK ...]
                        Preserve mask for cell metadata.
  --dont_merge_masks    Do not merge masks.
  --clean_hidden_nbs    Clean hidden notebooks.
  -D, --dry_run         perform a trial run, don't write results
  -V, --verbose         Verbose mode. Print extra information.

Execution_count

If you want to keep execution_count, add the --not_ec flag on the command line or an args: [ --not_ec ] line to .pre-commit-config.yaml.

repos:
    - repo: https://github.com/ayasyrev/nbmetaclean
      rev: 0.1.1
      hooks:
        - id: nbmetaclean
          args: [ --not_ec ]

nbmetaclean --not_ec

Clear outputs

If you want to clear outputs, add the --clear_outputs flag on the command line or an args: [ --clear_outputs ] line to .pre-commit-config.yaml.

repos:
    - repo: https://github.com/ayasyrev/nbmetaclean
      rev: 0.1.1
      hooks:
        - id: nbmetaclean
          args: [ --clear_outputs ]

nbmetaclean --clear_outputs

Nbcheck

Check Jupyter Notebooks for correct execution_count and for errors and/or warnings in outputs.

Execution_count

Check that all code cells were executed one after another.

Strict mode

By default, the execution_count check runs in strict mode: all cells must be executed, one after another, with each execution_count exactly one more than the previous.

pre-commit config example:

repos:
    - repo: https://github.com/ayasyrev/nbmetaclean
      rev: 0.1.1
      hooks:
        - id: nbcheck
          args: [ --ec ]

command line example:

nbcheck --ec
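
As a rough illustration of what the strict check means, the sketch below walks the code cells of a notebook dict and requires each execution_count to be exactly one more than the previous. It is not nbcheck's actual code: is_strictly_sequential and code are hypothetical names, and starting the count at 1 is an assumption.

```python
def is_strictly_sequential(nb: dict) -> bool:
    """True if code cells have execution_count 1, 2, 3, ... in order."""
    expected = 1  # assumption: a fully executed notebook starts at 1
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # markdown and raw cells have no execution_count
        if cell.get("execution_count") != expected:
            return False  # not executed, out of order, or a gap
        expected += 1
    return True


def code(n):
    """Build a minimal code cell with the given execution_count."""
    return {"cell_type": "code", "execution_count": n, "source": "x", "outputs": []}


print(is_strictly_sequential({"cells": [code(1), code(2), code(3)]}))  # True
print(is_strictly_sequential({"cells": [code(1), code(3)]}))  # False
```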

Not strict mode

The --not_strict flag checks that each cell was executed after the previous one, but allows execution_count to increase by more than 1.

pre-commit config example:

repos:
    - repo: https://github.com/ayasyrev/nbmetaclean
      rev: 0.1.1
      hooks:
        - id: nbcheck
          args: [ --ec, --not_strict ]

command line example:

nbcheck --ec --not_strict
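
The relaxed rule can be sketched in Python as a check that execution counts only increase from one code cell to the next. This is an illustration under assumptions, not nbcheck's implementation; is_increasing and notebook are hypothetical names.

```python
def is_increasing(nb: dict) -> bool:
    """True if every code cell ran after the previous one (gaps allowed)."""
    previous = 0
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        count = cell.get("execution_count")
        if count is None or count <= previous:
            return False  # not executed, or executed before an earlier cell
        previous = count
    return True


def notebook(counts):
    """Build a minimal notebook dict from a list of execution counts."""
    return {"cells": [{"cell_type": "code", "execution_count": n} for n in counts]}


print(is_increasing(notebook([1, 3, 7])))  # True: gaps are allowed
print(is_increasing(notebook([2, 1])))  # False: the second cell ran first
```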

Allow notebooks with no execution_count

The --no_exec flag allows notebooks in which no cell has an execution_count. If a notebook mixes cells with and without execution_count, pre-commit will report an error.

pre-commit config example:

repos:
    - repo: https://github.com/ayasyrev/nbmetaclean
      rev: 0.1.1
      hooks:
        - id: nbcheck
          args: [ --ec, --no_exec ]

command line example:

nbcheck --ec --no_exec
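
The distinction between "no cell executed" and "some cells executed" can be sketched as a small classifier. This is a hedged illustration, not nbcheck's code: execution_state and passes_no_exec are hypothetical names, and the ordering check that would still apply to executed notebooks is left out.

```python
def execution_state(nb: dict) -> str:
    """Classify a notebook as 'none', 'all' or 'mixed' by executed code cells."""
    counts = [
        cell.get("execution_count")
        for cell in nb.get("cells", [])
        if cell.get("cell_type") == "code"
    ]
    executed = [c for c in counts if c is not None]
    if not executed:
        return "none"  # nothing was executed: allowed with --no_exec
    if len(executed) == len(counts):
        return "all"  # every code cell was executed
    return "mixed"  # some executed, some not: reported as an error


def passes_no_exec(nb: dict) -> bool:
    """Only the mixed case fails in this sketch."""
    return execution_state(nb) != "mixed"
```

For example, a notebook whose two code cells have execution_count 1 and None is classified as "mixed" and fails, while one where both are None passes.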

Errors and Warnings

--err and --warn flags can be used to check for errors and warnings in outputs.

pre-commit config example:

repos:
    - repo: https://github.com/ayasyrev/nbmetaclean
      rev: 0.1.1
      hooks:
        - id: nbcheck
          args: [ --err, --warn ]

command line example:

nbcheck --err --warn
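
For a sense of what such a check looks at, the sketch below scans cell outputs in a notebook dict. In the notebook format, an error is stored as an output with output_type "error". Treating any stream output on stderr as a warning is an assumption made for this example and may differ from how nbcheck detects warnings; find_problems is a hypothetical name.

```python
def find_problems(nb: dict) -> dict:
    """Return code cell indexes, one entry per error or stderr output found."""
    errors, warnings = [], []
    for index, cell in enumerate(nb.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        for out in cell.get("outputs", []):
            if out.get("output_type") == "error":
                errors.append(index)
            elif out.get("output_type") == "stream" and out.get("name") == "stderr":
                # Assumption: text written to stderr is treated as a warning.
                warnings.append(index)
    return {"errors": errors, "warnings": warnings}
```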
