Clean jupyter notebooks. Remove metadata and execution counts.

nbmetaclean

A collection of Python scripts for checking and cleaning Jupyter Notebook metadata, execution_count and, optionally, outputs. It can be used as a command line tool or as a pre-commit hook.

Pure Python, no dependencies.


nbmetaclean

Clean Jupyter Notebooks metadata, execution_count and optionally output.

nbcheck

Check Jupyter Notebooks for errors and/or warnings in outputs.

Basic usage

Pre-commit hook

Nbmetaclean can be used as a pre-commit hook with pre-commit. You do not need to install nbmetaclean; it will be installed automatically. Add the following to .pre-commit-config.yaml:

repos:
    - repo: https://github.com/ayasyrev/nbmetaclean
      rev: 0.1.1
      hooks:
        - id: nbmetaclean
        - id: nbcheck
          args: [ --ec, --err, --warn ]

Command line tool

Without installing:

If you use the uv package manager, you can run nbmetaclean without installing it. To clean notebooks:

uvx nbmetaclean

To check notebooks:

uvx --from nbmetaclean nbcheck --ec --err --warn

Install:

pip install nbmetaclean

Usage: run the nbmetaclean or nbcheck command with a path to a notebook or to a folder with notebooks. If no path is provided, the current directory is used.

The nbclean command can be used instead of nbmetaclean. nbmetaclean is the default command name, which makes it convenient to run with uvx.

nbmetaclean

nbcheck should be run with flags:

  • --ec to check execution_count
  • --err to check for errors in outputs
  • --warn to check for warnings in outputs

nbcheck --ec --err --warn

Nbmetaclean

Default settings

By default, the following settings are used:

  • Clean notebook metadata, except authors and language_info / name.
  • Clean the execution_count of cells.
  • Preserve cell metadata.
  • Preserve cell outputs.
  • After cleaning a notebook, the file timestamp is restored to its previous value.
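
To make these defaults concrete, here is a minimal sketch of what they amount to for a notebook loaded as a plain dict. It is not nbmetaclean's implementation or API: the helper names `clean_notebook` and `write_preserving_timestamp` are hypothetical, and the sketch only resets the cell-level execution_count, leaving everything inside outputs untouched.

```python
import copy
import json
import os
from pathlib import Path


def clean_notebook(nb: dict) -> dict:
    """Return a cleaned copy of a notebook dict (hypothetical helper)."""
    cleaned = copy.deepcopy(nb)
    old_meta = cleaned.get("metadata", {})
    new_meta = {}
    # Keep only `authors` and `language_info / name` from notebook metadata.
    if "authors" in old_meta:
        new_meta["authors"] = old_meta["authors"]
    lang_name = old_meta.get("language_info", {}).get("name")
    if lang_name is not None:
        new_meta["language_info"] = {"name": lang_name}
    cleaned["metadata"] = new_meta
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            # Reset execution_count; cell metadata and outputs stay as they are.
            cell["execution_count"] = None
    return cleaned


def write_preserving_timestamp(path: Path, nb: dict) -> None:
    """Write the notebook, then restore the file's previous access/modify times."""
    stat = os.stat(path)
    path.write_text(json.dumps(nb, indent=1) + "\n", encoding="utf-8")
    os.utime(path, ns=(stat.st_atime_ns, stat.st_mtime_ns))


nb = {
    "metadata": {
        "kernelspec": {"name": "python3"},
        "language_info": {"name": "python", "version": "3.12.5"},
    },
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 3,
            "metadata": {"tags": ["keep"]},
            "outputs": [{"output_type": "stream", "name": "stdout", "text": "hi\n"}],
            "source": "print('hi')",
        }
    ],
    "nbformat": 4,
    "nbformat_minor": 5,
}
cleaned = clean_notebook(nb)
```

In this sketch the input dict is left unmodified, so the cleaned result can be compared with the original before anything is written to disk.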

Arguments

Check available arguments:

nbmetaclean -h

usage: nbmetaclean [-h] [-s] [--not_ec] [--not-pt] [--dont_clear_nb_metadata] [--clear_cell_metadata] [--clear_outputs]
[--nb_metadata_preserve_mask NB_METADATA_PRESERVE_MASK [NB_METADATA_PRESERVE_MASK ...]]
[--cell_metadata_preserve_mask CELL_METADATA_PRESERVE_MASK [CELL_METADATA_PRESERVE_MASK ...]] [--dont_merge_masks] [--clean_hidden_nbs] [-D] [-V]
[path ...]

Clean metadata and execution_count from Jupyter notebooks.

positional arguments:
  path                  Path for nb or folder with notebooks.

options:
  -h, --help            show this help message and exit
  -s, --silent          Silent mode.
  --not_ec              Do not clear execution_count.
  --not-pt              Do not preserve timestamp.
  --dont_clear_nb_metadata
                        Do not clear notebook metadata.
  --clear_cell_metadata
                        Clear cell metadata.
  --clear_outputs       Clear outputs.
  --nb_metadata_preserve_mask NB_METADATA_PRESERVE_MASK [NB_METADATA_PRESERVE_MASK ...]
                        Preserve mask for notebook metadata.
  --cell_metadata_preserve_mask CELL_METADATA_PRESERVE_MASK [CELL_METADATA_PRESERVE_MASK ...]
                        Preserve mask for cell metadata.
  --dont_merge_masks    Do not merge masks.
  --clean_hidden_nbs    Clean hidden notebooks.
  -D, --dry_run         perform a trial run, don't write results
  -V, --verbose         Verbose mode. Print extra information.

Execution_count

If you want to keep execution_count, add the --not_ec flag on the command line or an args: [ --not_ec ] line to .pre-commit-config.yaml.

repos:
    - repo: https://github.com/ayasyrev/nbmetaclean
      rev: 0.1.1
      hooks:
        - id: nbmetaclean
          args: [ --not_ec ]

nbmetaclean --not_ec

Clear outputs

If you want to clear outputs, add the --clear_outputs flag on the command line or an args: [ --clear_outputs ] line to .pre-commit-config.yaml.

repos:
    - repo: https://github.com/ayasyrev/nbmetaclean
      rev: 0.1.1
      hooks:
        - id: nbmetaclean
          args: [ --clear_outputs ]

nbmetaclean --clear_outputs

Nbcheck

Check Jupyter Notebooks for correct execution_count and for errors and/or warnings in outputs.

Execution_count

Check that all code cells were executed one after another.

Strict mode

By default, the execution_count check runs in strict mode: all cells must be executed, one after another.

pre-commit config example:

repos:
    - repo: https://github.com/ayasyrev/nbmetaclean
      rev: 0.1.1
      hooks:
        - id: nbcheck
          args: [ --ec ]

command line example:

nbcheck --ec

Not strict mode

The --not_strict flag checks only that each cell was executed after the previous one; execution_count may increase by more than 1 between cells.

pre-commit config example:

repos:
    - repo: https://github.com/ayasyrev/nbmetaclean
      rev: 0.1.1
      hooks:
        - id: nbcheck
          args: [ --ec, --not_strict ]

command line example:

nbcheck --ec --not_strict
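
The difference between the two modes can be sketched in a few lines. This is an illustration only, not nbcheck's code: the function name `check_execution_count` is hypothetical, and the sketch assumes that strict mode means counts of 1, 2, 3, ... in cell order and that every code cell is treated the same way (the real tool may, for example, handle empty cells differently).

```python
def check_execution_count(nb: dict, strict: bool = True) -> bool:
    """Return True if code cells look executed in order (hypothetical helper)."""
    counts = [
        cell.get("execution_count")
        for cell in nb.get("cells", [])
        if cell.get("cell_type") == "code"
    ]
    # A code cell without execution_count was never executed.
    if any(count is None for count in counts):
        return False
    if strict:
        # Assumption: strict means exactly 1, 2, 3, ... in cell order.
        return counts == list(range(1, len(counts) + 1))
    # Not strict: each count only has to be larger than the previous one.
    return all(prev < nxt for prev, nxt in zip(counts, counts[1:]))


def make_nb(counts):
    """Build a tiny notebook dict with one code cell per execution_count."""
    return {"cells": [{"cell_type": "code", "execution_count": c} for c in counts]}


in_order = make_nb([1, 2, 3])
with_gaps = make_nb([2, 5, 9])
out_of_order = make_nb([3, 1, 2])
```

Under these assumptions, `in_order` passes both modes, `with_gaps` passes only with `strict=False`, and `out_of_order` fails both.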

Allow notebooks with no execution_count

The --no_exec flag allows notebooks in which no cell has an execution_count. If a notebook mixes cells with and without execution_count, pre-commit will return an error.

pre-commit config example:

repos:
    - repo: https://github.com/ayasyrev/nbmetaclean
      rev: 0.1.1
      hooks:
        - id: nbcheck
          args: [ --ec, --no_exec ]

command line example:

nbcheck --ec --no_exec
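
The rule described for --no_exec can be sketched as a three-way decision. The helper below is hypothetical and is not taken from nbcheck; it only covers the "none, some, or all cells executed" distinction and leaves the ordering check out.

```python
def check_with_no_exec(nb: dict, no_exec: bool = False) -> bool:
    """Return True if the notebook passes the execution check (hypothetical helper)."""
    counts = [
        cell.get("execution_count")
        for cell in nb.get("cells", [])
        if cell.get("cell_type") == "code"
    ]
    executed = [count for count in counts if count is not None]
    if not executed:
        # No cell was executed: accepted only when no_exec is set.
        return no_exec
    if len(executed) != len(counts):
        # A mix of executed and unexecuted cells is always an error.
        return False
    # All cells executed; the ordering check is omitted in this sketch.
    return True


never_run = {"cells": [{"cell_type": "code", "execution_count": None}] * 2}
mixed = {
    "cells": [
        {"cell_type": "code", "execution_count": 1},
        {"cell_type": "code", "execution_count": None},
    ]
}
```

Here `never_run` is rejected by default and accepted with `no_exec=True`, while `mixed` is rejected either way.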

Errors and Warnings

The --err and --warn flags check for errors and warnings in cell outputs.

pre-commit config example:

repos:
    - repo: https://github.com/ayasyrev/nbmetaclean
      rev: 0.1.1
      hooks:
        - id: nbcheck
          args: [ --err, --warn ]

command line example:

nbcheck --err --warn
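
As a rough illustration of what such a check looks at, the sketch below scans the outputs of each cell in a notebook dict. In the notebook format, a failed cell stores an output with output_type "error". How nbcheck itself recognises warnings is not stated here, so treating any stream output named "stderr" as a warning is an assumption of this sketch, and the helper name `find_problems` is hypothetical.

```python
def find_problems(nb: dict, err: bool = True, warn: bool = True) -> list:
    """Return (cell_index, kind, detail) tuples for suspicious outputs (hypothetical helper)."""
    problems = []
    for index, cell in enumerate(nb.get("cells", [])):
        for output in cell.get("outputs", []):
            output_type = output.get("output_type")
            if err and output_type == "error":
                # Error outputs carry the exception name in `ename`.
                problems.append((index, "error", output.get("ename", "")))
            elif warn and output_type == "stream" and output.get("name") == "stderr":
                # Assumption: anything written to stderr counts as a warning.
                problems.append((index, "warning", "stderr"))
    return problems


sample = {
    "cells": [
        {"cell_type": "code", "outputs": [
            {"output_type": "stream", "name": "stdout", "text": "ok\n"},
        ]},
        {"cell_type": "code", "outputs": [
            {"output_type": "stream", "name": "stderr", "text": "UserWarning: careful\n"},
        ]},
        {"cell_type": "code", "outputs": [
            {"output_type": "error", "ename": "ValueError", "evalue": "bad", "traceback": []},
        ]},
    ]
}
problems = find_problems(sample)
```

Passing `err=False` or `warn=False` mirrors running the check with only one of the two flags.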
