Jupyter notebook linter for hidden-state and execution-order bugs.

nborder

A fast, opinionated linter and auto-fixer for Jupyter notebook hidden-state and execution-order bugs.

nborder catches four classes of notebook bug in a single pass.

What this catches

| Code | Name | One-line example |
| --- | --- | --- |
| NB101 | Non-monotonic execution counts | Cell 1 ran with In [3]: after cell 0 ran with In [5]:. |
| NB102 | Won't survive Restart-and-Run-All | print(df) references a name no cell in the notebook defines. |
| NB201 | Use-before-assign across cells | Cell 0 uses df; df = ... only appears in cell 1. |
| NB103 | Stochastic library used without seed | np.random.rand(3) runs with no seed call before it. |

Each rule has a docs page under docs/rules/ explaining the bug class, a bad and good example, and the auto-fix behaviour. The four sections below walk through each rule with the diagnostic nborder actually emits.

NB101: out-of-order execution

The execution_count field on each cell records the order Jupyter actually ran cells in, not the order they appear in the file. When those orders disagree, the recorded outputs reflect a hidden state nobody can recreate. NB101 walks the counts once and fires when they decrease.
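The check described above can be approximated in a few lines of standard-library Python. This is a minimal sketch, not nborder's implementation: the function name `non_monotonic_cells` and the two-cell example notebook are hypothetical, and only the `cells`, `cell_type`, and `execution_count` fields of the notebook JSON are assumed.

```python
import json


def non_monotonic_cells(notebook: dict) -> list[tuple[int, int, int]]:
    """Return (cell_index, previous_count, count) wherever the counts decrease."""
    findings = []
    previous = None
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue  # markdown and raw cells carry no execution_count
        count = cell.get("execution_count")
        if count is None:
            continue  # never executed, or already cleared
        if previous is not None and count < previous:
            findings.append((index, previous, count))
        previous = count
    return findings


# Hypothetical notebook mirroring the NB101 row in the table above.
example = json.loads("""
{"cells": [
  {"cell_type": "code", "execution_count": 5, "source": "print(df)",
   "outputs": [], "metadata": {}},
  {"cell_type": "code", "execution_count": 3, "source": "df = 1",
   "outputs": [], "metadata": {}}
]}
""")

print(non_monotonic_cells(example))  # [(1, 5, 3)]
```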

[Figure: NB101 diagnostic on a non-monotonic notebook]

--fix=clear-counts resets every execution_count to null so subsequent runs cannot disagree with the source order.
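As a rough illustration of what a clear-to-null fix amounts to, the sketch below sets the cell-level `execution_count` of every code cell to `None` (serialized as `null`) and leaves outputs alone. The helper name `clear_execution_counts` and the sample notebook are assumptions made for this example; nborder's own fix may differ in detail.

```python
import copy
import json


def clear_execution_counts(notebook: dict) -> dict:
    """Return a copy with every code cell's execution_count set to None."""
    fixed = copy.deepcopy(notebook)
    for cell in fixed.get("cells", []):
        if cell.get("cell_type") == "code":
            # Only the cell-level field is touched; outputs stay as they are.
            cell["execution_count"] = None
    return fixed


# Hypothetical notebook with one executed code cell and one markdown cell.
notebook = {
    "cells": [
        {"cell_type": "code", "execution_count": 5, "source": "x = 1",
         "outputs": [], "metadata": {}},
        {"cell_type": "markdown", "source": "# notes", "metadata": {}},
    ]
}

once = clear_execution_counts(notebook)
twice = clear_execution_counts(once)

# Applying the fix a second time changes nothing.
print(json.dumps(once) == json.dumps(twice))  # True
```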

NB201: use before define

NB201 flags variables that are used in cell N but defined only in some later cell N+k. The notebook works in the user's live kernel because the later cell was run first; on Restart-and-Run-All it raises NameError. This is the failure mode that hides during interactive authoring and surfaces in CI or for the first reviewer.
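A simplified version of this analysis can be written with the standard `ast` module. The sketch below is an illustration under stated assumptions, not nborder's dataflow engine: it ignores function scopes, treats each cell as flat top-level code, and the names `defined_and_used` and `use_before_define` are hypothetical.

```python
import ast


def defined_and_used(source: str) -> tuple[set[str], set[str]]:
    """Collect names a cell binds and names it reads (scopes are ignored)."""
    defined, used = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Load):
                used.add(node.id)
            elif isinstance(node.ctx, ast.Store):
                defined.add(node.id)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            defined.add(node.name)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                defined.add((alias.asname or alias.name).split(".")[0])
    return defined, used


def use_before_define(cells: list[str]) -> list[tuple[int, str, int]]:
    """Return (use_cell, name, defining_cell) for names first defined later."""
    per_cell = [defined_and_used(source) for source in cells]
    first_definition: dict[str, int] = {}
    for index, (defined, _) in enumerate(per_cell):
        for name in defined:
            first_definition.setdefault(name, index)
    findings = []
    for index, (_, used) in enumerate(per_cell):
        for name in sorted(used):
            if first_definition.get(name, index) > index:
                findings.append((index, name, first_definition[name]))
    return findings


# Cell 0 uses df; df is only assigned in cell 1.
cells = [
    "print(df.head())",
    "import pandas as pd\ndf = pd.DataFrame({'a': [1]})",
]
print(use_before_define(cells))  # [(0, 'df', 1)]
```

The cell sources are only parsed, never executed, so pandas does not need to be installed to run the sketch.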

[Figure: real Jupyter NameError when the broken notebook runs top-to-bottom]

--fix=reorder topologically sorts the cells when the dependency graph is a DAG. After the fix the same Restart-and-Run-All executes cleanly.

[Figure: nborder check on the same notebook flags NB201 and the precursor NB101]

NB102: undefined name

NB102 flags a name that is used somewhere in the notebook but never defined by any cell. The notebook works interactively because the name is still bound in the live kernel, typically by a cell that has since been deleted; from a fresh kernel, the same code raises NameError. NB102 has no auto-fix in 0.1.x because there is no general way to guess the intended name; the user fixes it by hand.
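A notebook-wide undefined-name check can be approximated by collecting every binding across all cells, adding Python's builtins, and reporting any read that matches neither. The sketch below assumes flat scoping and a hypothetical function name `undefined_names`; it is not how nborder is necessarily implemented.

```python
import ast
import builtins


def undefined_names(cells: list[str]) -> list[tuple[int, str]]:
    """Return (cell_index, name) for reads that no cell and no builtin defines."""
    defined = set(dir(builtins))
    loads: list[tuple[int, str]] = []
    for index, source in enumerate(cells):
        for node in ast.walk(ast.parse(source)):
            if isinstance(node, ast.Name):
                if isinstance(node.ctx, ast.Store):
                    defined.add(node.id)
                elif isinstance(node.ctx, ast.Load):
                    loads.append((index, node.id))
            elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                defined.add(node.name)
            elif isinstance(node, ast.arg):
                defined.add(node.arg)  # function parameters count as bindings
            elif isinstance(node, (ast.Import, ast.ImportFrom)):
                for alias in node.names:
                    defined.add((alias.asname or alias.name).split(".")[0])
    return sorted({(i, name) for i, name in loads if name not in defined})


# 'dta' is a typo for 'data' and is defined nowhere.
cells = ["data = [1, 2, 3]", "print(sum(dta))"]
print(undefined_names(cells))  # [(1, 'dta')]
```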

[Figure: NB102 catches a typo'd reference before the runtime does]

NB103: unseeded stochastic call

NB103 flags stochastic library calls that run before any seed call. Reruns produce different numbers, downstream metrics drift, and reviewers cannot reproduce the notebook's claimed results. The 0.1.4 numpy injection seeds both the legacy global RandomState and the modern Generator API in two lines, so reruns of a fixed notebook are byte-identical regardless of which API the user actually calls.

[Figure: NB103 fires on np.random.rand(3) before any seed is set]

After --fix, two nbconvert --execute runs produce identical floats:

[Figure: two seeded reruns of the fixed notebook produce identical output]

Quick start

pip install nborder
nborder check notebook.ipynb
nborder check --fix notebook.ipynb
nborder check --output-format=json notebook.ipynb

The --fix flag reorders cells topologically when the dependency graph is a DAG, injects library-appropriate seed calls for stochastic libraries, and clears execution counts when they no longer reflect a reproducible run order. Each fix runs as its own pipeline stage; a stage that bails out does not block the other fixes, and running the same fix twice is a byte-stable no-op.

Pre-commit

Add this to your .pre-commit-config.yaml:

repos:
  - repo: https://github.com/moonrunnerkc/nborder
    rev: v0.1.4
    hooks:
      - id: nborder

Then run pre-commit install. Full setup notes are in docs/integrations/pre-commit.md.

GitHub Actions

- uses: moonrunnerkc/nborder@v0.1.4
  with:
    path: notebooks/

Diagnostics show up as inline annotations on the PR. Full options in docs/integrations/github-actions.md.

Configuration

nborder reads its configuration from [tool.nborder] in pyproject.toml:

[tool.nborder.seeds]
value = 42
libraries = ["numpy", "torch", "tensorflow", "random"]

Run nborder config to print the effective merged configuration.

Filtering rules with --select

--select=<CODES> keeps only the listed rule codes in the diagnostic output. Useful for adopting a subset of rules, scoping a fix-up commit, or running a single rule in CI. Unknown codes exit 2 with a clear error.
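The filtering behaviour can be pictured as a set intersection over rule codes. The sketch below is hypothetical throughout: the diagnostic dictionaries, the function names `parse_select` and `filter_diagnostics`, and the use of `ValueError` for unknown codes are assumptions for illustration, with the exit status of 2 left to whatever CLI wrapper catches the error.

```python
KNOWN_CODES = {"NB101", "NB102", "NB103", "NB201"}


def parse_select(value: str) -> set[str]:
    """Parse a comma-separated --select value and reject unknown codes."""
    codes = {code.strip() for code in value.split(",") if code.strip()}
    unknown = sorted(codes - KNOWN_CODES)
    if unknown:
        # A CLI wrapper would report this and exit with status 2.
        raise ValueError(f"unknown rule code(s): {', '.join(unknown)}")
    return codes


def filter_diagnostics(diagnostics: list[dict], selected: set[str]) -> list[dict]:
    """Keep only diagnostics whose code was selected."""
    return [d for d in diagnostics if d["code"] in selected]


# Hypothetical diagnostics; the real output schema is not assumed here.
diagnostics = [
    {"code": "NB101", "cell": 1},
    {"code": "NB201", "cell": 0},
]

print(filter_diagnostics(diagnostics, parse_select("NB201")))
# [{'code': 'NB201', 'cell': 0}]
```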

[Figure: --select=NB201 filters NB101 out of the same notebook]

Rule documentation

nborder rule <CODE> prints the rule's full documentation from the installed wheel. Useful when triaging a finding without leaving the terminal.

[Figure: nborder rule NB101 renders the packaged rule docs]

FAQ

Why not use ruff? Ruff lints Python source, not notebook structure. It does not see cross-cell dataflow, so it cannot detect that df is used in cell 0 and only defined in cell 1. nborder is purpose-built for the cross-cell story; it is complementary to ruff, not competitive.

How is this different from nbqa? nbqa runs Python linters against notebook cells one at a time. nborder builds a cross-cell symbol dependency graph and reasons about the relationships between cells, which is the part nbqa explicitly does not do.

Does it work with R or Julia notebooks? Not in v0.1. Multi-language support is reserved for a future release. Python kernels cover roughly 95% of notebooks in the wild.

Will it modify my notebook outputs? No. Outputs, cell metadata, and notebook-level metadata are read-only. The only fix that touches execution_count is a clear-to-null operation that runs as part of the reorder fix.

What about magics? %line, %%cell, !shell, and shell-assignment forms (files = !ls) are stripped to typed metadata before parsing. Magic-defined bindings (e.g., %%capture out defines out) are recorded in the dataflow graph; see docs/known-limitations.md for the limits.
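One way to picture the stripping step: magic and shell lines are replaced with placeholder Python so the rest of the cell still parses, and the removed text is kept on the side. The sketch below is a simplified guess at that idea, not nborder's parser; `strip_magics` and its return shape are hypothetical, and it does not handle magics inside multi-line expressions.

```python
import ast
import re

# Matches shell or magic assignment such as `files = !ls`.
SHELL_ASSIGN = re.compile(r"^(\s*)([A-Za-z_]\w*)\s*=\s*[!%](.*)$")


def strip_magics(source: str) -> tuple[str, list[str]]:
    """Return (parseable_python, removed_magic_lines) for one cell."""
    lines = source.splitlines()
    if lines and lines[0].startswith("%%"):
        # Cell magic: the body may not be Python at all, so drop the whole cell.
        return "", [lines[0]]
    kept, magics = [], []
    for line in lines:
        stripped = line.lstrip()
        indent = line[: len(line) - len(stripped)]
        match = SHELL_ASSIGN.match(line)
        if match:
            magics.append(stripped)
            kept.append(f"{indent}{match.group(2)} = None")  # keep the binding
        elif stripped.startswith(("%", "!")):
            magics.append(stripped)
            kept.append(f"{indent}pass")  # placeholder keeps line numbers stable
        else:
            kept.append(line)
    return "\n".join(kept), magics


cell = "%matplotlib inline\nfiles = !ls\nprint(files)"
python_source, magics = strip_magics(cell)
ast.parse(python_source)  # parses cleanly once the magics are replaced

print(python_source.splitlines())  # ['pass', 'files = None', 'print(files)']
print(magics)                      # ['%matplotlib inline', 'files = !ls']
```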

How do I suppress a false positive? Add # nborder: noqa (suppress all rules in the cell) or # nborder: noqa: NB201,NB102 (suppress specific rules) to any line in the cell.
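The two comment forms can be recognized with a single regular expression. The sketch below is an illustration of the syntax described above and not nborder's parser: the pattern, the function name `suppressions`, and the `(suppress_all, codes)` return shape are assumptions, and a `#` inside a string literal would be misread as a comment.

```python
import re

NOQA = re.compile(r"#\s*nborder:\s*noqa(?::\s*(?P<codes>[A-Z0-9, ]+))?")


def suppressions(cell_source: str) -> tuple[bool, set[str]]:
    """Return (suppress_all, specific_codes) for one cell's source."""
    suppress_all = False
    codes: set[str] = set()
    for line in cell_source.splitlines():
        match = NOQA.search(line)
        if match is None:
            continue
        if match.group("codes") is None:
            suppress_all = True  # bare "# nborder: noqa"
        else:
            codes.update(
                code.strip() for code in match.group("codes").split(",") if code.strip()
            )
    return suppress_all, codes


def is_suppressed(code: str, cell_source: str) -> bool:
    """True when the cell suppresses this rule code, blanket or specific."""
    suppress_all, codes = suppressions(cell_source)
    return suppress_all or code in codes


print(is_suppressed("NB201", "df.head()  # nborder: noqa: NB201,NB102"))  # True
print(is_suppressed("NB103", "df.head()  # nborder: noqa: NB201,NB102"))  # False
print(is_suppressed("NB103", "df.head()  # nborder: noqa"))               # True
```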

What if I want to scope nborder to a subset of rules? Use --select=<CODES> on the CLI. For example, nborder check --select=NB201,NB103 notebooks/ keeps only NB201 and NB103 diagnostics; NB101 and NB102 are filtered out. Cell-level suppression with # nborder: noqa: NB201 still works for one-off exceptions.

Is there a pyproject.toml block for rule selection? Not yet. v0.2 adds a [tool.nborder] select table so a project can pin its rule set in version control. The CLI flag is the supported path until then.

What is on the roadmap? v0.2 adds the [tool.nborder] select configuration table and the opt-in fresh-kernel --reproduce pass. The static linter remains the default path.

Contributing

See CONTRIBUTING.md.

License

MIT. See LICENSE.
