A linter for Jupyter notebooks written in Python.

Many professional data scientists use Jupyter Notebook to accomplish their daily tasks, from preliminary data exploration to model prototyping. Notebooks' interactivity is particularly convenient for data-centric programming, and their self-documenting nature provides excellent support for the communication of analytical results.

Nevertheless, Jupyter Notebook has often been criticized for encouraging bad programming habits and offering little support for Software Engineering best practices. To really benefit from notebooks, users should be aware of their common pitfalls and learn how to prevent them.

In previous work (see "Eliciting Best Practices for Collaboration with Computational Notebooks" [1]), we introduced a catalog of 17 empirically-validated guidelines for the collaborative use of notebooks in a professional context.

To foster the adoption of these best practices, we have created Pynblint, a static analysis tool for Jupyter notebooks written in Python. Pynblint reveals potential notebook defects and recommends corrective actions. It can be operated either as a standalone CLI application or as part of a CI/CD pipeline.

The core linting rules of Pynblint have been derived as operationalizations of the best practices from our catalog. Nonetheless, the plug-in architecture of Pynblint enables its users to easily extend the core set of checks with their own linting rules.

Requirements

Python 3.7+.

Installation

Pynblint can be installed with pip or another PyPI package manager:

pip install pynblint

After installation, we recommend exploring the command-line interface of the tool:

pynblint --help

Usage

Pynblint can be used to analyze:

  • a standalone notebook:

    pynblint path/to/the/notebook.ipynb
    
  • a code repository containing notebooks:

    pynblint path/to/the/project/dir/
    
    • a code repository compressed as a .zip archive:

      pynblint path/to/the/compressed/archive.zip
      
  • a public GitHub repository containing notebooks (support for private repositories is on our roadmap 🙂):

    pynblint --from-github https://github.com/collab-uniba/pynblint
    

For further information on the available options, please refer to the project documentation.
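
The invocations above can also be scripted, for example from a CI job. The following is a minimal sketch, assuming the `pynblint` executable is on the PATH; the helper names `build_pynblint_command` and `run_pynblint` are hypothetical and not part of Pynblint. It uses only the command forms shown above and makes no assumption about how Pynblint reports findings through its exit code.

```python
import subprocess


def build_pynblint_command(target, from_github=False):
    """Build the argument list for one pynblint invocation.

    `target` is a notebook path, a project directory, a .zip archive,
    or (with from_github=True) the URL of a public GitHub repository.
    """
    if from_github:
        return ["pynblint", "--from-github", target]
    return ["pynblint", target]


def run_pynblint(target, from_github=False):
    """Run pynblint and return the completed process.

    Assumption: the `pynblint` executable is installed and on the PATH.
    The caller decides what to do with stdout, stderr and the return code.
    """
    command = build_pynblint_command(target, from_github=from_github)
    return subprocess.run(command, capture_output=True, text=True, check=False)


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch works
    # even where pynblint is not installed.
    targets = [
        ("path/to/the/notebook.ipynb", False),
        ("path/to/the/project/dir/", False),
        ("https://github.com/collab-uniba/pynblint", True),
    ]
    for target, remote in targets:
        print(" ".join(build_pynblint_command(target, remote)))
```

Keeping command construction separate from execution makes the wrapper easy to test without installing the tool.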

Catalog of best practices

In the following, we report the catalog of empirically-validated best practices on which Pynblint is based [1].

For each guideline, we specify the current state of implementation within Pynblint:

  • ✅ = "implemented"
  • ⏳ = "partially implemented / work in progress"
  • ❌ = "not on our roadmap"

| State | Best Practice from [1] |
| --- | --- |
| ✅ | Use version control |
| ✅ | Manage project dependencies |
| ⏳ | Use self-contained environments |
| ✅ | Put imports at the beginning |
| ✅ | Ensure re-executability (re-run notebooks top to bottom) |
| ⏳ | Modularize your code |
| ⏳ | Test your code |
| ✅ | Name your notebooks consistently |
| ⏳ | Stick to coding standards |
| ⏳ | Use relative paths |
| ✅ | Document your analysis |
| ✅ | Leverage Markdown headings to structure your notebook |
| ✅ | Keep your notebook clean |
| ✅ | Keep your notebook concise |
| ❌ | Distinguish production and development artifacts |
| ⏳ | Make your notebooks available |
| ✅ | Make your data available |
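
To illustrate what a static check for one of these guidelines might look like, here is a minimal sketch for "Put imports at the beginning". It is not Pynblint's implementation and does not use Pynblint's plug-in interface; the function names are hypothetical. It assumes the notebook is available as a dictionary in the standard notebook JSON layout (a `cells` list whose items have `cell_type` and `source`), and it treats any import found after the first code cell as a violation, which is one possible reading of the guideline.

```python
import ast
import json


def load_notebook(path):
    """Read an .ipynb file (plain JSON) into a dictionary."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def cell_source(cell):
    """Return a cell's source as one string (stored as str or list of str)."""
    source = cell.get("source", "")
    return "".join(source) if isinstance(source, list) else source


def cells_with_late_imports(notebook):
    """Return the indices of code cells that import after the first code cell.

    Illustrative only: this is not how Pynblint implements the rule.
    """
    late = []
    first_code_cell_seen = False
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        # Drop IPython magics and shell escapes, which are not valid Python.
        lines = [
            line
            for line in cell_source(cell).splitlines()
            if not line.lstrip().startswith(("%", "!"))
        ]
        try:
            tree = ast.parse("\n".join(lines))
        except SyntaxError:
            # Cells that still cannot be parsed are skipped, not flagged.
            first_code_cell_seen = True
            continue
        has_import = any(
            isinstance(node, (ast.Import, ast.ImportFrom))
            for node in ast.walk(tree)
        )
        if has_import and first_code_cell_seen:
            late.append(index)
        first_code_cell_seen = True
    return late


# A tiny in-memory notebook: the last cell imports `math` too late.
example = {
    "cells": [
        {"cell_type": "markdown", "source": "# Title"},
        {"cell_type": "code", "source": ["import os\n", "import sys\n"]},
        {"cell_type": "code", "source": "x = 1"},
        {"cell_type": "code", "source": "%matplotlib inline\nimport math\nprint(math.pi)"},
    ]
}
print(cells_with_late_imports(example))  # prints [3]
```

For a notebook on disk, `cells_with_late_imports(load_notebook("path/to/the/notebook.ipynb"))` would return the offending cell indices under the same assumptions.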

License

This project is licensed under the terms of the MIT license.

References

[1] Luigi Quaranta, Fabio Calefato, and Filippo Lanubile. 2022. Eliciting Best Practices for Collaboration with Computational Notebooks. Proc. ACM Hum.-Comput. Interact. 6, CSCW1, Article 87 (April 2022), 41 pages. https://doi.org/10.1145/3512934
