Create Jupyter Git

A CLI command that generates a fresh Git repository with files and configs to optimize version control for Jupyter Notebooks

Description

A common use of Jupyter Notebooks is for learning, taking notes, and keeping code examples that you can modify and run later. Other common uses include prototyping data analysis or machine learning, which generates a lot of output in the form of images and data. In both scenarios, the output changes frequently and is not as important as the notebook's code and configuration. For many use cases, the output can easily be regenerated.

The output generated by notebooks is a great candidate to be ignored in a Git repository, so that commits stay minimal and point to meaningful code rather than data derived from that code.

There are a number of ways to version control your Jupyter Notebooks and ignore the output.

One of the best (documented here) is to use a Git filter that targets *.ipynb files and strips the output fields from the JSON before it gets staged.
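As a rough illustration of what such a filter does, here is a minimal sketch. It is not the script this package ships; it assumes the notebook uses the nbformat 4 layout (a top-level "cells" list whose code cells carry "outputs" and "execution_count"), and the function names are made up for this example.

```python
import json


def strip_outputs(notebook: dict) -> dict:
    """Clear outputs and execution counts from every code cell (nbformat 4 layout assumed)."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


def clean(notebook_json: str) -> str:
    """Take the raw text of an .ipynb file and return it with outputs removed."""
    notebook = json.loads(notebook_json)
    return json.dumps(strip_outputs(notebook), indent=1, ensure_ascii=False) + "\n"


# A clean filter reads the working-tree file on stdin and writes the version
# to stage on stdout, so a filter script would end with something like:
#     sys.stdout.write(clean(sys.stdin.read()))
```

Only the staged copy is changed by a clean filter; the notebook in your working directory keeps its output.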

This approach requires a few steps that you may not be interested in, or may not want to deal with, when setting up a new repo for Jupyter Notebooks. This CLI command creates and initializes a Git repository with the configs already in place. Simply start up your Jupyter Notebooks and commit when you hit a meaningful checkpoint.

Installation

Install the CLI

pip install create-jupyter-git

Usage

Run the CLI and specify the path where you want your NEW Git repository created

create-jupyter-git <new repository path>

This repository will have a .gitignore to ensure checkpoints aren't versioned. It also creates a .gitattributes with the filter configuration and adds .git/config values that point Git's clean filter at the Python scripts that handle the filtering.
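To make the three pieces concrete, the sketch below writes files of the kind described above. It is an illustration under assumptions, not the tool's actual output: the filter name strip-notebook-output and the script command python3 strip_output.py are hypothetical, and the real generated files may differ.

```python
from pathlib import Path

# Hypothetical names, chosen only for this illustration.
FILTER_NAME = "strip-notebook-output"
FILTER_COMMAND = "python3 strip_output.py"

# Jupyter keeps autosave checkpoints in .ipynb_checkpoints directories.
GITIGNORE = ".ipynb_checkpoints/\n"
# Route every notebook through the named filter.
GITATTRIBUTES = f"*.ipynb filter={FILTER_NAME}\n"
# Tell Git which command implements the filter's clean step.
GIT_CONFIG_SECTION = f'[filter "{FILTER_NAME}"]\n\tclean = {FILTER_COMMAND}\n'


def write_repo_files(repo: Path) -> None:
    """Write .gitignore and .gitattributes, then append the filter section to .git/config."""
    repo.mkdir(parents=True, exist_ok=True)
    (repo / ".gitignore").write_text(GITIGNORE)
    (repo / ".gitattributes").write_text(GITATTRIBUTES)
    config = repo / ".git" / "config"
    if config.exists():  # only present once the repository has been initialized
        with config.open("a") as handle:
            handle.write(GIT_CONFIG_SECTION)
```

The .gitattributes file travels with the repository, while .git/config is local to each clone, so a fresh clone needs the filter section added again before the filter takes effect.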

Start Jupyter

cd <new repository path>
jupyter lab notebooks

Start Jupyter with .venv

This setup is great for pulling in dependencies just for your notebooks without cluttering your global or personal Python library space.

Set up your .venv and allow your global or user Jupyter install to be used.

cd <new repository path>
python3 -m venv .venv --system-site-packages

Activate the .venv:

source .venv/bin/activate

Add your .venv as a Jupyter kernel

python -m ipykernel install --user --name=.venv
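Once the kernel is registered, one quick way to confirm that a notebook is really running on it is to inspect the interpreter from a cell. This is a small sketch, not part of the tool; the helper name in_virtualenv is made up for this example.

```python
import sys


def in_virtualenv() -> bool:
    """True when the interpreter's prefix differs from the base install, as it does inside a venv."""
    return sys.prefix != sys.base_prefix


# Run this in a notebook cell after selecting the .venv kernel.
print("interpreter:", sys.executable)
print("inside a virtual environment:", in_virtualenv())
```

If the interpreter path does not point into your repository's .venv directory, the notebook is most likely still using a different kernel.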

Start Jupyter Lab

jupyter lab notebooks

Commit Your Changes

You can create directories and notebooks, fill your notebooks with wonderful code, and generate beautiful output. When you are at a meaningful spot in your development, simply stage your changes and do a git commit. The Git configurations that are in place filter out all output from your notebook files as they are staged.

If you push up to a remote repository like GitHub, you will see that the output fields in your notebooks are empty! Great!
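You can also check this locally before pushing. The sketch below reports which code cells in a notebook's JSON still carry output; it assumes the nbformat 4 layout, and the function name cells_with_output is made up for this example.

```python
import json
from typing import List


def cells_with_output(notebook_json: str) -> List[int]:
    """Return the indexes of code cells whose "outputs" list is not empty."""
    notebook = json.loads(notebook_json)
    return [
        index
        for index, cell in enumerate(notebook.get("cells", []))
        if cell.get("cell_type") == "code" and cell.get("outputs")
    ]
```

Feed it the staged version of a notebook rather than the file on disk, since the working copy keeps its output. For example, git show :notebooks/example.ipynb prints the staged content of that (hypothetical) path, and an empty list from cells_with_output means the filter did its job.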

GitHub also renders a formatted preview when you view a *.ipynb file, so the notebook stays readable in the browser. The preview is static: GitHub does not run the notebook, so cells whose output was stripped are shown without output.

Development

Publishing

First bump the version

bumpversion --current-version x.x.x <major | minor | patch> setup.py create_jupyter_git/__init__.py

Next generate the distribution files

python setup.py sdist bdist_wheel

Validate the package

twine check dist/*

Upload the package for publication

twine upload dist/*
