Create Jupyter Git
A CLI command that generates a fresh Git repository with files and configs to optimize version control for Jupyter Notebooks
Description
A common use of Jupyter Notebooks is learning, taking notes, and keeping code examples that you can modify and run later. Other common uses include proving out data analysis or machine learning work, which generates a lot of output in the form of images and data. In both of these scenarios, the output changes frequently and is not as important as the notebook configuration. For many use cases the output can easily be regenerated.
The output generated by notebooks is a great candidate to be ignored in a Git repository, so that commits stay minimal and point to meaningful code rather than data derived from that code.
There are a number of ways to version control your Jupyter Notebooks while ignoring the output.
One of the best (documented here) is to use a Git filter to target *.ipynb files and strip out the output field in the JSON before it gets staged.
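As a rough illustration of what such a filter does, the sketch below takes notebook JSON as text, empties each code cell's `outputs` list, resets its `execution_count`, and returns the cleaned JSON. It assumes the nbformat 4 layout (a top-level `cells` list); the function name `clean` is hypothetical, and this is not necessarily the script this package installs.

```python
import json


def clean(notebook_json: str) -> str:
    """Return the notebook JSON with output removed from every code cell."""
    notebook = json.loads(notebook_json)
    for cell in notebook.get("cells", []):
        # Only code cells carry output; markdown and raw cells are left alone.
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return json.dumps(notebook, indent=1) + "\n"
```

A clean filter receives the file on stdin and Git stages whatever it writes to stdout, so a script built on this sketch would end with something like `sys.stdout.write(clean(sys.stdin.read()))`.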
This approach requires a few steps that you may not want to deal with every time you set up a new repo for Jupyter Notebooks, so this CLI command can be used to create and initialize a Git repository with the configs already in place. Simply start up your Jupyter Notebooks and commit when you hit a meaningful checkpoint.
Installation
Install the CLI
pip install create-jupyter-git
Usage
Run the CLI and specify the path where you want your NEW Git repository created
create-jupyter-git <new repository path>
This repository will have a .gitignore to ensure checkpoints aren't versioned. It also creates a .gitattributes with a configuration for filtering and then adds .git/config values to utilize the Python scripts that handle the filtering via git filter clean.
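To make the three pieces concrete, here is a minimal sketch of what writing them could look like. The filter name `strip-notebook-output` and the function `write_repo_configs` are hypothetical; the exact file contents and filter name this CLI generates may differ.

```python
from pathlib import Path

# Hypothetical filter name; the name this CLI actually registers may differ.
FILTER_NAME = "strip-notebook-output"


def write_repo_configs(repo: Path) -> str:
    """Write .gitignore and .gitattributes; return the git config key to set."""
    repo.mkdir(parents=True, exist_ok=True)
    # Jupyter saves checkpoints under .ipynb_checkpoints/ next to each notebook.
    (repo / ".gitignore").write_text(".ipynb_checkpoints/\n")
    # Route every notebook through the named filter.
    (repo / ".gitattributes").write_text(f"*.ipynb filter={FILTER_NAME}\n")
    # Git looks up the clean command for that filter under this key.
    return f"filter.{FILTER_NAME}.clean"
```

The returned key is what ties .gitattributes to .git/config: running `git config` with that key and the path to a filter script as its value tells Git which command to run on each notebook as it is staged.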
Start Jupyter
cd <new repository path>
jupyter lab notebooks
Start Jupyter with .venv
This setup is great for pulling in dependencies just for your Notebooks without cluttering your global or personal Python library space.
Set up your .venv and allow your global or user Jupyter install to be utilized.
cd <new repository path>
python3 -m venv .venv --system-site-packages
Activate the .venv:
source .venv/bin/activate
Add your .venv as a Jupyter kernel
python -m ipykernel install --user --name=.venv
Start Jupyter Lab
jupyter lab notebooks
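Once a notebook is open, it can be useful to confirm that the selected kernel really runs from the .venv. The snippet below is a small sketch for a notebook cell; the helper name `in_virtualenv` is hypothetical, and it relies only on the standard `sys.prefix` and `sys.base_prefix` attributes.

```python
import sys


def in_virtualenv() -> bool:
    """True when the interpreter runs inside a venv-style virtual environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


# Show which interpreter the kernel uses and whether it is a virtual environment.
print(sys.executable, in_virtualenv())
```

If the printed path does not point inside your repository's .venv directory, the notebook is probably using a different kernel than the one added above.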
Commit Your Changes
You can create directories and notebooks, fill your notebooks with wonderful code, and generate beautiful output. When you are at a meaningful spot in your development, simply do a git commit. The Git configuration that is in place filters all output out of your notebook files as they are staged.
If you push up to a remote repository like GitHub, you will see that the output fields in your notebooks are empty! Great!
You will also notice GitHub does some cool magic to re-generate the output in a preview format for you when you view a *.ipynb file, so you can still see the output in GitHub without storing it in your source. Neat!
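If you would rather check locally that a committed notebook carries no output, a sketch like the following counts the code cells that still have any. It assumes the nbformat 4 layout, and the function name `cells_with_output` is hypothetical.

```python
import json


def cells_with_output(notebook_json: str) -> int:
    """Count code cells that still carry output in the given notebook JSON."""
    notebook = json.loads(notebook_json)
    return sum(
        1
        for cell in notebook.get("cells", [])
        if cell.get("cell_type") == "code" and cell.get("outputs")
    )
```

Feeding it the committed version of a notebook, for example the text printed by `git show HEAD:<path to notebook>`, should give 0 when the filter is working, even while the copy in your working directory still shows its output.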
Development
Publishing
First bump the version
bumpversion --current-version x.x.x <major | minor | patch> setup.py create_jupyter_git/__init__.py
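For reference, the arithmetic behind the three parts can be sketched as below. This only illustrates how a plain `major.minor.patch` version changes; it is not bumpversion's implementation, and the function name `bump` is hypothetical.

```python
def bump(version: str, part: str) -> str:
    """Return the next version after bumping the given part."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        # A major bump resets the minor and patch numbers.
        return f"{major + 1}.0.0"
    if part == "minor":
        # A minor bump resets the patch number.
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")
```

For example, bumping the patch part of 0.0.2 gives 0.0.3, while bumping its minor part gives 0.1.0.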
Next generate the distribution files
python setup.py sdist bdist_wheel
Validate the package
twine check dist/*
Upload the package for publication
twine upload dist/*