nbdump
Dump files into a Jupyter notebook. Restore them by running the notebook. Optionally append extra commands to run.
Installation
# user
pip install -U nbdump
# development
pip install -e .
pip install -r tests/requirements.txt
pytest
Usage
In this demo, we will use src_example/ as a fake repo that you want to import into a notebook.
CLI
# see help
nbdump -h
# basic usage, this will dump entire `src_example/` to `nb1.ipynb`
nbdump src_example -o nb1.ipynb
# use shell expansion, this will come in handy later
nbdump src_example/**/*.py -o nb2.ipynb
# handle multiple files/dirs, will be deduplicated
nbdump src_example src_example/main.py -o nb3.ipynb
# append extra code cell, e.g. running the `src_example/main.py`
nbdump src_example -c '%run src_example/main.py' -o nb4.ipynb
# extra cells can be more than one
nbdump src_example \
-c '%run src_example/main.py' \
-c '!git status' \
-o nb5.ipynb
# use fd to skip ignored files and hidden files
nbdump $(fd -t f . src_example) -o nb6.ipynb
# clone metadata from another notebook
nbdump src_example/**/*.py -o nb7.ipynb -m tests/kaggle/modified/modified-notebook.ipynb
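The deduplication mentioned for the nb3.ipynb example could look roughly like the sketch below. This is not nbdump's actual implementation; expand_inputs is a hypothetical helper, and it assumes that directories are expanded recursively and that two inputs count as duplicates when they resolve to the same file.

```python
from pathlib import Path


def expand_inputs(inputs: list[str]) -> list[Path]:
    """Expand files and directories into an ordered, deduplicated file list.

    Hypothetical helper for illustration only, not part of nbdump.
    """
    seen = set()
    result = []
    for item in inputs:
        path = Path(item)
        if path.is_dir():
            # Directories are expanded recursively, in sorted order.
            candidates = sorted(p for p in path.rglob("*") if p.is_file())
        else:
            candidates = [path]
        for candidate in candidates:
            key = candidate.resolve()  # same file reached twice -> same key
            if key not in seen:
                seen.add(key)
                result.append(candidate)
    return result
```

With this sketch, passing both src_example and src_example/main.py would yield main.py only once, because the second occurrence resolves to a path that has already been seen.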
There is a catch: nbdump does not respect .gitignore, because its core functionality is simply converting a set of files into notebook cells. This means that the first example (nb1.ipynb) makes nbdump try to convert every file recursively, regardless of file format. Problems arise when src_example/ contains binary files such as pictures, or even __pycache__/*.
Shell expansion can be used to select only the relevant files, as in the nb2.ipynb example (make sure to enable globstar in bash to use **). Another solution is to use a tool like fd to list the files, which respects .gitignore and skips hidden files automatically.
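When neither globstar nor fd is available, the file list can also be filtered in Python before it is handed to nbdump. The following is a minimal sketch under stated assumptions: select_files and is_probably_text are hypothetical helpers, not part of nbdump, and the "text file" check is only a heuristic (no NUL bytes, valid UTF-8). It does not read .gitignore; it only skips hidden entries, __pycache__ directories, and files that look binary.

```python
from pathlib import Path


def is_probably_text(path: Path) -> bool:
    """Heuristic: a file is text if it has no NUL byte and decodes as UTF-8."""
    data = path.read_bytes()
    if b"\x00" in data:
        return False
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def select_files(root: Path) -> list[Path]:
    """Return text files under `root`, skipping hidden entries and __pycache__.

    Hypothetical helper for illustration only, not part of nbdump.
    """
    selected = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        # Only inspect the parts below `root`, so a hidden parent of `root`
        # itself does not exclude everything.
        parts = path.relative_to(root).parts
        if any(part.startswith(".") or part == "__pycache__" for part in parts):
            continue
        if is_probably_text(path):
            selected.append(path)
    return selected
```

The resulting list can then be used in place of target_files in the Library example.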
Library
from pathlib import Path
import nbdump
target_files = list(Path("src_example").rglob("*.py"))
codes = ["!ls -lah", "!git log --oneline", "%run src_example/main.py"]
metadata_notebook = "tests/kaggle/modified/modified-notebook.ipynb"
# save to disk
with open("nb8.ipynb", "w") as f:
    nbdump.dump(f, target_files, codes, metadata_notebook)
# save as string
ipynb = nbdump.dumps(target_files, codes, metadata_notebook)
print(ipynb[:50])
Why?
Kaggle kernels in code competitions with internet access disabled cannot use git clone inside the notebook. nbdump lets you work in a standard environment and export the final result to a single notebook while still preserving the filesystem tree.
This is different from just zipping and unzipping: because the files are written with %%writefile, you can see and edit their contents inside the notebook, even after it has been created.
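To make the %%writefile idea concrete, here is a rough sketch of how files could be turned into notebook cells using only the standard library. It is an illustration, not nbdump's actual code: code_cell and files_to_notebook are hypothetical names, the notebook metadata is left empty, and the extra first cell that creates directories is an assumption added here because %%writefile does not create missing parent directories.

```python
import json
from pathlib import Path


def code_cell(source: str) -> dict:
    """Return a minimal nbformat 4 code cell holding `source`."""
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": source,
    }


def files_to_notebook(paths: list[Path]) -> str:
    """Serialize `paths` into notebook JSON, one %%writefile cell per file.

    Hypothetical helper for illustration only, not part of nbdump.
    """
    # First cell: recreate the directory tree so %%writefile can succeed.
    dirs = sorted({p.parent.as_posix() for p in paths})
    setup = "import os\n" + "\n".join(
        f"os.makedirs({d!r}, exist_ok=True)" for d in dirs
    )
    cells = [code_cell(setup)]
    for p in paths:
        body = p.read_text(encoding="utf-8")
        # Running this cell writes `body` back to the original path.
        cells.append(code_cell(f"%%writefile {p.as_posix()}\n{body}"))
    notebook = {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}
    return json.dumps(notebook, indent=1)
```

Because each file lives in its own cell as plain text, it stays readable and editable in the notebook UI, which a zipped archive embedded in a cell would not be.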