ipynb-scrubber
Generate exercise versions of Jupyter notebooks by clearing solution cells and removing instructor-only content.
[!NOTE] This is a project made to satisfy a need on some personal projects. The behavior has been tested to work for these projects but will not be supported for other uses.
Issues will be reviewed if opened, and any legitimate bugs will be fixed, but new features or ideas will likely be rejected unless accompanied by a working pull request with comprehensive tests.
Thanks for understanding.
Features
- Clear solution cells: Replace cell contents with placeholder text while preserving structure
- Custom replacement text: Use cell-specific text instead of default placeholder
- All cell types supported: Works with code, markdown, and raw cells
- Remove cells entirely: Omit instructor-only cells from the output
- Multiple syntax options: Use cell tags or cell-type-appropriate comment syntax
- Preserve structure: Maintain notebook structure and metadata
- Clear all outputs: Remove all cell outputs and execution counts for a clean slate
- Simple CLI: Unix-style tool that reads from stdin and writes to stdout
Installation
Install with a Python package manager such as pip or uv:
pip install ipynb-scrubber
Usage
The tool reads a notebook from stdin and writes the scrubbed version to stdout:
ipynb-scrubber < input.ipynb > output.ipynb
Options
- --clear-tag TAG: Tag marking cells to clear (default: scrub-clear)
- --clear-text TEXT: Replacement text for cleared cells where unspecified (default: # TODO: Implement this)
- --omit-tag TAG: Tag marking cells to omit entirely (default: scrub-omit)
Examples
Using default settings:
ipynb-scrubber < lecture.ipynb > exercise.ipynb
Using custom tags:
ipynb-scrubber --clear-tag solution --omit-tag private < lecture.ipynb > exercise.ipynb
Using custom placeholder text:
ipynb-scrubber --clear-text "# YOUR CODE HERE" < lecture.ipynb > exercise.ipynb
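When several notebooks need converting, the same invocations can be driven from Python. The following is a minimal sketch, assuming ipynb-scrubber is installed and on PATH; the helper names build_command and scrub_file are hypothetical and not part of the tool. Only the three documented flags are used.

```python
import subprocess
from pathlib import Path


def build_command(clear_tag=None, omit_tag=None, clear_text=None):
    """Assemble the ipynb-scrubber argument list from optional overrides."""
    cmd = ["ipynb-scrubber"]
    if clear_tag is not None:
        cmd += ["--clear-tag", clear_tag]
    if omit_tag is not None:
        cmd += ["--omit-tag", omit_tag]
    if clear_text is not None:
        cmd += ["--clear-text", clear_text]
    return cmd


def scrub_file(src, dest, **options):
    """Pipe src through ipynb-scrubber (stdin) and write its stdout to dest."""
    with Path(src).open("rb") as fin:
        result = subprocess.run(
            build_command(**options),
            stdin=fin,
            stdout=subprocess.PIPE,
            check=True,  # raise CalledProcessError if the tool exits non-zero
        )
    # Write only after the tool succeeded, so a failure leaves no partial file.
    Path(dest).write_bytes(result.stdout)
```

For example, scrub_file("lecture.ipynb", "exercise.ipynb", clear_tag="solution") would run the equivalent of the custom-tag command above with only the clear tag overridden.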
Marking Cells
There are two ways to mark cells for processing:
1. Cell Tags (All Cell Types)
Add tags to cells using Jupyter's tag interface. This works for all cell types (code, markdown, raw):
- Add scrub-clear tag to solution cells that should be cleared
- Add scrub-omit tag to cells that should be removed entirely
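Tags can also be set without the Jupyter interface, since a notebook file is JSON and each cell stores its tags in a list under metadata.tags. The sketch below shows one way to add a tag programmatically; the add_tag helper and the sample notebook are illustrative assumptions, not part of ipynb-scrubber.

```python
import json


def add_tag(notebook, cell_index, tag):
    """Add tag to the cell's metadata.tags list unless it is already there."""
    cell = notebook["cells"][cell_index]
    tags = cell.setdefault("metadata", {}).setdefault("tags", [])
    if tag not in tags:
        tags.append(tag)
    return notebook


# A tiny in-memory notebook standing in for json.load() on a real .ipynb file.
notebook = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["## Exercise 1"]},
        {
            "cell_type": "code",
            "metadata": {},
            "source": ["def add(a, b):\n", "    return a + b"],
            "outputs": [],
            "execution_count": None,
        },
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}

add_tag(notebook, 1, "scrub-clear")
print(json.dumps(notebook["cells"][1]["metadata"]))  # {"tags": ["scrub-clear"]}
```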
2. Source-Based Options (Code & Markdown)
Use cell-type-appropriate syntax for more control, including custom replacement text:
Code Cells - Quarto Options
#| scrub-clear
def secret_solution():
    return 42
# Or with custom replacement text:
#| scrub-clear: # WRITE YOUR SOLUTION HERE
def another_solution():
    return "hidden"
# To omit entirely:
#| scrub-omit
# This cell will be removed
print("Instructor only!")
Markdown Cells - HTML Comments
<!-- scrub-clear -->
## Answer
The solution is 42 because...
<!-- Or with custom replacement text: -->
<!-- scrub-clear: **Write your answer here** -->
## Another Question
This answer will be replaced.
<!-- To omit entirely: -->
<!-- scrub-omit -->
## Instructor Notes
These notes are only for the instructor.
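To make the two source-based syntaxes concrete, here is a rough sketch of how such markers could be recognized. It is not the tool's actual parser: the regular expressions, the parse_option name, and the assumption that the marker must be the first line of the cell are all illustrative.

```python
import re

# Quarto-style option line for code cells: "#| scrub-clear" or "#| scrub-clear: text"
CODE_OPTION = re.compile(r"^#\|\s*(scrub-clear|scrub-omit)\s*(?::\s?(.*))?$")
# HTML comment for markdown cells: "<!-- scrub-clear -->" or "<!-- scrub-clear: text -->"
MARKDOWN_OPTION = re.compile(
    r"^<!--\s*(scrub-clear|scrub-omit)\s*(?::\s?(.*?))?\s*-->$"
)


def parse_option(cell_type, source):
    """Return (action, text) parsed from the first line of a cell, else None.

    text is None when no replacement text was given and "" when it was
    given but empty. Assumes the marker sits on the cell's first line.
    """
    pattern = {"code": CODE_OPTION, "markdown": MARKDOWN_OPTION}.get(cell_type)
    if pattern is None:
        return None  # e.g. raw cells, which rely on tags only
    first_line = source.split("\n", 1)[0].strip()
    match = pattern.match(first_line)
    if match is None:
        return None
    return match.group(1), match.group(2)


print(parse_option("code", "#| scrub-clear: # WRITE YOUR SOLUTION HERE\nx = 1"))
print(parse_option("markdown", "<!-- scrub-omit -->\n## Instructor Notes"))
```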
Raw Cells - Tags Only
Raw cells only support metadata tags to avoid format conflicts:
# Cell metadata: {"tags": ["scrub-clear"]}
$$\int_0^1 x^2 dx = \frac{1}{3}$$
# Cell metadata: {"tags": ["scrub-omit"]}
% This LaTeX comment will be omitted entirely
Custom Replacement Text
When using source-based options, you can specify custom text to replace the cleared content:
- #| scrub-clear: Your custom text (code cells)
- <!-- scrub-clear: Your custom text --> (markdown cells)
- Empty text: #| scrub-clear: (results in empty cell)
If no custom text is provided, the default --clear-text value is used.
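The difference between "no custom text" and "empty custom text" is easy to miss, so here is a small sketch of the fallback rule as described above. The replacement_text function is hypothetical; it models missing text as None and explicitly empty text as an empty string.

```python
DEFAULT_CLEAR_TEXT = "# TODO: Implement this"  # documented --clear-text default


def replacement_text(custom_text, clear_text=DEFAULT_CLEAR_TEXT):
    """Pick the text that replaces a cleared cell's source.

    custom_text is None when the marker carried no text (fall back to
    clear_text) and "" when the marker ended in a bare colon (empty cell).
    """
    return clear_text if custom_text is None else custom_text


print(repr(replacement_text(None)))              # '# TODO: Implement this'
print(repr(replacement_text("")))                # ''
print(repr(replacement_text("# YOUR TURN")))     # '# YOUR TURN'
```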
Example
Input Notebook
Code Cell 1 (no tags):
# Instructions - this will remain unchanged
print("Exercise: implement the functions below")
Code Cell 2 (Quarto option with custom text):
#| scrub-clear: # TODO: Write your add function here
def add(a, b):
    return a + b
result = add(1, 2)
print(f"Result: {result}")
Markdown Cell 3 (HTML comment):
<!-- scrub-clear: **Write your explanation here** -->
## Solution Explanation
The add function works by using the + operator...
Code Cell 4 (cell tag - will be omitted):
# Cell has metadata: {"tags": ["scrub-omit"]}
# This cell will be removed entirely
assert add(1, 2) == 3
print("Tests pass!")
Output Notebook
Code Cell 1 (unchanged):
# Instructions - this will remain unchanged
print("Exercise: implement the functions below")
Code Cell 2 (cleared with custom text):
# TODO: Write your add function here
Markdown Cell 3 (cleared with custom text):
**Write your explanation here**
Code Cell 4 (omitted entirely)
Behavior
- All cell outputs are cleared: Every cell has its output and execution count removed
- Tagged cells are processed:
  - Cells with the clear tag have their source code replaced with placeholder text
  - Cells with the omit tag are removed entirely from the output
- Notebook metadata: An exercise_version flag is added to the notebook metadata
- Error handling: Invalid notebooks produce helpful error messages
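The behavior listed above can be approximated in a few lines operating on the notebook JSON. This is a simplified sketch and not the tool's implementation: it handles cell tags only (no source-based markers), the scrub function name is hypothetical, and the value True for exercise_version is an assumption, since only the flag's name is documented here.

```python
import copy


def scrub(notebook, clear_tag="scrub-clear", omit_tag="scrub-omit",
          clear_text="# TODO: Implement this"):
    """Return a scrubbed copy of a notebook dict, leaving the input untouched."""
    result = copy.deepcopy(notebook)
    kept = []
    for cell in result.get("cells", []):
        tags = cell.get("metadata", {}).get("tags", [])
        if omit_tag in tags:
            continue  # omitted cells are dropped from the output
        if clear_tag in tags:
            cell["source"] = clear_text  # replace the cell contents
        if cell.get("cell_type") == "code":
            # Outputs and execution counts exist only on code cells.
            cell["outputs"] = []
            cell["execution_count"] = None
        kept.append(cell)
    result["cells"] = kept
    # Assumed value: the flag name comes from the README, its value does not.
    result.setdefault("metadata", {})["exercise_version"] = True
    return result
```

Working on a deep copy keeps the lecture notebook intact, which matches the tool's stdin-to-stdout model where the input file is never modified.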
License
Apache License 2.0
Contributing
Contributions are welcome! Please feel free to submit a pull request, but note that it should include comprehensive test coverage and a clear justification for why it should be considered, keeping in mind that new features increase the maintenance burden.