A comprehensive, modern CLI toolkit that solves all major Jupyter notebook pain points in one unified interface.

nbctl

The Swiss Army Knife for Jupyter Notebooks

A comprehensive, production-ready CLI toolkit for Jupyter notebooks that solves all major pain points: version control, collaboration, code quality, security, and workflow automation.

Features

Clean - Remove outputs and metadata for git
Info - Analyze notebook statistics and dependencies
Export - Convert to HTML, PDF, Markdown, Python, etc.
Extract - Extract outputs (images, graphs, data) from notebooks
ML-Split - Split ML notebooks into production Python pipelines
Run - Execute notebooks from command line
Lint - Check code quality and best practices
Format - Auto-format with black
Git Setup - Configure git for notebooks
Diff - Compare notebooks intelligently
Combine - Concatenate notebooks
Resolve - 3-way merge with conflict detection (powered by nbdime)
Security - Find security vulnerabilities

Installation

pip install nbctl

Or install from source:

git clone https://github.com/VenkatachalamSubramanianPeriyaSubbu/nbctl.git
cd nbctl
pip install -e .

Quick Start

Clean notebooks for git

nbctl clean notebook.ipynb

Removes: Outputs, execution counts, metadata
Result: Smaller files, cleaner diffs, fewer conflicts

Get notebook insights

nbctl info notebook.ipynb

Shows: Statistics, code metrics, dependencies, imports

Scan for security issues

nbctl security notebook.ipynb

Detects: Hardcoded secrets, SQL injection, unsafe pickle, and more

Extract outputs from notebooks

nbctl extract notebook.ipynb

Extracts: Images (PNG, JPEG, SVG), data (JSON, CSV, DataFrames)
Saves to: outputs/data/ and outputs/images/

Split ML notebook into Python pipeline

nbctl ml-split ml_notebook.ipynb
cd ml_pipeline && python main.py

Creates: Production-ready Python modules with automatic context passing

Compare notebooks

nbctl diff notebook1.ipynb notebook2.ipynb

Compares: Only source code (ignores outputs/metadata)

Resolve merge conflicts

nbctl resolve base.ipynb ours.ipynb theirs.ipynb -o merged.ipynb

Uses: nbdime's intelligent 3-way merge with conflict detection

📚 Commands Reference

`nbctl clean`

Remove outputs and metadata from notebooks for version control.

nbctl clean notebook.ipynb [OPTIONS]

Options:

--output, -o PATH - Save to different file
--keep-outputs - Preserve cell outputs
--keep-execution-count - Preserve execution counts
--keep-metadata - Preserve metadata
--dry-run - Preview changes without modifying

Examples:

# Clean in place
nbctl clean notebook.ipynb

# Preview changes
nbctl clean notebook.ipynb --dry-run

# Save to new file
nbctl clean notebook.ipynb -o clean.ipynb
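
To make the effect of cleaning concrete, here is a minimal sketch of the idea on an in-memory notebook in the nbformat v4 JSON layout. It is not the tool's implementation: the function name `clean_notebook` and its keyword arguments are hypothetical and only mirror the `--keep-*` options above, and stripping cell-level metadata while leaving notebook-level metadata alone is an assumption.

```python
import copy


def clean_notebook(nb, keep_outputs=False, keep_execution_count=False,
                   keep_metadata=False):
    """Return a cleaned copy of a notebook dict (hypothetical helper)."""
    cleaned = copy.deepcopy(nb)  # never mutate the caller's notebook
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            if not keep_outputs:
                cell["outputs"] = []
            if not keep_execution_count:
                cell["execution_count"] = None
        if not keep_metadata:
            # Assumption: only cell-level metadata is stripped
            cell["metadata"] = {}
    return cleaned


notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {"kernelspec": {"name": "python3"}},
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 7,
            "metadata": {"scrolled": True},
            "source": "print('hi')",
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": "hi\n"}
            ],
        }
    ],
}

cleaned = clean_notebook(notebook)
```

Because outputs and execution counts change on every run, dropping them is what keeps diffs limited to the source you actually edited.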

`nbctl info`

Display comprehensive notebook statistics and analysis.

nbctl info notebook.ipynb [OPTIONS]

Options:

--code-metrics - Show only code metrics
--imports - Show only import statements

Shows:

Cell counts (code, markdown, raw)
File size
Code metrics (lines, complexity, empty cells)
All import statements and dependencies

Examples:

# Full analysis
nbctl info notebook.ipynb

# Just imports
nbctl info notebook.ipynb --imports
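
One plausible way to gather import statements from a notebook is to parse each code cell with Python's `ast` module. The sketch below assumes that approach; `collect_imports` is a hypothetical name, and skipping cells that fail to parse (for example cells with IPython magics) is a simplification, not documented behavior.

```python
import ast


def collect_imports(nb):
    """Return the sorted top-level modules imported by a notebook's code cells."""
    modules = set()
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):  # on disk, source may be a list of lines
            source = "".join(source)
        try:
            tree = ast.parse(source)
        except SyntaxError:
            continue  # e.g. cells containing IPython magics such as %matplotlib
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                modules.update(alias.name.split(".")[0] for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
                modules.add(node.module.split(".")[0])
    return sorted(modules)


example = {
    "cells": [
        {"cell_type": "markdown", "source": "# Analysis"},
        {"cell_type": "code",
         "source": "import numpy as np\nfrom sklearn.model_selection import train_test_split"},
        {"cell_type": "code", "source": "%matplotlib inline\nimport os"},
    ]
}
imports = collect_imports(example)
```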

`nbctl export`

Convert notebooks to multiple formats simultaneously.

nbctl export notebook.ipynb --format FORMATS [OPTIONS]

Supported Formats:

html - HTML document
pdf - PDF (requires LaTeX)
markdown, md - Markdown
python, py - Python script
latex, tex - LaTeX
rst - reStructuredText
slides - Reveal.js presentations

Options:

--format, -f - Output formats (comma-separated, required)
--output-dir, -o - Output directory
--no-input - Exclude input cells
--no-prompt - Exclude prompts

Examples:

# Export to multiple formats
nbctl export notebook.ipynb -f html,pdf,py

# Export without input cells
nbctl export notebook.ipynb -f html --no-input

# Export presentation
nbctl export notebook.ipynb -f slides

`nbctl extract`

Extract outputs (images, graphs, data) from notebook cells.

nbctl extract notebook.ipynb [OPTIONS]

Features:

Extract data: JSON, CSV, HTML tables (DataFrames), text
Extract images: PNG, JPEG, SVG (matplotlib plots, graphs)
Organized folders: outputs/data/ and outputs/images/
Traceable filenames: cell_{idx}_output_{idx}_type_{counter}.ext

Options:

--output, -o PATH - Output directory (default: outputs/)
--data - Extract only data outputs
--images - Extract only image outputs
--all - Extract all outputs without prompting

Interactive Mode:

# Prompts you to choose: both/data/images/all
nbctl extract notebook.ipynb

Examples:

# Interactive mode
nbctl extract ml_analysis.ipynb

# Extract everything
nbctl extract ml_analysis.ipynb --all

# Only images (plots, graphs)
nbctl extract ml_analysis.ipynb --images

# Only data (CSV, JSON, DataFrames)
nbctl extract ml_analysis.ipynb --data

# Custom output directory
nbctl extract ml_analysis.ipynb --output my_outputs/

Output Structure:

outputs/
├── data/
│   ├── cell_0_output_0_data_0.json
│   ├── cell_1_output_0_data_1.html   # DataFrame
│   └── cell_2_output_0_data_2.csv
└── images/
    ├── cell_3_output_0_img_0.png     # Matplotlib plot
    ├── cell_4_output_0_img_1.svg     # Vector graphic
    └── cell_5_output_0_img_2.jpeg
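
The naming scheme above can be illustrated with a short sketch that walks a notebook's outputs and plans which files to write. It is an assumption-laden illustration, not the tool's code: `plan_extraction` and the `KINDS` table are hypothetical, only a few MIME types are handled, and the per-type counters are inferred from the example tree.

```python
import base64
import json

# MIME type -> (folder, type label used in the filename, extension)
KINDS = {
    "image/png": ("images", "img", "png"),
    "image/jpeg": ("images", "img", "jpeg"),
    "image/svg+xml": ("images", "img", "svg"),
    "application/json": ("data", "data", "json"),
    "text/html": ("data", "data", "html"),
}


def plan_extraction(nb):
    """Return (relative_path, bytes) pairs for every recognised output."""
    counters = {"img": 0, "data": 0}  # one running counter per type
    files = []
    for cell_idx, cell in enumerate(nb.get("cells", [])):
        for out_idx, output in enumerate(cell.get("outputs", [])):
            for mime, payload in output.get("data", {}).items():
                if mime not in KINDS:
                    continue
                folder, kind, ext = KINDS[mime]
                name = (f"{folder}/cell_{cell_idx}_output_{out_idx}"
                        f"_{kind}_{counters[kind]}.{ext}")
                counters[kind] += 1
                if mime in ("image/png", "image/jpeg"):
                    content = base64.b64decode(payload)  # stored as base64 text
                elif mime == "application/json":
                    content = json.dumps(payload).encode("utf-8")
                else:
                    text = "".join(payload) if isinstance(payload, list) else payload
                    content = text.encode("utf-8")
                files.append((name, content))
    return files


png_bytes = b"\x89PNG fake image bytes"
demo = {
    "cells": [
        {"cell_type": "code",
         "outputs": [{"output_type": "display_data",
                      "data": {"application/json": {"rows": 3}}}]},
        {"cell_type": "markdown"},
        {"cell_type": "code",
         "outputs": [{"output_type": "display_data",
                      "data": {"image/png": base64.b64encode(png_bytes).decode("ascii")}}]},
    ]
}
planned = plan_extraction(demo)
```

Encoding the cell and output indices in the filename is what makes each file traceable back to the cell that produced it.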

`nbctl ml-split`

Split ML notebooks into production-ready Python pipeline modules.

nbctl ml-split notebook.ipynb [OPTIONS]

Features:

Intelligent section detection - Recognizes 7 ML workflow patterns
Context passing - Variables flow between pipeline steps
Complete package - Generates __init__.py + main.py runner
Auto-dependencies - Creates requirements.txt from imports

Detected Sections:

Data Collection
Data Preprocessing/Cleaning
Feature Engineering
Data Splitting (train/test)
Model Training
Model Evaluation
Model Saving

Options:

--output, -o PATH - Output directory (default: ml_pipeline/)
--create-main - Create main.py runner (default: True)

Examples:

# Split ML notebook into pipeline
nbctl ml-split ml_notebook.ipynb

# Custom output directory
nbctl ml-split ml_notebook.ipynb --output src/ml/

# Run the generated pipeline
cd ml_pipeline
python main.py

Generated Structure:

ml_pipeline/
├── data_collection.py # Module for each section
├── data_preprocessing.py
├── feature_engineering.py
├── data_splitting.py
├── model_training.py
├── model_evaluation.py
├── model_saving.py
├── __init__.py # Package init
├── main.py # Pipeline runner
└── requirements.txt # Auto-generated deps

How It Works:

Analyzes markdown headers in your notebook
Groups code cells by ML workflow section
Generates Python modules with run(context) functions
Creates main.py that executes the entire pipeline
Variables pass automatically between steps

Each Module:

def run(context=None):
    """Execute pipeline step with context from previous steps"""
    # Your notebook code here
    return locals()  # Pass variables to next step

Main Pipeline:

# Executes all steps in sequence
context = data_collection.run()
context = data_preprocessing.run(context) # Gets 'df' from step 1
context = feature_engineering.run(context) # Gets 'df' from step 2
# ... and so on
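
The context-passing pattern can be tried end to end with a small self-contained sketch. The step functions and `run_pipeline` below are hypothetical stand-ins for generated modules, and they pass an explicit dictionary instead of `locals()` so the data flow is easier to follow; the generated code may differ.

```python
def load_data(context=None):
    """First step: put raw values into the shared context."""
    context = dict(context or {})  # copy so earlier contexts stay untouched
    context["rows"] = [1.0, 2.0, 3.0, 4.0]
    return context


def preprocess(context=None):
    """Second step: read 'rows' from the previous step and add 'centered'."""
    context = dict(context or {})
    mean = sum(context["rows"]) / len(context["rows"])
    context["centered"] = [x - mean for x in context["rows"]]
    return context


def run_pipeline(steps):
    """Call each step in order, feeding it the context returned by the last."""
    context = {}
    for step in steps:
        context = step(context)
    return context


result = run_pipeline([load_data, preprocess])
```

Each step only depends on the keys it reads from the context, so steps can be tested in isolation by handing them a hand-built dictionary.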

`nbctl run`

Execute Jupyter notebooks from the command line.

nbctl run notebook1.ipynb notebook2.ipynb [OPTIONS]

Features:

Execute notebooks in specified or alphabetical order
No timeout by default (perfect for long ML training)
Save executed notebooks with all outputs
Detailed execution summary
Error handling and reporting

Options:

--order - Run notebooks in alphabetical order
--timeout, -t INT - Timeout per cell in seconds (default: None)
--allow-errors - Continue execution even if cells fail
--save-output, -o PATH - Directory to save executed notebooks
--kernel, -k TEXT - Kernel name to use (default: python3)

Examples:

# Run single notebook
nbctl run analysis.ipynb

# Run multiple notebooks in specified order
nbctl run 01_load.ipynb 02_process.ipynb 03_analyze.ipynb

# Run all notebooks alphabetically
nbctl run *.ipynb --order

# Save executed notebooks to directory
nbctl run *.ipynb --save-output executed/

# Continue on errors
nbctl run notebook.ipynb --allow-errors

# Set timeout for safety (e.g., prevent infinite loops)
nbctl run notebook.ipynb --timeout 600

Execution Summary:

Execution Summary

┏━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━┓
┃ Notebook        ┃ Status  ┃ Time  ┃
┡━━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━┩
│ 01_load.ipynb   │ Success │ 2.3s  │
│ 02_process.ipynb│ Success │ 5.1s  │
│ 03_analyze.ipynb│ Success │ 3.7s  │
└─────────────────┴─────────┴───────┘

Total: 3 notebooks | Successful: 3 | Total time: 11.1s

Use Cases:

Execute ML training notebooks overnight
Run data pipelines in sequence
Automate report generation
Batch process multiple notebooks
CI/CD notebook testing

`nbctl lint`

Check code quality and identify issues.

nbctl lint notebook.ipynb [OPTIONS]

Checks:

Unused imports
Overly long cells
Empty code cells
Code quality issues

Options:

--max-cell-length INT - Max lines per cell (default: 100)

Examples:

# Standard linting
nbctl lint notebook.ipynb

# Custom cell length limit
nbctl lint notebook.ipynb --max-cell-length 150
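
As a rough illustration of the checks listed above, the following sketch lints a single code cell. It is a simplified assumption of how such checks might work: `lint_cell` is a hypothetical name, and a real linter would track name usage across the whole notebook, not one cell at a time.

```python
import ast


def lint_cell(source, max_cell_length=100):
    """Return a list of issue strings for one code cell."""
    if not source.strip():
        return ["empty code cell"]
    issues = []
    line_count = len(source.splitlines())
    if line_count > max_cell_length:
        issues.append(f"cell has {line_count} lines (limit {max_cell_length})")
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return issues  # magics or broken code: skip the import check
    imported = {}  # local name -> what was imported
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                imported[alias.asname or alias.name.split(".")[0]] = alias.name
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                if alias.name != "*":
                    imported[alias.asname or alias.name] = alias.name
    used = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    for local_name in sorted(imported):
        if local_name not in used:
            issues.append(f"unused import: {imported[local_name]}")
    return issues


report = lint_cell("import os\nimport sys\nprint(sys.argv)")
```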

`nbctl format`

Auto-format code cells with black.

nbctl format notebook.ipynb [OPTIONS]

Options:

--output-dir, -o - Output directory
--line-length INT - Max line length (default: 88)

Examples:

# Format in place
nbctl format notebook.ipynb

# Custom line length
nbctl format notebook.ipynb --line-length 100

`nbctl git-setup`

Configure git for optimal notebook workflows.

nbctl git-setup

Configures:

.gitattributes for notebook handling
.gitignore for Python projects
Custom diff driver using nbctl
Custom merge driver using nbctl

Run once per repository to enable git integration.

`nbctl diff`

Compare notebooks intelligently (ignores outputs and metadata).

nbctl diff notebook1.ipynb notebook2.ipynb [OPTIONS]

Options:

--format, -f - Output format: table, unified, json (default: table)
--code-only - Show only code cell changes
--stats - Show only statistics

Features:

Ignores outputs and metadata
Focuses on actual code changes
Multiple output formats

Examples:

# Table view (default)
nbctl diff old.ipynb new.ipynb

# Unified diff format
nbctl diff old.ipynb new.ipynb --format unified

# Show only code changes
nbctl diff old.ipynb new.ipynb --code-only

# JSON output for automation
nbctl diff old.ipynb new.ipynb --format json
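
To show why ignoring outputs and metadata gives quieter diffs, here is a minimal sketch that compares only code-cell sources using the standard library's `difflib`. The helper names `code_sources` and `diff_notebooks` are hypothetical, and the tool's own comparison logic may be more sophisticated.

```python
import difflib


def code_sources(nb):
    """Return the source text of every code cell, ignoring outputs and metadata."""
    sources = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            src = cell.get("source", "")
            sources.append("".join(src) if isinstance(src, list) else src)
    return sources


def diff_notebooks(old, new):
    """Return unified-diff lines between the code of two notebooks."""
    old_lines = "\n".join(code_sources(old)).splitlines()
    new_lines = "\n".join(code_sources(new)).splitlines()
    return list(difflib.unified_diff(old_lines, new_lines,
                                     "old.ipynb", "new.ipynb", lineterm=""))


old_nb = {"cells": [{"cell_type": "code", "source": "x = 1\nprint(x)",
                     "execution_count": 1,
                     "outputs": [{"output_type": "stream", "text": "1\n"}]}]}
# Same code, re-executed: only the execution count and outputs differ
rerun_nb = {"cells": [{"cell_type": "code", "source": "x = 1\nprint(x)",
                       "execution_count": 9, "outputs": []}]}
# The code itself changed
edited_nb = {"cells": [{"cell_type": "code", "source": "x = 2\nprint(x)",
                        "execution_count": 1, "outputs": []}]}

no_changes = diff_notebooks(old_nb, rerun_nb)
changes = diff_notebooks(old_nb, edited_nb)
```

A re-run of unchanged code produces an empty diff here, while an edit to the source shows up as ordinary `-`/`+` lines.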

`nbctl combine`

Concatenate or combine two notebooks.

nbctl combine notebook1.ipynb notebook2.ipynb -o output.ipynb [OPTIONS]

Strategies:

append - Concatenate all cells from both (default)
first - Keep only first notebook
second - Keep only second notebook

Options:

--output, -o - Output file (required)
--strategy - Combine strategy
--report - Show detailed report

Examples:

# Concatenate notebooks
nbctl combine analysis1.ipynb analysis2.ipynb -o full.ipynb

# Keep only first notebook (copy)
nbctl combine nb1.ipynb nb2.ipynb -o output.ipynb --strategy first

Note: For true merging with conflict detection, use `nbctl resolve`.
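
The three strategies can be pictured with a short sketch over notebook dictionaries. The `combine` function below is hypothetical; in particular, taking notebook-level metadata from the first notebook under `append` is an assumption, not documented behavior.

```python
import copy


def combine(first, second, strategy="append"):
    """Combine two notebook dicts according to a strategy name."""
    if strategy == "first":
        return copy.deepcopy(first)
    if strategy == "second":
        return copy.deepcopy(second)
    if strategy != "append":
        raise ValueError(f"unknown strategy: {strategy}")
    combined = copy.deepcopy(first)  # assumption: keep the first notebook's metadata
    combined["cells"] = (copy.deepcopy(first.get("cells", []))
                         + copy.deepcopy(second.get("cells", [])))
    return combined


nb_a = {"metadata": {"title": "a"},
        "cells": [{"cell_type": "code", "source": "a = 1"}]}
nb_b = {"metadata": {"title": "b"},
        "cells": [{"cell_type": "code", "source": "b = 2"}]}
full = combine(nb_a, nb_b)
```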

`nbctl resolve`

Intelligent 3-way merge with conflict detection (powered by nbdime).

nbctl resolve base.ipynb ours.ipynb theirs.ipynb -o merged.ipynb [OPTIONS]

Arguments:

BASE - Common ancestor (before changes)
OURS - Your version (local changes)
THEIRS - Other version (remote changes)

Options:

--output, -o - Output file (required unless --check-conflicts)
--strategy - Merge strategy: auto, ours, theirs, cell-append
--check-conflicts - Check for conflicts only (no output file needed)
--report - Show detailed merge report

Features:

Production-grade merging with nbdime
Automatic conflict detection
Conflict markers for manual resolution
Multiple merge strategies

Examples:

# Check for conflicts first
nbctl resolve base.ipynb ours.ipynb theirs.ipynb --check-conflicts

# Perform merge
nbctl resolve base.ipynb ours.ipynb theirs.ipynb -o merged.ipynb

# Use with Git
git show :1:notebook.ipynb > base.ipynb
git show :2:notebook.ipynb > ours.ipynb
git show :3:notebook.ipynb > theirs.ipynb
nbctl resolve base.ipynb ours.ipynb theirs.ipynb -o notebook.ipynb
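
The core rule of a 3-way merge is easy to sketch: a side that still matches the base made no change, so the other side wins; if both sides changed the same cell differently, that is a conflict. The code below is a deliberately simplified illustration of that rule and not nbdime's algorithm. It assumes all three versions have the same number of cells (no insertions or deletions), and `merge_cell` and `three_way_merge` are hypothetical names.

```python
def merge_cell(base, ours, theirs):
    """Merge one cell's source; return (merged_text, is_conflict)."""
    if ours == theirs:
        return ours, False    # both sides agree
    if ours == base:
        return theirs, False  # only theirs changed
    if theirs == base:
        return ours, False    # only ours changed
    marked = f"<<<<<<< ours\n{ours}\n=======\n{theirs}\n>>>>>>> theirs"
    return marked, True       # both changed differently


def three_way_merge(base_cells, ours_cells, theirs_cells):
    """Merge aligned lists of cell sources; return (merged, conflict_count)."""
    merged, conflicts = [], 0
    for base, ours, theirs in zip(base_cells, ours_cells, theirs_cells):
        text, is_conflict = merge_cell(base, ours, theirs)
        merged.append(text)
        conflicts += int(is_conflict)
    return merged, conflicts


merged, conflicts = three_way_merge(
    ["a = 1", "b = 2", "c = 3"],     # base
    ["a = 10", "b = 2", "c = 30"],   # ours
    ["a = 1", "b = 20", "c = 33"],   # theirs
)
```

In this example the first two cells merge cleanly and the third carries conflict markers for manual resolution.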

`nbctl security`

Scan notebooks for security vulnerabilities.

nbctl security notebook.ipynb [OPTIONS]

Detects:

HIGH: Hardcoded secrets (API keys, passwords, tokens)
HIGH: Unsafe pickle deserialization
HIGH: SQL injection risks
MEDIUM: Command injection (os.system, eval, exec)
MEDIUM: Unsafe YAML parsing
MEDIUM: Disabled SSL verification
LOW: Weak cryptographic algorithms (MD5, SHA1)

Options:

--severity - Filter by severity: low, medium, high, all (default: all)
--json - Output as JSON
--verbose, -v - Show detailed recommendations

Examples:

# Scan for all issues
nbctl security notebook.ipynb

# Only high severity
nbctl security notebook.ipynb --severity high

# With recommendations
nbctl security notebook.ipynb --verbose

# JSON output for CI/CD
nbctl security notebook.ipynb --json
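
A pattern-based scanner of the kind described above might look roughly like the sketch below. The rule table, the `scan_source` function, and its `min_severity` parameter are all hypothetical; the regular expressions are intentionally simple and will produce false positives (for instance, `model.eval(` matches the command-injection rule).

```python
import re

# (severity, label, pattern): a small, illustrative subset of the checks
RULES = [
    ("HIGH", "hardcoded secret",
     re.compile(r"(?i)\b(api_key|password|token|secret)\b\s*=\s*['\"][^'\"]+['\"]")),
    ("HIGH", "unsafe pickle deserialization", re.compile(r"\bpickle\.loads?\(")),
    ("MEDIUM", "command injection risk", re.compile(r"\b(os\.system|eval|exec)\(")),
    ("MEDIUM", "disabled SSL verification", re.compile(r"verify\s*=\s*False")),
    ("LOW", "weak hash algorithm", re.compile(r"\bhashlib\.(md5|sha1)\(")),
]
SEVERITY_ORDER = {"LOW": 0, "MEDIUM": 1, "HIGH": 2}


def scan_source(source, min_severity="LOW"):
    """Return (severity, line_number, label) for every rule match."""
    findings = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for severity, label, pattern in RULES:
            if SEVERITY_ORDER[severity] < SEVERITY_ORDER[min_severity]:
                continue  # below the requested threshold
            if pattern.search(line):
                findings.append((severity, line_no, label))
    return findings


cell_source = (
    'API_KEY = "sk-123"\n'
    "import hashlib\n"
    'digest = hashlib.md5(b"x")\n'
    "obj = pickle.load(f)\n"
)
all_findings = scan_source(cell_source)
high_only = scan_source(cell_source, min_severity="HIGH")
```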

Common Workflows

Setting up a new repository

# 1. Configure git for notebooks
nbctl git-setup

# 2. Clean notebooks before committing
nbctl clean *.ipynb

# 3. Check code quality
nbctl lint notebook.ipynb
nbctl format notebook.ipynb

# 4. Scan for security issues
nbctl security notebook.ipynb

Reviewing notebook changes

# Compare versions
nbctl diff old.ipynb new.ipynb --format unified

# Check what changed (code only)
nbctl diff old.ipynb new.ipynb --code-only

Resolving merge conflicts

# Check if there are conflicts
nbctl resolve base.ipynb ours.ipynb theirs.ipynb --check-conflicts

# Perform merge
nbctl resolve base.ipynb ours.ipynb theirs.ipynb -o merged.ipynb --report

# If conflicts exist, manually resolve in the merged file

Pre-commit checks

# Quality checks
nbctl lint notebook.ipynb
nbctl format notebook.ipynb
nbctl security notebook.ipynb --severity high

# Clean for commit
nbctl clean notebook.ipynb

ML Workflow - From Notebook to Production

# 1. Develop ML model in notebook
# (work on ml_model.ipynb)

# 2. Extract outputs for reports
nbctl extract ml_model.ipynb --images
# → Gets all plots and visualizations

# 3. Split into production pipeline
nbctl ml-split ml_model.ipynb --output ml_pipeline/

# 4. Test the pipeline
cd ml_pipeline
python main.py

# 5. Deploy the pipeline modules
# Each module is a standalone Python file ready for production!

Development

Setup

# Clone repository
git clone https://github.com/VenkatachalamSubramanianPeriyaSubbu/nbctl.git
cd nbctl

# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate

# Install in development mode
pip install -e ".[dev]"

Run Tests

# Run all tests
pytest tests/ -v

# Run specific test file
pytest tests/test_security.py -v

# With coverage
pytest tests/ --cov=nbctl --cov-report=html

Code Quality

# Format code
black nbctl/ tests/

# Type checking
mypy nbctl/

Why nbctl?

Jupyter notebooks are powerful but have challenges:

Problem	nbctl Solution
Massive git diffs	`clean` - Remove outputs
Merge conflicts	`resolve` - Intelligent 3-way merge
Hard to compare	`diff` - Smart comparison
Code quality issues	`lint` + `format`
Security risks	`security` - Vulnerability scanning
Manual workflows	Comprehensive CLI automation

One tool. All solutions. Production-ready.

Roadmap

Basic clean command
Info command (statistics, metrics, imports)
Export command (HTML, PDF, Markdown, etc.)
Extract command (extract outputs, images, data)
ML-Split command (ML notebook → Python pipeline)
Lint command (code quality)
Format command (black auto-format)
Git setup (integration)
Diff command (intelligent comparison)
Combine command (2-way merge)
Resolve command (3-way merge with nbdime)
Security command (vulnerability scanning)
Test runner (execute and validate)
Split command (general notebook splitting)
Template system
Cloud integration

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Fork the repository
Create your feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

License

MIT License - see LICENSE file for details.

Author

Built for the Jupyter community by Venkatachalam Subramanian Periya Subbu

Status

Version: 0.1.3
Status: Production-ready with comprehensive test coverage
New: Extract outputs & ML pipeline splitting

**Star this repo if you find it useful!**

Release history

0.1.3 (this version) - Nov 13, 2025
0.1.2 - Nov 13, 2025
0.1.1 - Nov 13, 2025
0.1.0 - Nov 13, 2025
