A package manager for Jupyter notebook templates

These details have not been verified by PyPI

Project description

notebookpkg

A notebook template manager for ML students.
One command installs a ready-to-run Jupyter notebook, already wired to your dataset, with your column names, your target, and your drop columns injected automatically.

No more writing the same boilerplate for every assignment. Just pick a template, point it to your CSV, and open Jupyter.

Installation

pip install notebookpkg

Requirements: Python 3.7+, pandas, numpy, scikit-learn, matplotlib, seaborn, nbformat, click

How It Works

1. You run one command with your CSV file.
2. The tool reads your dataset and detects all column names and types.
3. It injects your dataset path, column names, target column, and drop columns into the template.
4. A .ipynb file is created in your current folder.
5. Open it in Jupyter and run all cells; everything is pre-filled.
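
To make the detection step above concrete, here is a minimal sketch of how column names and types could be read from a CSV. It is not the package's actual profiler: `profile_csv` is a hypothetical helper, it uses only the standard library `csv` module (the real tool lists pandas as a dependency), and the rule "numeric if every non-empty value parses as a float" is an assumption.

```python
import csv
import io


def profile_csv(handle, drop=()):
    """Classify CSV columns as numeric or categorical (illustrative only)."""
    reader = csv.DictReader(handle)
    # Hypothetical behaviour: dropped columns are ignored before detection.
    columns = [c for c in (reader.fieldnames or []) if c not in drop]
    values = {c: [] for c in columns}
    for row in reader:
        for c in columns:
            cell = (row[c] or "").strip()
            if cell:  # skip empty cells so missing values do not decide the type
                values[c].append(cell)

    def is_number(text):
        try:
            float(text)
            return True
        except ValueError:
            return False

    numeric = [c for c in columns
               if values[c] and all(is_number(v) for v in values[c])]
    categorical = [c for c in columns if c not in numeric]
    return {"columns": columns, "numeric": numeric, "categorical": categorical}


# Tiny in-memory CSV standing in for a real dataset file.
sample = io.StringIO("YearsExperience,Salary,Dept\n1.1,39343,HR\n2.0,43525,IT\n")
profile = profile_csv(sample)
print(profile)
```

In this sketch the result is a plain dictionary, which would be enough to fill the column-related placeholders in a template.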

Quick Start

# Step 1: See all available templates
notebookpkg list

# Step 2: Install a template for your CSV
notebookpkg install linear-regression --dataset Salary_Data.csv --target Salary

# Step 3: Open the notebook
jupyter notebook linear-regression_notebook.ipynb

Commands

`notebookpkg list`

Lists all available templates with their descriptions.

notebookpkg list

Output:

📦 Available Templates:

  decision-tree                       Decision Tree: criterion=entropy, max_depth=5, plot_tree, accuracy, report
  eda-basic                           Basic EDA: head, shape, info, describe, nulls, dtypes, nunique
  eda-full                            Full EDA: visual + outliers, skewness, duplicates, value counts
  eda-visual                          Visual EDA: pairplot, heatmap, distributions
  kmeans-clustering                   KMeans Clustering: StandardScaler, elbow method, silhouette score, cluster plot
  knn-classifier                      KNN Classifier: StandardScaler, fit, accuracy, confusion matrix, report
  lasso-ridge                         Linear + Lasso + Ridge Regression with StandardScaler and coefficient plots
  linear-regression                   Linear Regression: EDA, fit, predict, visualize, MSE, R²
  logistic-regression                 Logistic Regression: StandardScaler, fit, accuracy, confusion matrix, report
  multi-model-compare                 LR + KNN + Naive Bayes on same dataset with accuracy comparison
  naive-bayes                         Gaussian Naive Bayes: StandardScaler, fit, accuracy, confusion matrix heatmap
  polynomial-regression               Polynomial Regression: PolynomialFeatures, smooth curve plot, MSE, R²
  random-forest-classifier            Random Forest Classifier: model1, accuracy, confusion matrix, feature importance
  random-forest-regressor             Random Forest Regressor: RFR, fit, MSE, R², Actual vs Predicted scatter
  svm-classifier                      SVM: Linear kernel, then RBF kernel with AgeSalary feature engineering

`notebookpkg install`

Installs a template wired to your dataset.

notebookpkg install <template-name> --dataset <path-to-csv> [options]

All options:

Option	Required	Default	Description
`--dataset`	Yes	-	Path to your CSV file
`--target`	No	Last column	Target/label column name
`--drop`	No	None	Columns to drop, comma-separated
`--degree`	No	`2`	Polynomial degree, only for `polynomial-regression`
`--clusters`	No	`3`	Number of clusters, only for `kmeans-clustering`
`--output`	No	`<template>_notebook.ipynb`	Custom output filename
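
The defaults in the table can be read as a small resolution step that runs before the notebook is generated. The sketch below is only an illustration: `resolve_options` is a hypothetical function, not part of the package's API, and it assumes that "Last column" means the last column remaining after `--drop` is applied.

```python
def resolve_options(template, columns, target=None, drop=None,
                    degree=2, clusters=3, output=None):
    """Apply the documented defaults to raw option values (hypothetical helper)."""
    # --drop is a comma-separated string such as "User ID,Gender".
    drop_cols = [name.strip() for name in (drop or "").split(",") if name.strip()]
    remaining = [c for c in columns if c not in drop_cols]
    if target is None:
        # Assumption: the default target is the last column left after dropping.
        target = remaining[-1]
    if output is None:
        output = f"{template}_notebook.ipynb"
    return {
        "target": target,
        "drop": drop_cols,
        "degree": degree,
        "clusters": clusters,
        "output": output,
    }


options = resolve_options(
    "logistic-regression",
    ["User ID", "Gender", "Age", "EstimatedSalary", "Purchased"],
    drop="User ID,Gender",
)
print(options)
```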

`notebookpkg syntax`

Prints the complete code of a template, every cell in order, directly in your terminal. Use this to preview exactly what will be generated before installing.

notebookpkg syntax <template-name>

Example:

notebookpkg syntax logistic-regression

Output:

============================================================
  Template : logistic-regression
  Logistic Regression: StandardScaler, fit, accuracy, confusion matrix, report
  Total cells: 16
============================================================

── Cell 1 ──────────────────────────────────────────────────
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

── Cell 2 ──────────────────────────────────────────────────
df = pd.read_csv('{{DATASET_PATH}}')
df.head()

── Cell 3 ──────────────────────────────────────────────────
{{DROP_CODE}}

... (all remaining cells shown in full)

============================================================
  Install this template:
  notebookpkg install logistic-regression --dataset yourdata.csv
============================================================

You can run syntax for any of the 15 templates:

notebookpkg syntax eda-basic
notebookpkg syntax eda-visual
notebookpkg syntax eda-full
notebookpkg syntax linear-regression
notebookpkg syntax polynomial-regression
notebookpkg syntax logistic-regression
notebookpkg syntax knn-classifier
notebookpkg syntax naive-bayes
notebookpkg syntax lasso-ridge
notebookpkg syntax decision-tree
notebookpkg syntax random-forest-regressor
notebookpkg syntax random-forest-classifier
notebookpkg syntax svm-classifier
notebookpkg syntax kmeans-clustering
notebookpkg syntax multi-model-compare
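
A preview like the one printed by `notebookpkg syntax` could be produced by walking the cells of a template notebook. The following is a rough sketch, not the package's code: `format_preview` is a hypothetical name, the notebook is passed as a plain dictionary in the standard .ipynb JSON layout, and counting every cell (not only code cells) toward "Total cells" is an assumption.

```python
def format_preview(name, description, notebook, width=60):
    """Render a template's cells as plain text (illustrative only)."""
    cells = notebook.get("cells", [])
    rule = "=" * width
    lines = [
        rule,
        f"  Template : {name}",
        f"  {description}",
        f"  Total cells: {len(cells)}",
        rule,
        "",
    ]
    for index, cell in enumerate(cells, start=1):
        source = cell.get("source", "")
        # In .ipynb files "source" may be a string or a list of strings.
        text = "".join(source) if isinstance(source, list) else source
        header = f"── Cell {index} "
        lines.append(header + "─" * max(0, width - len(header)))
        lines.append(text)
        lines.append("")
    return "\n".join(lines)


demo_notebook = {
    "cells": [
        {"cell_type": "code", "source": ["import pandas as pd\n", "import numpy as np"]},
        {"cell_type": "code", "source": "{{DROP_CODE}}"},
    ]
}
preview = format_preview("demo", "Demo template", demo_notebook)
print(preview)
```

Placeholders such as `{{DROP_CODE}}` are printed as they are, since nothing has been injected at preview time.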

Templates

EDA Templates

`eda-basic`

Basic Exploratory Data Analysis. Covers the essential checks every notebook needs.

Cells generated:

Imports
pd.read_csv() + df.head()
Drop columns cell (optional)
df.shape
df.info()
df.describe()
df.isnull().sum()
df.dtypes
df.nunique()

notebookpkg install eda-basic --dataset data.csv

`eda-visual`

EDA with all key visualizations.

Cells generated: Everything in eda-basic, plus:

sns.pairplot(df)
Correlation heatmap (df.corr() + sns.heatmap())
Histogram for each numeric column

notebookpkg install eda-visual --dataset data.csv

`eda-full`

Complete EDA including outlier detection and categorical analysis.

Cells generated: Everything in eda-visual, plus:

df.duplicated().sum()
Boxplot for each numeric column
Skewness: df.skew(numeric_only=True)
IQR outlier count for each numeric column
value_counts() for each categorical column

notebookpkg install eda-full --dataset data.csv
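
For readers new to the IQR outlier count mentioned above, here is a small standard-library sketch of the idea. The function name `iqr_outlier_count` and the 1.5 multiplier are assumptions (1.5 is the common convention; the template's exact rule is not stated here), and the generated notebook presumably does this with pandas instead.

```python
import statistics


def iqr_outlier_count(values, k=1.5):
    """Count values outside [Q1 - k*IQR, Q3 + k*IQR] (illustrative only)."""
    # method="inclusive" uses linear interpolation between data points.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return sum(1 for v in values if v < low or v > high)


print(iqr_outlier_count([1, 2, 3, 4, 5, 100]))  # prints 1 (only 100 is flagged)
```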

Regression Templates

`linear-regression`

Standard Linear Regression pipeline on your CSV.

Cells generated:

Imports
Load dataset + head
Drop columns cell
shape, info, describe, isnull
pairplot
Correlation heatmap
X / y split (iloc)
train_test_split (test_size=0.2, random_state=0)
regressor = LinearRegression() + fit
Predict
Visualize training data (scatter + regression line)
Visualize testing data
Coefficient and intercept
MSE
R²

notebookpkg install linear-regression --dataset Salary_Data.csv --target Salary
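
The last two cells report MSE and R². As a reminder of what those numbers mean, here is a plain-Python sketch of both metrics. The function names `mse` and `r_squared` are hypothetical; the generated notebook presumably relies on scikit-learn for the actual computation.

```python
def mse(y_true, y_pred):
    """Mean squared error: the average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """R²: 1 minus (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


actual = [3.0, 5.0, 7.0]
predicted = [2.5, 5.0, 7.5]
print(mse(actual, predicted))
print(r_squared(actual, predicted))
```

A lower MSE and an R² closer to 1 both indicate that the predictions track the actual values more closely.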

`polynomial-regression`

Polynomial Regression with smooth curve visualization.

Cells generated:

Imports (includes PolynomialFeatures)
Load dataset + head
Drop columns cell
info, describe, pairplot, heatmap
X / y split
PolynomialFeatures(degree=N) + transform
train_test_split
plr = LinearRegression() + fit
Smooth curve plot using X_gride
Predict
MSE
R²

notebookpkg install polynomial-regression --dataset hw.csv --target Price
notebookpkg install polynomial-regression --dataset hw.csv --target Price --degree 3

`lasso-ridge`

Linear Regression + Lasso + Ridge, all on the same dataset with comparison.

Cells generated:

Imports
Load + EDA (info, describe, columns, shape)
Drop columns cell
X / y split
train_test_split
StandardScaler
Linear Regression (lm) + coefficient barh plot
Lasso (alpha=0.1) + MSE + R² + coefficient barh plot
Ridge (alpha=0.1) + MSE + R²

notebookpkg install lasso-ridge --dataset BostonHousing.csv --target medv

Classification Templates

`logistic-regression`

Logistic Regression with StandardScaler.

Cells generated:

Imports
Load dataset + head
Drop columns cell
shape, info, describe, isnull
Correlation heatmap
X / y split
train_test_split (test_size=0.3, random_state=0)
sc = StandardScaler() + fit_transform / transform
lr = LogisticRegression() + fit
Predict
Accuracy score
Confusion matrix
Classification report

notebookpkg install logistic-regression --dataset Day5.csv --target Purchased
notebookpkg install logistic-regression --dataset Day5.csv --target Purchased --drop "User ID,Gender"

`knn-classifier`

K-Nearest Neighbors Classifier with StandardScaler.

Cells generated:

Imports
Load dataset + head
Drop columns cell
shape, info, describe, isnull, duplicated
Correlation heatmap + pairplot
X / y split
train_test_split (test_size=0.2, random_state=42)
StandardScaler
knn = KNeighborsClassifier() + fit
Predict
Accuracy, confusion matrix, classification report

notebookpkg install knn-classifier --dataset Day5.csv --target Purchased

`naive-bayes`

Gaussian Naive Bayes with StandardScaler and confusion matrix heatmap.

Cells generated:

Imports
Load + shape, describe, isnull
Drop columns cell
Correlation heatmap
X / y split
train_test_split with stratify=y
StandardScaler (fit only on train)
nb = GaussianNB() + fit
Predict
Accuracy
Classification report
Confusion matrix as sns.heatmap

notebookpkg install naive-bayes --dataset Day5.csv --target Purchased

`decision-tree`

Decision Tree Classifier with tree visualization.

Cells generated:

Imports (includes from sklearn import tree)
Load + EDA
Drop columns cell
Distribution plot + heatmap + pairplot
X / y split
train_test_split
StandardScaler
DecisionTreeClassifier(criterion='entropy', max_depth=5, random_state=0)
Predict
Accuracy score
Confusion matrix
Classification report
tree.plot_tree(): full visual tree diagram

notebookpkg install decision-tree --dataset SNP.csv --target Purchased

`svm-classifier`

SVM with both Linear and RBF kernels, plus feature engineering.

Cells generated:

Imports (includes SVC)
Load + EDA (info, describe, isnull, value_counts)
Drop columns cell
Scatter plot of features
X / y split
train_test_split
StandardScaler
model = SVC(kernel='linear') + fit + predict + accuracy + CM + heatmap
Feature engineering: df['AgeSalary'] = df['Age'] * df['EstimatedSalary']
Re-split with new feature
model1 = SVC(kernel='rbf') + fit + predict + accuracy + CM + heatmap

notebookpkg install svm-classifier --dataset SNP.csv --target Purchased

`multi-model-compare`

Runs Logistic Regression, KNN, and Naive Bayes on the same dataset and compares accuracy.

Cells generated:

Imports
Load + EDA
Drop columns cell
X / y split
train_test_split
model_lr = LogisticRegression() → fit → predict → accuracy → report
model_knn = KNeighborsClassifier() → fit → predict → accuracy → report
model_nb = GaussianNB() → fit → predict → accuracy → report
Comparison dict with all three accuracy scores printed together

notebookpkg install multi-model-compare --dataset Day5.csv --target Purchased

Ensemble Templates

`random-forest-regressor`

Random Forest Regressor with actual vs predicted scatter plot.

Cells generated:

Imports
Load + isnull, duplicated, info, describe
Drop columns cell
Correlation heatmap
X / y split
train_test_split (test_size=0.2, random_state=42)
RFR = RandomForestRegressor(n_estimators=100, random_state=42) + fit
Predict
MSE
R²
Scatter plot: Actual vs Predicted

notebookpkg install random-forest-regressor --dataset housing.csv --target Price

`random-forest-classifier`

Random Forest Classifier with feature importance bar chart.

Cells generated:

Imports
Load + EDA
Drop columns cell
X / y split
train_test_split
StandardScaler
model1 = RandomForestClassifier(n_estimators=100, random_state=42) + fit
Predict
Accuracy
Classification report
Confusion matrix heatmap
Feature importance: model1.feature_importances_
Bar chart of feature importance

notebookpkg install random-forest-classifier --dataset iris.csv --target species

Clustering Templates

`kmeans-clustering`

KMeans Clustering with elbow method and silhouette score. No target column needed.

Cells generated:

Imports (includes KMeans, silhouette_score)
Load + shape, info, describe, isnull, duplicated
Drop columns cell
pairplot
Correlation heatmap
StandardScaler on numeric columns
Elbow method loop (k=1 to 9) + inertia plot
KMeans(n_clusters=N) + fit
Cluster labels added to df
Cluster scatter plot with centroids marked in red
Silhouette score

notebookpkg install kmeans-clustering --dataset Mall_Customers.csv
notebookpkg install kmeans-clustering --dataset Mall_Customers.csv --clusters 5

The `--drop` Option

Many real datasets have ID columns, name columns, or other columns that should not go into the model. Use --drop to remove them before anything is processed.

With --drop, the generated notebook gets:

df = df.drop(columns=['User ID', 'Gender'], axis=1)
df.head()

Without --drop, the cell appears as a comment so you can still do it manually:

# No columns dropped
# To drop columns use: df = df.drop(columns=['col1','col2'], axis=1)

The profiler also respects the drop: column detection for NUMERIC_COLS, CAT_COLS, and FEATURE_COLS happens after the drop, so the rest of the notebook is consistent.

# Drop one column
notebookpkg install knn-classifier --dataset Day5.csv --target Purchased --drop "User ID"

# Drop multiple columns
notebookpkg install logistic-regression --dataset Day5.csv --target Purchased --drop "User ID,Gender"
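
The two variants of the drop cell shown earlier in this section could come from a helper along these lines. This is a sketch under assumptions: `build_drop_code` is a hypothetical name, and the real injector may build the cell differently.

```python
def build_drop_code(drop_cols):
    """Return the source of the drop-columns cell (illustrative only)."""
    if drop_cols:
        # repr() of a list of strings gives the ['a', 'b'] form used in the notebook.
        return f"df = df.drop(columns={list(drop_cols)!r}, axis=1)\ndf.head()"
    return (
        "# No columns dropped\n"
        "# To drop columns use: df = df.drop(columns=['col1','col2'], axis=1)"
    )


print(build_drop_code(["User ID", "Gender"]))
print(build_drop_code([]))
```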

All Usage Examples

# ── EDA ─────────────────────────────────────────────────────────────────────
notebookpkg install eda-basic   --dataset data.csv
notebookpkg install eda-visual  --dataset data.csv
notebookpkg install eda-full    --dataset data.csv

# ── Regression ──────────────────────────────────────────────────────────────
notebookpkg install linear-regression      --dataset Salary_Data.csv --target Salary
notebookpkg install polynomial-regression  --dataset hw.csv --target Price --degree 3
notebookpkg install lasso-ridge            --dataset BostonHousing.csv --target medv

# ── Classification ──────────────────────────────────────────────────────────
notebookpkg install logistic-regression    --dataset Day5.csv --target Purchased
notebookpkg install knn-classifier         --dataset Day5.csv --target Purchased
notebookpkg install naive-bayes            --dataset Day5.csv --target Purchased
notebookpkg install decision-tree          --dataset SNP.csv  --target Purchased
notebookpkg install svm-classifier         --dataset SNP.csv  --target Purchased
notebookpkg install multi-model-compare    --dataset Day5.csv --target Purchased

# ── Ensemble ────────────────────────────────────────────────────────────────
notebookpkg install random-forest-regressor   --dataset housing.csv --target Price
notebookpkg install random-forest-classifier  --dataset iris.csv    --target species

# ── Clustering ──────────────────────────────────────────────────────────────
notebookpkg install kmeans-clustering --dataset Mall_Customers.csv
notebookpkg install kmeans-clustering --dataset Mall_Customers.csv --clusters 5

# ── With drop ───────────────────────────────────────────────────────────────
notebookpkg install logistic-regression --dataset Day5.csv --target Purchased --drop "User ID,Gender"

# ── Custom output filename ──────────────────────────────────────────────────
notebookpkg install linear-regression --dataset Salary_Data.csv --target Salary --output my_analysis.ipynb

Project Structure

notebookpkg/
├── notebookpkg/
│   ├── cli.py          # CLI commands: install, list
│   ├── profiler.py     # Reads CSV, detects column types
│   ├── injector.py     # Replaces tokens in notebook cells
│   ├── registry.py     # Finds templates by name
│   └── templates/
│       ├── eda-basic/
│       ├── eda-visual/
│       ├── eda-full/
│       ├── linear-regression/
│       ├── polynomial-regression/
│       ├── logistic-regression/
│       ├── knn-classifier/
│       ├── naive-bayes/
│       ├── lasso-ridge/
│       ├── decision-tree/
│       ├── random-forest-regressor/
│       ├── random-forest-classifier/
│       ├── svm-classifier/
│       ├── kmeans-clustering/
│       └── multi-model-compare/
├── build_templates.py  # Regenerates all .ipynb template files
├── setup.py
├── MANIFEST.in
└── README.md

Each template folder contains:

template.ipynb: the notebook with {{TOKEN}} placeholders
meta.json: name, description, and whether a target column is needed
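
To show how `{{TOKEN}}` placeholders might be filled, here is a minimal sketch that works on the notebook as a plain dictionary (an .ipynb file is JSON). `inject_tokens` is a hypothetical function, not the contents of injector.py; the token names `DATASET_PATH` and `DROP_CODE` are taken from the `notebookpkg syntax` output, and the real package may go through nbformat instead.

```python
import json


def inject_tokens(notebook, tokens):
    """Return a copy of a notebook dict with {{NAME}} placeholders replaced."""
    result = json.loads(json.dumps(notebook))  # deep copy; the template stays untouched
    for cell in result.get("cells", []):
        source = cell.get("source", "")
        # In .ipynb files "source" may be a string or a list of strings.
        text = "".join(source) if isinstance(source, list) else source
        for name, value in tokens.items():
            text = text.replace("{{" + name + "}}", value)
        cell["source"] = text
    return result


template = {
    "cells": [
        {"cell_type": "code",
         "source": ["df = pd.read_csv('{{DATASET_PATH}}')\n", "df.head()"]},
        {"cell_type": "code", "source": "{{DROP_CODE}}"},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}
filled = inject_tokens(
    template,
    {"DATASET_PATH": "Salary_Data.csv", "DROP_CODE": "# No columns dropped"},
)
print(filled["cells"][0]["source"])
```

Working on a copy keeps the packaged template reusable for the next install.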

Dependencies

pandas
numpy
scikit-learn
matplotlib
seaborn
nbformat
click

These are installed automatically when you run pip install notebookpkg.

Author

Priyansu Pattanaik
B.Tech, Electronics & Telecommunication
PG Diploma in AI, CDAC Kharghar
priyansupattanaikwork@gmail.com

License

MIT License. Free to use, modify, and distribute.

Project details

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Release history

3.0.0

May 6, 2026

2.0.0

May 5, 2026

This version

1.4.0

May 4, 2026

1.3.0

May 4, 2026

1.2.1

May 2, 2026

1.2.0

May 2, 2026

Download files

Source Distribution

notebookpkg-1.4.0.tar.gz (26.2 kB)

Uploaded May 4, 2026 Source

Built Distribution

notebookpkg-1.4.0-py3-none-any.whl (35.0 kB)

Uploaded May 4, 2026 Python 3

File details

Details for the file notebookpkg-1.4.0.tar.gz.

File metadata

Download URL: notebookpkg-1.4.0.tar.gz
Upload date: May 4, 2026
Size: 26.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for notebookpkg-1.4.0.tar.gz
Algorithm	Hash digest
SHA256	`63b5bccb050f8331e9d7381b1bd74914ba2f587acf45bb4a74bd4fb5d1e98e26`
MD5	`612289630d4804f5d4963b48b59f6581`
BLAKE2b-256	`3232a2ce339ec1ae33cff28cb72f1b854d36e9d0e084a91092870ba5293cda6e`

File details

Details for the file notebookpkg-1.4.0-py3-none-any.whl.

File metadata

Download URL: notebookpkg-1.4.0-py3-none-any.whl
Upload date: May 4, 2026
Size: 35.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for notebookpkg-1.4.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`5b6b8c43d2ef5251e7fd7c4ca8ce9f12800dfa0f714467530b4641c85d00106a`
MD5	`03b8696041192e936980b784794177d4`
BLAKE2b-256	`adfb4afec3e374a770b1169e262a028fd45365267deff06434e784d70b067f24`

notebookpkg 1.4.0
