A production-ready CLI toolkit for training, evaluating, and tracking Machine Learning and Deep Learning models with experiment tracking, hyperparameter tuning, model explainability, and an interactive TUI

███╗   ███╗██╗      ██████╗██╗     ██╗
████╗ ████║██║     ██╔════╝██║     ██║
██╔████╔██║██║     ██║     ██║     ██║
██║╚██╔╝██║██║     ██║     ██║     ██║
██║ ╚═╝ ██║███████╗╚██████╗███████╗██║
╚═╝     ╚═╝╚══════╝ ╚═════╝╚══════╝╚═╝

🤖 MLCLI - Machine Learning Command Line Interface

A powerful, modular CLI tool for training, evaluating, and tracking ML/DL models

mlcli is a modular, configuration-driven command-line tool for training, evaluating, saving, and tracking both Machine Learning and Deep Learning models. It also includes an interactive terminal UI for users who prefer a guided workflow.

🚀 Features

Train ML models:
- Logistic Regression
- SVM
- Random Forest
- XGBoost
Train Deep Learning models:
- TensorFlow DNN
- CNN models
- RNN/LSTM/GRU models
🆕 Hyperparameter Tuning:
- Grid Search
- Random Search
- Bayesian Optimization (Optuna)
🆕 Model Explainability:
- SHAP (SHapley Additive exPlanations)
- LIME (Local Interpretable Model-agnostic Explanations)
- Feature importance visualization
- Instance-level explanations
🆕 Data Preprocessing Pipeline:
- Scaling: StandardScaler, MinMaxScaler, RobustScaler
- Normalization: L1, L2, Max norm
- Encoding: LabelEncoder, OneHotEncoder, OrdinalEncoder
- Feature Selection: SelectKBest, RFE, VarianceThreshold
- Pipeline Support: Chain multiple preprocessors
Unified configuration system (JSON/YAML)
Automatic Model Registry (plug-and-play trainers)
Model saving:
- ML → Pickle, Joblib & ONNX
- DL → SavedModel & H5
Built-in experiment tracker (mini-MLflow with JSON storage)
Interactive terminal UI (TUI)

📁 Project Structure

mlcli/
├── mlcli/
│   ├── __init__.py
│   ├── __main__.py
│   ├── cli.py
│   ├── config/
│   │   ├── __init__.py
│   │   └── loader.py
│   ├── trainers/
│   │   ├── __init__.py
│   │   ├── base_trainer.py
│   │   ├── logistic_trainer.py
│   │   ├── svm_trainer.py
│   │   ├── rf_trainer.py
│   │   ├── xgb_trainer.py
│   │   ├── tf_dnn_trainer.py
│   │   ├── tf_cnn_trainer.py
│   │   └── tf_rnn_trainer.py
│   ├── tuner/                       # Hyperparameter Tuning
│   │   ├── __init__.py
│   │   ├── base_tuner.py
│   │   ├── grid_tuner.py
│   │   ├── random_tuner.py
│   │   └── optuna_tuner.py
│   ├── explainer/                   # 🆕 Model Explainability
│   │   ├── __init__.py
│   │   ├── base_explainer.py
│   │   ├── shap_explainer.py
│   │   ├── lime_explainer.py
│   │   └── explainer_factory.py
│   ├── preprocessor/                # 🆕 Data Preprocessing Pipeline
│   │   ├── __init__.py
│   │   ├── base_preprocessor.py
│   │   ├── scalers.py
│   │   ├── normalizers.py
│   │   ├── encoders.py
│   │   ├── feature_selectors.py
│   │   ├── preprocessor_factory.py
│   │   └── pipeline.py
│   ├── utils/
│   │   ├── __init__.py
│   │   ├── io.py
│   │   ├── metrics.py
│   │   ├── logger.py
│   │   └── registry.py
│   ├── runner/
│   │   ├── __init__.py
│   │   └── experiment_tracker.py
│   ├── ui/
│   │   ├── __init__.py
│   │   ├── app.py
│   │   ├── screens/
│   │   └── widgets/
│   └── models/
├── configs/
├── data/
├── artifacts/
├── logs/
├── runs/
├── scripts/
├── README.md
├── pyproject.toml
└── requirements.txt

🛠️ Complete Setup Guide (From Scratch)

Step 1: Clone the Repository

git clone https://github.com/codeMaestro78/MLcli.git
cd MLcli

Step 2: Create Virtual Environment

Windows (PowerShell):

python -m venv .venv
.\.venv\Scripts\Activate.ps1

Windows (CMD):

python -m venv .venv
.\.venv\Scripts\activate.bat

Linux/macOS:

python -m venv .venv
source .venv/bin/activate

Step 3: Install Dependencies

pip install --upgrade pip
pip install -r requirements.txt

Step 4: Install mlcli in Development Mode

pip install -e .

Step 5: Verify Installation

mlcli --help

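Expected output (this listing does not include the tune, explain, and preprocess commands documented below, so your list may be longer):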

Usage: mlcli [OPTIONS] COMMAND [ARGS]...

  MLCLI - Machine Learning Command Line Interface

Options:
  --help  Show this message and exit.

Commands:
  eval         Evaluate a saved model on test data.
  export-runs  Export experiment runs to CSV.
  list-models  List all available model trainers.
  list-runs    List all experiment runs.
  show-run     Show details of a specific experiment run.
  train        Train a model using a configuration file.
  ui           Launch the interactive terminal UI.

📖 All CLI Commands

1. List Available Models

View all registered model trainers:

mlcli list-models

Output:

Available Model Trainers:
================================================================================
  logistic_regression    Logistic Regression Classifier           [sklearn]
  svm                    Support Vector Machine Classifier        [sklearn]
  random_forest          Random Forest Classifier                 [sklearn]
  xgboost                XGBoost Gradient Boosting Classifier     [xgboost]
  tf_dnn                 TensorFlow Dense Neural Network          [tensorflow]
  tf_cnn                 TensorFlow CNN for Image Classification  [tensorflow]
  tf_rnn                 TensorFlow RNN for Sequence Data         [tensorflow]
================================================================================

2. Train Models

Train with Configuration File

mlcli train --config <path-to-config.json>

Train Logistic Regression

mlcli train --config configs/logistic_config.json

Train Random Forest

mlcli train --config configs/rf_config.json

Train SVM

mlcli train --config configs/svm_config.json

Train XGBoost

mlcli train --config configs/xgb_config.json

Train TensorFlow DNN

mlcli train --config configs/tf_dnn_config.json

Train TensorFlow CNN (for image data)

mlcli train --config configs/tf_cnn_config.json

Train TensorFlow RNN (for sequence data)

mlcli train --config configs/tf_rnn_config.json

Train with Parameter Overrides

mlcli train --config configs/tf_dnn_config.json --epochs 50 --batch-size 64
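One way to picture how flags such as `--epochs 50 --batch-size 64` could interact with the config file is a shallow merge over `model.params`. The sketch below is an illustration only: `apply_overrides` is a hypothetical helper, and the idea that overrides land in `model.params` is an assumption, not something confirmed by the mlcli source.

```python
import copy


def apply_overrides(config, overrides):
    """Return a copy of config with non-None overrides merged into model.params.

    Hypothetical helper for illustration; not part of mlcli.
    """
    merged = copy.deepcopy(config)  # leave the loaded config untouched
    params = merged.setdefault("model", {}).setdefault("params", {})
    for key, value in overrides.items():
        if value is not None:  # None stands for "flag not given on the command line"
            params[key] = value
    return merged


config = {"model": {"type": "tf_dnn", "params": {"epochs": 20, "batch_size": 32}}}
cli_flags = {"epochs": 50, "batch_size": 64, "learning_rate": None}

result = apply_overrides(config, cli_flags)
print(result["model"]["params"])
```

Copying before merging keeps the values from the file available, which is handy if a run record should store both the file settings and the effective settings.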

3. 🆕 Hyperparameter Tuning

Tune model hyperparameters using Grid Search, Random Search, or Bayesian Optimization.

List Available Tuning Methods

mlcli list-tuners

Output:

┏━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Method   ┃ Name                             ┃ Best For                                     ┃
┡━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ grid     │ Grid Search                      │ Small parameter spaces with discrete values  │
│ random   │ Random Search                    │ Large parameter spaces, continuous params    │
│ bayesian │ Bayesian Optimization (Optuna)   │ Expensive evaluations, complex param spaces  │
└──────────┴──────────────────────────────────┴──────────────────────────────────────────────┘

Tune with Grid Search

mlcli tune --config configs/tune_rf_config.json --method grid --cv 5

Tune with Random Search

mlcli tune --config configs/tune_rf_config.json --method random --n-trials 100 --cv 5

Tune with Bayesian Optimization (Optuna)

mlcli tune --config configs/tune_xgb_config.json --method bayesian --n-trials 200 --scoring accuracy

Tune and Train Best Model

mlcli tune --config configs/tune_rf_config.json --method random --n-trials 50 --train-best

Tune Options

Option	Description
`--config`, `-c`	Path to tuning configuration file
`--method`, `-m`	Tuning method: `grid`, `random`, or `bayesian`
`--n-trials`, `-n`	Number of trials (for random/bayesian)
`--cv`	Number of cross-validation folds
`--scoring`, `-s`	Metric to optimize: `accuracy`, `f1`, `roc_auc`, `precision`, `recall`
`--output`, `-o`	Path to save tuning results (JSON)
`--train-best`	Train a model with best params after tuning

4. 🆕 Model Explainability (SHAP/LIME)

Understand why your models make predictions using SHAP and LIME.

List Available Explainers

mlcli list-explainers

Output:

┏━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Method ┃ Full Name                                       ┃ Best For                               ┃
┡━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ shap   │ SHapley Additive exPlanations                   │ Tree-based models, global explanations │
│ lime   │ Local Interpretable Model-agnostic Explanations │ Any model, local explanations          │
└────────┴─────────────────────────────────────────────────┴────────────────────────────────────────┘

Explain Model with SHAP

mlcli explain --model models/rf_model.pkl --data data/train.csv --type random_forest --method shap

Explain Model with LIME

mlcli explain --model models/xgb_model.pkl --data data/train.csv --type xgboost --method lime

Explain with Plot Output

mlcli explain -m models/rf_model.pkl -d data/train.csv -t random_forest -e shap --plot-output feature_importance.png

Explain Single Instance

Understand why a specific prediction was made:

mlcli explain-instance --model models/rf_model.pkl --data data/test.csv --type random_forest --instance 0

mlcli explain-instance -m models/xgb_model.pkl -d data/test.csv -t xgboost -i 5 -e lime

Explainability Options

Option	Description
`--model`, `-m`	Path to saved model file
`--data`, `-d`	Path to data file
`--type`, `-t`	Model type (random_forest, xgboost, logistic_regression)
`--method`, `-e`	Explanation method: `shap` or `lime`
`--num-samples`, `-n`	Number of samples to explain (default: 100)
`--output`, `-o`	Path to save explanation results (JSON)
`--plot/--no-plot`	Generate explanation plot
`--plot-output`, `-p`	Path to save plot (PNG)

Understanding SHAP vs LIME

Feature	SHAP	LIME
Type	Global + Local	Local
Theory	Game Theory (Shapley Values)	Local Surrogate Models
Best For	Tree models (RF, XGBoost)	Any black-box model
Speed	Fast for trees	Slower (samples required)
Consistency	Mathematically consistent	Varies by sampling
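To make the "Game Theory (Shapley Values)" row concrete, here is a brute-force sketch that computes exact Shapley values for a made-up three-feature model by averaging each feature's marginal contribution over every feature ordering. The toy `model`, the instance, and the all-zero baseline are assumptions chosen for illustration. The SHAP library uses much faster algorithms (for example for tree models), so this is not how SHAP or mlcli computes them.

```python
from itertools import permutations


def model(x):
    """Toy model with an interaction between features 0 and 2 (illustrative assumption)."""
    return 2.0 * x[0] + 1.0 * x[1] + 3.0 * x[0] * x[2]


def shapley_values(f, instance, baseline):
    """Exact Shapley values: average marginal contribution over all feature orderings."""
    n = len(instance)
    totals = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous_output = f(current)
        for feature in order:
            # Switch this feature from its baseline value to its real value.
            current[feature] = instance[feature]
            output = f(current)
            totals[feature] += output - previous_output
            previous_output = output
    return [total / len(orderings) for total in totals]


instance = [1.0, 2.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(model, instance, baseline)

print("Shapley values:", phi)
print("Sum of values: ", sum(phi))
print("f(x) - f(base):", model(instance) - model(baseline))
```

The attributions sum to the difference between the prediction and the baseline prediction, which is the consistency property the table refers to. The interaction term is split evenly between the two features involved.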

5. 🆕 Data Preprocessing

Preprocess your data using various scaling, normalization, encoding, and feature selection methods.

List Available Preprocessors

mlcli list-preprocessors

Output:

┏━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Method               ┃ Name                ┃ Description                                                     ┃
┡━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Scaling              │                     │                                                                 │
│ standard_scaler      │ StandardScaler      │ Standardize features by removing mean and scaling to unit var   │
│ minmax_scaler        │ MinMaxScaler        │ Scale features to a given range (default 0-1)                   │
│ robust_scaler        │ RobustScaler        │ Scale features using statistics robust to outliers (median/IQR) │
├──────────────────────┼─────────────────────┼─────────────────────────────────────────────────────────────────┤
│ Normalization        │                     │                                                                 │
│ normalizer           │ Normalizer          │ Normalize samples individually to unit norm                     │
│ l1_normalizer        │ L1 Normalizer       │ Normalize samples to L1 norm (sum of absolute values = 1)       │
│ l2_normalizer        │ L2 Normalizer       │ Normalize samples to L2 norm (Euclidean norm = 1)               │
├──────────────────────┼─────────────────────┼─────────────────────────────────────────────────────────────────┤
│ Encoding             │                     │                                                                 │
│ label_encoder        │ LabelEncoder        │ Encode target labels with values between 0 and n_classes-1      │
│ onehot_encoder       │ OneHotEncoder       │ Encode categorical features as one-hot numeric arrays           │
│ ordinal_encoder      │ OrdinalEncoder      │ Encode categorical features as ordinal integers                 │
├──────────────────────┼─────────────────────┼─────────────────────────────────────────────────────────────────┤
│ Feature Selection    │                     │                                                                 │
│ select_k_best        │ SelectKBest         │ Select features according to the k highest scores               │
│ rfe                  │ RFE                 │ Recursive Feature Elimination based on model importance         │
│ variance_threshold   │ VarianceThreshold   │ Remove features with variance below threshold                   │
└──────────────────────┴─────────────────────┴─────────────────────────────────────────────────────────────────┘

Preprocess with StandardScaler

mlcli preprocess --data data/train.csv --output data/train_scaled.csv --method standard_scaler

Preprocess with MinMaxScaler

mlcli preprocess -d data/train.csv -o data/train_minmax.csv -m minmax_scaler --range-min 0 --range-max 1

Preprocess with RobustScaler (outlier-resistant)

mlcli preprocess -d data/train.csv -o data/train_robust.csv -m robust_scaler

Normalize Data (L2 norm)

mlcli preprocess -d data/train.csv -o data/train_norm.csv -m normalizer --norm l2
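Unlike the scalers, normalization works per sample (row) rather than per feature (column). A minimal standard-library sketch of the three norm types follows; `normalize_row` is a hypothetical name used for illustration, not an mlcli function.

```python
import math


def normalize_row(row, norm="l2"):
    """Scale one sample so that its chosen norm equals 1 (illustrative helper)."""
    if norm == "l1":
        scale = sum(abs(v) for v in row)
    elif norm == "l2":
        scale = math.sqrt(sum(v * v for v in row))
    elif norm == "max":
        scale = max(abs(v) for v in row)
    else:
        raise ValueError(f"unknown norm: {norm!r}")
    if scale == 0:
        return list(row)  # an all-zero row is left unchanged
    return [v / scale for v in row]


row = [3.0, 4.0]
l1_row = normalize_row(row, "l1")
l2_row = normalize_row(row, "l2")
max_row = normalize_row(row, "max")

print("l1: ", l1_row)
print("l2: ", l2_row)
print("max:", max_row)
```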

Feature Selection with SelectKBest

Select top K features based on statistical tests:

mlcli preprocess -d data/train.csv -o data/train_selected.csv -m select_k_best --target label --k 10

Feature Selection with RFE

Recursive Feature Elimination using model importance:

mlcli preprocess -d data/train.csv -o data/train_rfe.csv -m rfe --target label --k 15

Remove Low-Variance Features

mlcli preprocess -d data/train.csv -o data/train_var.csv -m variance_threshold --threshold 0.1
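As a rough picture of what a threshold of 0.1 does, the sketch below computes the population variance of each column and keeps only those above the threshold. The column names and values are made up, and `select_by_variance` is a hypothetical helper rather than mlcli code.

```python
import statistics


def select_by_variance(columns, threshold):
    """Keep columns whose population variance is strictly greater than threshold."""
    return {
        name: values
        for name, values in columns.items()
        if statistics.pvariance(values) > threshold
    }


columns = {
    "constant": [1.0] * 10,                       # variance 0
    "rare_flag": [0.0] * 9 + [1.0],               # variance 0.09
    "spread": [float(v) for v in range(1, 11)],   # variance 8.25
}

kept = select_by_variance(columns, threshold=0.1)
print(sorted(kept))
```

Note that a binary flag set in only 10% of rows falls below the 0.1 threshold, so rare but informative indicators can be dropped. Because no target column is involved, the method is unsupervised.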

Save Fitted Preprocessor

mlcli preprocess -d data/train.csv -o data/train_scaled.csv -m standard_scaler --save-preprocessor models/scaler.pkl

Apply Preprocessing Pipeline (Multiple Steps)

mlcli preprocess-pipeline --data data/train.csv --output data/processed.csv --steps "standard_scaler,select_k_best" --target label

Preprocessing Options

Option	Description
`--data`, `-d`	Path to input CSV data
`--output`, `-o`	Path to save preprocessed data
`--method`, `-m`	Preprocessing method
`--target`, `-t`	Target column (for feature selection)
`--columns`, `-c`	Specific columns to preprocess
`--k`	Number of features (SelectKBest/RFE)
`--threshold`	Variance threshold
`--norm`	Norm type (l1, l2, max)
`--range-min`, `--range-max`	MinMaxScaler range
`--save-preprocessor`, `-s`	Save fitted preprocessor

Preprocessing Methods Comparison

Method	Best For	Key Feature
StandardScaler	Most ML algorithms	Zero mean, unit variance
MinMaxScaler	Neural networks, bounded outputs	Fixed range (0-1)
RobustScaler	Data with outliers	Uses median/IQR
Normalizer	Text data, similarity measures	Unit norm per sample
SelectKBest	Quick feature filtering	Statistical scoring
RFE	Model-based selection	Iterative importance
VarianceThreshold	Removing constant features	Unsupervised
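The difference between the three scalers is easiest to see on a column with one outlier. The following sketch applies the textbook formulas using only the standard library; the sample values are made up, and this is an illustration of the formulas rather than the code mlcli runs.

```python
import statistics

values = [1.0, 2.0, 3.0, 4.0, 100.0]  # made-up column with one outlier

# StandardScaler-style: subtract the mean, divide by the population standard deviation.
mean = statistics.fmean(values)
std = statistics.pstdev(values)
standard = [(v - mean) / std for v in values]

# MinMaxScaler-style: map the minimum to 0 and the maximum to 1.
lowest, highest = min(values), max(values)
minmax = [(v - lowest) / (highest - lowest) for v in values]

# RobustScaler-style: subtract the median, divide by the interquartile range.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
robust = [(v - median) / (q3 - q1) for v in values]

print("standard:", [round(v, 3) for v in standard])
print("minmax:  ", [round(v, 3) for v in minmax])
print("robust:  ", robust)
```

With min-max scaling the four ordinary values are squeezed into roughly the bottom 3% of the range, while the median/IQR version keeps them evenly spaced and leaves only the outlier far from zero.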

6. Evaluate Models

Evaluate a saved model on test data:

mlcli eval --model-path <path-to-model> --data-path <path-to-test-data> --model-type <model-type>

Evaluate Pickle Model

mlcli eval --model-path artifacts/model.pkl --data-path data/test.csv --model-type logistic_regression

Evaluate Joblib Model

mlcli eval --model-path artifacts/model.joblib --data-path data/test.csv --model-type random_forest

Evaluate TensorFlow Model (H5)

mlcli eval --model-path artifacts/model.h5 --data-path data/test.csv --model-type tf_dnn

7. Experiment Tracking Commands

List All Experiment Runs

mlcli list-runs

Output:

Experiment Runs:
================================================================================
Run ID                              Model Type           Accuracy    Duration
--------------------------------------------------------------------------------
abc123-def456-789...                random_forest        0.8318      4.2s
xyz789-abc123-456...                xgboost              0.8288      1.1s
...
================================================================================

Show Details of a Specific Run

mlcli show-run <run-id>

Example:

mlcli show-run abc123-def456-789

Export All Runs to CSV

mlcli export-runs --output experiments.csv
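For readers curious how a JSON-backed tracker with CSV export can work in principle, here is a minimal sketch. The one-file-per-run layout, the field names, and the helpers `log_run` and `export_runs` are all assumptions made for illustration; mlcli's actual storage format may differ.

```python
import csv
import json
import tempfile
import uuid
from pathlib import Path


def log_run(runs_dir, model_type, metrics):
    """Write one run as a JSON file and return its generated ID (illustrative layout)."""
    run_id = uuid.uuid4().hex
    record = {"run_id": run_id, "model_type": model_type, **metrics}
    (runs_dir / f"{run_id}.json").write_text(json.dumps(record, indent=2))
    return run_id


def export_runs(runs_dir, csv_path):
    """Collect every JSON run file into a single CSV and return the number of rows."""
    records = [json.loads(path.read_text()) for path in sorted(runs_dir.glob("*.json"))]
    fieldnames = sorted({key for record in records for key in record})
    with open(csv_path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(records)
    return len(records)


with tempfile.TemporaryDirectory() as tmp:
    runs_dir = Path(tmp) / "runs"
    runs_dir.mkdir()
    log_run(runs_dir, "random_forest", {"accuracy": 0.8318, "duration_s": 4.2})
    log_run(runs_dir, "xgboost", {"accuracy": 0.8288, "duration_s": 1.1})
    csv_path = Path(tmp) / "experiments.csv"
    exported = export_runs(runs_dir, csv_path)
    csv_text = csv_path.read_text()

print(exported, "runs exported")
print(csv_text)
```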

8. Interactive Terminal UI

Launch the interactive interface:

mlcli ui

TUI Features:

🎯 Train Model - Select config, model type, and override parameters
📊 Evaluate Model - Load and evaluate saved models
📈 View Experiments - Browse, filter, and export experiment history
🔧 List Models - View all registered trainers with metadata

📝 Configuration Files

Create a Configuration File

Configuration files define the model, dataset, training parameters, and output settings.

Configuration Structure

{
  "model": {
    "type": "<model-type>",
    "params": { ... }
  },
  "dataset": {
    "path": "<path-to-data>",
    "type": "csv",
    "target_column": "<target-column-name>"
  },
  "training": {
    "test_size": 0.2,
    "random_state": 42
  },
  "output": {
    "model_dir": "artifacts",
    "save_formats": ["pickle", "joblib"]
  }
}
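Before starting a long training run, it can help to check that a config has the sections shown above. The sketch below is one possible pre-flight check using only the standard library. Treating every listed key as required is an assumption, and `find_missing_keys` is a hypothetical helper, not mlcli's config loader.

```python
import json

# Sections and keys taken from the structure shown above; "required" is an assumption.
EXPECTED_KEYS = {
    "model": ["type", "params"],
    "dataset": ["path", "type", "target_column"],
    "training": ["test_size", "random_state"],
    "output": ["model_dir", "save_formats"],
}


def find_missing_keys(config):
    """Return dotted names of expected sections or keys that are absent."""
    missing = []
    for section, keys in EXPECTED_KEYS.items():
        if section not in config:
            missing.append(section)
            continue
        missing.extend(f"{section}.{key}" for key in keys if key not in config[section])
    return missing


raw = """
{
  "model": {"type": "logistic_regression", "params": {"C": 1.0, "max_iter": 1000}},
  "dataset": {"path": "data/train.csv", "type": "csv", "target_column": "target"},
  "training": {"test_size": 0.2, "random_state": 42},
  "output": {"model_dir": "artifacts", "save_formats": ["pickle", "joblib"]}
}
"""
complete = json.loads(raw)
incomplete = {"model": {"type": "svm"}}

print(find_missing_keys(complete))
print(find_missing_keys(incomplete))
```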

Example Configurations

Logistic Regression (`configs/logistic_config.json`)

{
  "model": {
    "type": "logistic_regression",
    "params": {
      "penalty": "l2",
      "C": 1.0,
      "solver": "lbfgs",
      "max_iter": 1000
    }
  },
  "dataset": {
    "path": "data/train.csv",
    "type": "csv",
    "target_column": "target"
  },
  "training": {
    "test_size": 0.2,
    "random_state": 42
  },
  "output": {
    "model_dir": "artifacts",
    "save_formats": ["pickle", "joblib"]
  }
}

Random Forest (`configs/rf_config.json`)

{
  "model": {
    "type": "random_forest",
    "params": {
      "n_estimators": 100,
      "max_depth": null,
      "min_samples_split": 2,
      "min_samples_leaf": 1,
      "random_state": 42
    }
  },
  "dataset": {
    "path": "data/train.csv",
    "type": "csv",
    "target_column": "target"
  },
  "training": {
    "test_size": 0.2,
    "random_state": 42
  },
  "output": {
    "model_dir": "artifacts",
    "save_formats": ["pickle", "joblib"]
  }
}

XGBoost (`configs/xgb_config.json`)

{
  "model": {
    "type": "xgboost",
    "params": {
      "n_estimators": 100,
      "max_depth": 6,
      "learning_rate": 0.1,
      "subsample": 0.8,
      "colsample_bytree": 0.8,
      "early_stopping_rounds": 10,
      "random_state": 42
    }
  },
  "dataset": {
    "path": "data/train.csv",
    "type": "csv",
    "target_column": "target"
  },
  "training": {
    "test_size": 0.2,
    "random_state": 42
  },
  "output": {
    "model_dir": "artifacts",
    "save_formats": ["pickle", "joblib"]
  }
}

SVM (`configs/svm_config.json`)

{
  "model": {
    "type": "svm",
    "params": {
      "kernel": "rbf",
      "C": 1.0,
      "gamma": "scale",
      "probability": true
    }
  },
  "dataset": {
    "path": "data/train.csv",
    "type": "csv",
    "target_column": "target"
  },
  "training": {
    "test_size": 0.2,
    "random_state": 42
  },
  "output": {
    "model_dir": "artifacts",
    "save_formats": ["pickle", "joblib"]
  }
}

TensorFlow DNN (`configs/tf_dnn_config.json`)

{
  "model": {
    "type": "tf_dnn",
    "params": {
      "layers": [128, 64, 32],
      "activation": "relu",
      "dropout": 0.3,
      "optimizer": "adam",
      "learning_rate": 0.001,
      "epochs": 20,
      "batch_size": 32,
      "early_stopping": true,
      "patience": 5
    }
  },
  "dataset": {
    "path": "data/train.csv",
    "type": "csv",
    "target_column": "target"
  },
  "training": {
    "test_size": 0.2,
    "random_state": 42
  },
  "output": {
    "model_dir": "artifacts",
    "save_formats": ["h5", "savedmodel"]
  }
}

🔧 Hyperparameter Tuning Configuration

Tuning configurations include a tuning.param_space section that defines the search space.

Grid Search Configuration

For grid search, use lists of discrete values:

{
  "model": {
    "type": "random_forest",
    "params": {}
  },
  "dataset": {
    "path": "data/train.csv",
    "type": "csv",
    "target_column": "target"
  },
  "training": {
    "test_size": 0.2,
    "random_state": 42
  },
  "tuning": {
    "param_space": {
      "n_estimators": [50, 100, 200, 300],
      "max_depth": [5, 10, 15, 20, null],
      "min_samples_split": [2, 5, 10],
      "min_samples_leaf": [1, 2, 4],
      "max_features": ["sqrt", "log2"]
    }
  },
  "output": {
    "model_dir": "artifacts",
    "save_formats": ["pickle", "joblib"]
  }
}
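Grid search cost grows multiplicatively with each parameter list, so it is worth counting combinations before launching. This short calculation uses the `param_space` from the config above and assumes 5-fold cross-validation (as in the `--cv 5` example); it only does arithmetic and does not call mlcli.

```python
from itertools import product
from math import prod

param_space = {
    "n_estimators": [50, 100, 200, 300],
    "max_depth": [5, 10, 15, 20, None],
    "min_samples_split": [2, 5, 10],
    "min_samples_leaf": [1, 2, 4],
    "max_features": ["sqrt", "log2"],
}

# Number of grid points is the product of the list lengths: 4 * 5 * 3 * 3 * 2.
n_combinations = prod(len(values) for values in param_space.values())

cv_folds = 5  # assumption: matches the --cv 5 example
total_fits = n_combinations * cv_folds

# The first grid point, built the same way an exhaustive search would enumerate them.
first_point = dict(zip(param_space, next(product(*param_space.values()))))

print("combinations:", n_combinations)
print("model fits:  ", total_fits)
print("first point: ", first_point)
```

At 360 combinations and 1,800 fits, this space is already at the upper end of what grid search handles comfortably, which is why random or Bayesian search is suggested for larger spaces.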

Random/Bayesian Search Configuration

For random and Bayesian search, use distribution specifications:

{
  "model": {
    "type": "xgboost",
    "params": {}
  },
  "dataset": {
    "path": "data/train.csv",
    "type": "csv",
    "target_column": "target"
  },
  "training": {
    "test_size": 0.2,
    "random_state": 42
  },
  "tuning": {
    "param_space": {
      "n_estimators": {"type": "int", "low": 50, "high": 500},
      "max_depth": {"type": "int", "low": 3, "high": 15},
      "learning_rate": {"type": "loguniform", "low": 0.01, "high": 0.3},
      "subsample": {"type": "uniform", "low": 0.6, "high": 1.0},
      "colsample_bytree": {"type": "uniform", "low": 0.6, "high": 1.0},
      "min_child_weight": {"type": "int", "low": 1, "high": 10}
    }
  },
  "output": {
    "model_dir": "artifacts",
    "save_formats": ["pickle", "joblib"]
  }
}

Parameter Distribution Types

Type	Description	Example
`list/tuple`	Discrete choices	`[50, 100, 200]`
`int`	Integer range	`{"type": "int", "low": 1, "high": 100}`
`uniform`	Uniform float	`{"type": "uniform", "low": 0.0, "high": 1.0}`
`loguniform`	Log-uniform	`{"type": "loguniform", "low": 0.001, "high": 1.0}`
`categorical`	Choice	`{"type": "categorical", "choices": ["a", "b"]}`
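The distribution types in the table can be read as sampling rules. The sketch below shows one plausible interpretation using the standard `random` module. It is an assumption about semantics (for example, that `int` ranges include both ends), and `sample_params` is a hypothetical function, not mlcli's tuner.

```python
import math
import random


def sample_value(spec, rng):
    """Draw one value from a single parameter specification."""
    if isinstance(spec, (list, tuple)):
        return rng.choice(list(spec))  # discrete choices
    kind = spec["type"]
    if kind == "int":
        return rng.randint(spec["low"], spec["high"])  # assumed inclusive on both ends
    if kind == "uniform":
        return rng.uniform(spec["low"], spec["high"])
    if kind == "loguniform":
        # Uniform in log space, so each order of magnitude is equally likely.
        return math.exp(rng.uniform(math.log(spec["low"]), math.log(spec["high"])))
    if kind == "categorical":
        return rng.choice(spec["choices"])
    raise ValueError(f"unknown distribution type: {kind!r}")


def sample_params(param_space, rng):
    """Draw one full parameter set from a param_space dictionary."""
    return {name: sample_value(spec, rng) for name, spec in param_space.items()}


param_space = {
    "n_estimators": {"type": "int", "low": 50, "high": 500},
    "learning_rate": {"type": "loguniform", "low": 0.01, "high": 0.3},
    "subsample": {"type": "uniform", "low": 0.6, "high": 1.0},
    "booster": {"type": "categorical", "choices": ["a", "b"]},
    "max_depth": [3, 6, 9],
}

rng = random.Random(42)  # seeded so repeated runs draw the same trials
trial = sample_params(param_space, rng)
print(trial)
```

Sampling learning rates in log space avoids spending most trials near the upper bound, which is what a plain uniform draw between 0.01 and 0.3 would do.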

🏨 Real-World Example: Hotel Booking Cancellation Prediction

Step 1: Prepare Your Data

Place your CSV file in the data/ directory:

data/hotel_bookings.csv

Step 2: Preprocess Data (if needed)

Create a preprocessing script scripts/preprocess_data.py:

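import pandas as pd
from sklearn.preprocessing import LabelEncoder

TARGET_COLUMN = 'is_canceled'  # target used in configs/hotel_rf_config.json

# Load data
df = pd.read_csv('data/hotel_bookings.csv')

# Handle missing values (simple strategy: replace every missing value with 0)
df = df.fillna(0)

# Encode categorical (string) feature columns as integers, leaving the target untouched
label_encoders = {}
for col in df.select_dtypes(include=['object']).columns:
    if col != TARGET_COLUMN:
        le = LabelEncoder()
        df[col] = le.fit_transform(df[col].astype(str))
        label_encoders[col] = le  # keep the fitted encoder so new data can be encoded the same way

# Save processed data
df.to_csv('data/hotel_bookings_processed.csv', index=False)
print("Preprocessing complete!")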

Run preprocessing:

python scripts/preprocess_data.py

Step 3: Create Configuration Files

Create configs/hotel_rf_config.json:

{
  "model": {
    "type": "random_forest",
    "params": {
      "n_estimators": 100,
      "max_depth": null,
      "random_state": 42
    }
  },
  "dataset": {
    "path": "data/hotel_bookings_processed.csv",
    "type": "csv",
    "target_column": "is_canceled"
  },
  "training": {
    "test_size": 0.2,
    "random_state": 42
  },
  "output": {
    "model_dir": "artifacts",
    "save_formats": ["pickle", "joblib"]
  }
}

Step 4: Train the Model

mlcli train --config configs/hotel_rf_config.json

Step 5: View Results

mlcli list-runs

Step 6: Train Multiple Models for Comparison

# Train Logistic Regression
mlcli train --config configs/hotel_logistic_config.json

# Train Random Forest
mlcli train --config configs/hotel_rf_config.json

# Train XGBoost
mlcli train --config configs/hotel_xgb_config.json

# Train TensorFlow DNN
mlcli train --config configs/hotel_dnn_config.json

Step 7: Export Results

mlcli export-runs --output hotel_experiments.csv

📊 Model Comparison Results (Hotel Booking Dataset)

Model	Accuracy	Precision	Recall	F1-Score	AUC-ROC	Training Time
Random Forest 🏆	83.18%	83.80%	83.18%	82.51%	90.90%	4.2s
XGBoost	82.88%	83.31%	82.88%	82.27%	90.45%	1.1s
Logistic Regression	79.90%	81.03%	79.90%	78.68%	85.20%	2.8s
TF DNN	62.43%	38.97%	62.43%	47.99%	50.00%	43.1s

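Note: The TF DNN row is consistent with a classifier that always predicts the majority class (AUC-ROC of 50.00%, recall equal to accuracy), so it reflects a training failure rather than the model's capacity. Neural networks generally need standardized features to train well; see the Troubleshooting section.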
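The TF DNN figures in the table can be checked with a few lines of arithmetic. Assuming the reported precision, recall, and F1 are support-weighted averages (which the identical accuracy and recall columns suggest) and that the model always predicts the majority class, every metric follows from the majority-class fraction alone. This is a sketch under those assumptions, not output from mlcli.

```python
majority_fraction = 0.6243  # TF DNN accuracy from the table, read as the majority-class share

# Always predicting the majority class:
#   majority class -> precision = p, recall = 1, F1 = 2p / (1 + p)
#   minority class -> precision = 0, recall = 0, F1 = 0
p = majority_fraction
accuracy = p
weighted_recall = p * 1.0 + (1 - p) * 0.0
weighted_precision = p * p + (1 - p) * 0.0
weighted_f1 = p * (2 * p / (1 + p)) + (1 - p) * 0.0

print(f"accuracy:  {accuracy:.2%}")
print(f"recall:    {weighted_recall:.2%}")
print(f"precision: {weighted_precision:.2%}")
print(f"F1-score:  {weighted_f1:.2%}")
```

The results match the table's 38.97% precision and 47.99% F1 to within rounding of the input, which supports reading the row as a model that learned nothing beyond the class prior.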

🧩 Extending mlcli

Adding a New Trainer

Create a new file in mlcli/trainers/:

from mlcli.trainers.base_trainer import BaseTrainer
from mlcli.utils.registry import register_model

@register_model(
    name="my_custom_model",
    description="My Custom Model Trainer",
    framework="custom",
    model_type="classification"
)
class MyCustomTrainer(BaseTrainer):
    def train(self, X_train, y_train, X_val=None, y_val=None):
        # Implementation
        pass

    def evaluate(self, X_test, y_test):
        # Implementation
        pass

    def predict(self, X):
        # Implementation
        pass

    @classmethod
    def get_default_params(cls):
        return {"param1": "value1"}

Import in mlcli/trainers/__init__.py:

from mlcli.trainers.my_custom_trainer import MyCustomTrainer

The model will be automatically registered and available via CLI!
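Automatic registration of this kind is usually built on a decorator that records the class in a module-level dictionary at import time, which is why the import in `__init__.py` matters. The sketch below is a stand-in written for illustration: this `register_model` and `MODEL_REGISTRY` are assumptions that mimic the call shape shown above, not the contents of `mlcli.utils.registry`.

```python
MODEL_REGISTRY = {}  # illustrative stand-in for a trainer registry


def register_model(name, description, framework, model_type):
    """Class decorator that records a trainer under a unique name."""
    def decorator(cls):
        if name in MODEL_REGISTRY:
            raise ValueError(f"model {name!r} is already registered")
        MODEL_REGISTRY[name] = {
            "class": cls,
            "description": description,
            "framework": framework,
            "model_type": model_type,
        }
        return cls  # the class itself is returned unchanged
    return decorator


@register_model(
    name="my_custom_model",
    description="My Custom Model Trainer",
    framework="custom",
    model_type="classification",
)
class MyCustomTrainer:
    @classmethod
    def get_default_params(cls):
        return {"param1": "value1"}


# Registration happened when the class body above was executed (that is, on import).
entry = MODEL_REGISTRY["my_custom_model"]
print(entry["framework"], entry["class"].get_default_params())
```

A CLI command such as a model listing can then iterate over the dictionary without knowing any trainer class by name.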

🔧 Troubleshooting

Common Issues

1. "mlcli: command not found"

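Solution: Make sure the virtual environment is activated and mlcli is installed. The commands below are for Windows PowerShell; on Linux/macOS, activate with source .venv/bin/activate instead: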

.\.venv\Scripts\Activate.ps1
pip install -e .

2. "ModuleNotFoundError: No module named 'mlcli'"

Solution: Install in development mode:

pip install -e .

3. "FileNotFoundError: data/train.csv"

Solution: Ensure your data file exists at the specified path in the config file.

4. TensorFlow DNN Poor Performance

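Solution: Neural networks need standardized features. Standardize the data before training, either with mlcli preprocess -m standard_scaler or in Python. Fit the scaler on the training split only, then reuse it for the test split:

from sklearn.preprocessing import StandardScaler

# X_train, X_test: feature matrices from your own train/test split
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learn mean and variance from training data
X_test_scaled = scaler.transform(X_test)        # apply the same statistics to test data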

5. ONNX Export Errors

Solution: Install skl2onnx:

pip install skl2onnx

6. Optuna Not Found

Solution: Install optuna for Bayesian optimization:

pip install optuna

7. SHAP/LIME Not Found

Solution: Install SHAP and LIME for model explainability:

pip install shap lime matplotlib

8. SHAP TreeExplainer Error

Solution: For non-tree models, SHAP will automatically fall back to KernelExplainer. This is expected behavior.

📚 Quick Reference

Task	Command
Install mlcli	`pip install -e .`
Show help	`mlcli --help`
List models	`mlcli list-models`
List tuners	`mlcli list-tuners`
List explainers	`mlcli list-explainers`
List preprocessors	`mlcli list-preprocessors`
Train model	`mlcli train --config <config.json>`
Tune hyperparameters	`mlcli tune --config <config.json> --method random`
Tune with Bayesian	`mlcli tune -c <config> -m bayesian -n 100`
Tune and train best	`mlcli tune -c <config> -m random --train-best`
Explain model (SHAP)	`mlcli explain -m <model.pkl> -d <data.csv> -t <type> -e shap`
Explain model (LIME)	`mlcli explain -m <model.pkl> -d <data.csv> -t <type> -e lime`
Explain instance	`mlcli explain-instance -m <model.pkl> -d <data.csv> -t <type> -i <idx>`
Preprocess data	`mlcli preprocess -d <data.csv> -o <output.csv> -m standard_scaler`
Feature selection	`mlcli preprocess -d <data.csv> -o <output.csv> -m select_k_best -t label --k 10`
Preprocessing pipeline	`mlcli preprocess-pipeline -d <data.csv> -o <output.csv> -s "standard_scaler,select_k_best"`
Evaluate model	`mlcli eval --model-path <path> --data-path <path> --model-type <type>`
List runs	`mlcli list-runs`
Show run details	`mlcli show-run <run-id>`
Export runs	`mlcli export-runs --output <file.csv>`
Launch UI	`mlcli ui`

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

📄 License

This project is licensed under the MIT License.

Release history

Version	Release date
0.3.1	Jan 17, 2026
0.3.0	Dec 14, 2025
0.2.0 (this version)	Dec 7, 2025
0.1.1	Dec 3, 2025
0.1.0	Dec 3, 2025
