Project description

Lazy Predict


Lazy Predict helps you build dozens of basic models with very little code and quickly understand which models work better, without any parameter tuning.

Features

  • Over 40 built-in machine learning models
  • Automatic model selection for classification, regression, and time series forecasting
  • 20+ forecasting models: statistical (ETS, ARIMA, Theta), ML (Random Forest, XGBoost, etc.), deep learning (LSTM, GRU), and pretrained foundation models (TimesFM)
  • Automatic seasonal period detection via ACF
  • Multiple categorical encoding strategies (OneHot, Ordinal, Target, Binary)
  • Built-in MLflow integration for experiment tracking
  • GPU acceleration: XGBoost, LightGBM, CatBoost, cuML (RAPIDS), LSTM/GRU, TimesFM
  • Support for Python 3.9 through 3.13
  • Custom metric evaluation support
  • Configurable timeout and cross-validation
  • Intel Extension for Scikit-learn acceleration support

Installation

pip (PyPI)

pip install lazypredict

conda (conda-forge)

conda install -c conda-forge lazypredict

Optional extras (pip only)

Install with boosting libraries (XGBoost, LightGBM, CatBoost):

pip install lazypredict[boost]

Install with time series forecasting support:

pip install lazypredict[timeseries]          # statsmodels + pmdarima
pip install lazypredict[timeseries,deeplearning]  # + LSTM/GRU via PyTorch
pip install lazypredict[timeseries,foundation]    # + Google TimesFM (Python 3.10-3.11)

Install with all optional dependencies:

pip install lazypredict[all]

Usage

To use Lazy Predict in a project:

import lazypredict

Classification

Example:

from lazypredict.Supervised import LazyClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

data = load_breast_cancer()
X = data.data
y = data.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=123)

clf = LazyClassifier(verbose=0, ignore_warnings=True, custom_metric=None)
models, predictions = clf.fit(X_train, X_test, y_train, y_test)

print(models)

Advanced Options

# With categorical encoding, timeout, cross-validation, and GPU
clf = LazyClassifier(
    verbose=1,                          # Show progress
    ignore_warnings=True,               # Suppress warnings
    custom_metric=None,                 # Use default metrics
    predictions=True,                   # Return predictions
    classifiers='all',                  # Use all available classifiers
    categorical_encoder='onehot',       # Encoding: 'onehot', 'ordinal', 'target', 'binary'
    timeout=60,                         # Max time per model in seconds
    cv=5,                               # Cross-validation folds (optional)
    use_gpu=True                        # Enable GPU acceleration
)
models, predictions = clf.fit(X_train, X_test, y_train, y_test)

Parameters:

  • verbose (int): 0 for silent, 1 for progress display
  • ignore_warnings (bool): Suppress scikit-learn warnings
  • custom_metric (callable): Custom evaluation metric
  • predictions (bool): Return prediction DataFrame
  • classifiers (str/list): 'all' or list of classifier names
  • categorical_encoder (str): Encoding strategy for categorical features
    • 'onehot': One-hot encoding (default)
    • 'ordinal': Ordinal encoding
    • 'target': Target encoding (requires category-encoders)
    • 'binary': Binary encoding (requires category-encoders)
  • timeout (int): Maximum seconds per model (None for no limit)
  • cv (int): Number of cross-validation folds (None to disable)
  • use_gpu (bool): Enable GPU acceleration for supported models (default False)
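
As an illustration of the custom_metric contract, the callable takes (y_true, y_pred) and returns a float. The specificity function below is a hypothetical example built on scikit-learn, not part of lazypredict:

```python
from sklearn.metrics import confusion_matrix

def specificity(y_true, y_pred):
    # True-negative rate: TN / (TN + FP)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    return tn / (tn + fp)

print(specificity([0, 0, 1, 1], [0, 1, 1, 1]))  # 0.5

# Pass it to LazyClassifier; its scores show up as an extra results column:
# clf = LazyClassifier(custom_metric=specificity)
```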

Example output:

Model Accuracy Balanced Accuracy ROC AUC F1 Score Time Taken
LinearSVC 0.989474 0.987544 0.987544 0.989462 0.0150008
SGDClassifier 0.989474 0.987544 0.987544 0.989462 0.0109992
MLPClassifier 0.985965 0.986904 0.986904 0.985994 0.426
Perceptron 0.985965 0.984797 0.984797 0.985965 0.0120046
LogisticRegression 0.985965 0.98269 0.98269 0.985934 0.0200036
LogisticRegressionCV 0.985965 0.98269 0.98269 0.985934 0.262997
SVC 0.982456 0.979942 0.979942 0.982437 0.0140011
CalibratedClassifierCV 0.982456 0.975728 0.975728 0.982357 0.0350015
PassiveAggressiveClassifier 0.975439 0.974448 0.974448 0.975464 0.0130005
LabelPropagation 0.975439 0.974448 0.974448 0.975464 0.0429988
LabelSpreading 0.975439 0.974448 0.974448 0.975464 0.0310006
RandomForestClassifier 0.97193 0.969594 0.969594 0.97193 0.033
GradientBoostingClassifier 0.97193 0.967486 0.967486 0.971869 0.166998
QuadraticDiscriminantAnalysis 0.964912 0.966206 0.966206 0.965052 0.0119994
HistGradientBoostingClassifier 0.968421 0.964739 0.964739 0.968387 0.682003
RidgeClassifierCV 0.97193 0.963272 0.963272 0.971736 0.0130029
RidgeClassifier 0.968421 0.960525 0.960525 0.968242 0.0119977
AdaBoostClassifier 0.961404 0.959245 0.959245 0.961444 0.204998
ExtraTreesClassifier 0.961404 0.957138 0.957138 0.961362 0.0270066
KNeighborsClassifier 0.961404 0.95503 0.95503 0.961276 0.0560005
BaggingClassifier 0.947368 0.954577 0.954577 0.947882 0.0559971
BernoulliNB 0.950877 0.951003 0.951003 0.951072 0.0169988
LinearDiscriminantAnalysis 0.961404 0.950816 0.950816 0.961089 0.0199995
GaussianNB 0.954386 0.949536 0.949536 0.954337 0.0139935
NuSVC 0.954386 0.943215 0.943215 0.954014 0.019989
DecisionTreeClassifier 0.936842 0.933693 0.933693 0.936971 0.0170023
NearestCentroid 0.947368 0.933506 0.933506 0.946801 0.0160074
ExtraTreeClassifier 0.922807 0.912168 0.912168 0.922462 0.0109999
CheckingClassifier 0.361404 0.5 0.5 0.191879 0.0170043
DummyClassifier 0.512281 0.489598 0.489598 0.518924 0.0119965
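
The returned models object is a pandas DataFrame indexed by model name, so the leaderboard above can be filtered and re-ranked with ordinary pandas operations. A minimal sketch using a stand-in frame (the values are illustrative, not real results):

```python
import pandas as pd

# Stand-in for the DataFrame returned by clf.fit(); values are illustrative only
models = pd.DataFrame(
    {"Accuracy": [0.989, 0.954], "F1 Score": [0.989, 0.954]},
    index=["LinearSVC", "GaussianNB"],
)

# Rank by any column and pick the top model
best = models.sort_values("F1 Score", ascending=False).index[0]
print(best)  # LinearSVC
```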

Regression

Example:

from lazypredict.Supervised import LazyRegressor
from sklearn import datasets
from sklearn.utils import shuffle
import numpy as np

diabetes = datasets.load_diabetes()
X, y = shuffle(diabetes.data, diabetes.target, random_state=13)
X = X.astype(np.float32)

offset = int(X.shape[0] * 0.9)

X_train, y_train = X[:offset], y[:offset]
X_test, y_test = X[offset:], y[offset:]

reg = LazyRegressor(verbose=0, ignore_warnings=False, custom_metric=None)
models, predictions = reg.fit(X_train, X_test, y_train, y_test)

print(models)

Advanced Options

# With categorical encoding, timeout, and GPU
reg = LazyRegressor(
    verbose=1,                          # Show progress
    ignore_warnings=True,               # Suppress warnings
    custom_metric=None,                 # Use default metrics
    predictions=True,                   # Return predictions
    regressors='all',                   # Use all available regressors
    categorical_encoder='ordinal',      # Encoding: 'onehot', 'ordinal', 'target', 'binary'
    timeout=120,                        # Max time per model in seconds
    use_gpu=True                        # Enable GPU acceleration
)
models, predictions = reg.fit(X_train, X_test, y_train, y_test)

Parameters:

  • verbose (int): 0 for silent, 1 for progress display
  • ignore_warnings (bool): Suppress scikit-learn warnings
  • custom_metric (callable): Custom evaluation metric
  • predictions (bool): Return prediction DataFrame
  • regressors (str/list): 'all' or list of regressor names
  • categorical_encoder (str): Encoding strategy for categorical features
    • 'onehot': One-hot encoding (default)
    • 'ordinal': Ordinal encoding
    • 'target': Target encoding (requires category-encoders)
    • 'binary': Binary encoding (requires category-encoders)
  • timeout (int): Maximum seconds per model (None for no limit)
  • use_gpu (bool): Enable GPU acceleration for supported models (default False)

Example output:

Model Adjusted R-Squared R-Squared RMSE Time Taken
ExtraTreesRegressor 0.378921 0.520076 54.2202 0.121466
OrthogonalMatchingPursuitCV 0.374947 0.517004 54.3934 0.0111742
Lasso 0.373483 0.515873 54.457 0.00620174
LassoLars 0.373474 0.515866 54.4575 0.0087235
LarsCV 0.3715 0.514341 54.5432 0.0160234
LassoCV 0.370413 0.513501 54.5903 0.0624897
PassiveAggressiveRegressor 0.366958 0.510831 54.7399 0.00689793
LassoLarsIC 0.364984 0.509306 54.8252 0.0108321
SGDRegressor 0.364307 0.508783 54.8544 0.0055306
RidgeCV 0.363002 0.507774 54.9107 0.00728202
Ridge 0.363002 0.507774 54.9107 0.00556874
BayesianRidge 0.362296 0.507229 54.9411 0.0122972
LassoLarsCV 0.361749 0.506806 54.9646 0.0175984
TransformedTargetRegressor 0.361749 0.506806 54.9646 0.00604773
LinearRegression 0.361749 0.506806 54.9646 0.00677514
Lars 0.358828 0.504549 55.0903 0.00935149
ElasticNetCV 0.356159 0.502486 55.2048 0.0478678
HuberRegressor 0.355251 0.501785 55.2437 0.0129263
RandomForestRegressor 0.349621 0.497434 55.4844 0.2331
AdaBoostRegressor 0.340416 0.490322 55.8757 0.0512381
LGBMRegressor 0.339239 0.489412 55.9255 0.0396187
HistGradientBoostingRegressor 0.335632 0.486625 56.0779 0.0897055
PoissonRegressor 0.323033 0.476889 56.6072 0.00953603
ElasticNet 0.301755 0.460447 57.4899 0.00604224
KNeighborsRegressor 0.299855 0.458979 57.5681 0.00757337
OrthogonalMatchingPursuit 0.292421 0.453235 57.8729 0.00709486
BaggingRegressor 0.291213 0.452301 57.9223 0.0302746
GradientBoostingRegressor 0.247009 0.418143 59.7011 0.136803
TweedieRegressor 0.244215 0.415984 59.8118 0.00633955
XGBRegressor 0.224263 0.400567 60.5961 0.339694
GammaRegressor 0.223895 0.400283 60.6105 0.0235181
RANSACRegressor 0.203535 0.38455 61.4004 0.0653253
LinearSVR 0.116707 0.317455 64.6607 0.0077076
ExtraTreeRegressor 0.00201902 0.228833 68.7304 0.00626636
NuSVR -0.0667043 0.175728 71.0575 0.0143399
SVR -0.0964128 0.152772 72.0402 0.0114729
DummyRegressor -0.297553 -0.00265478 78.3701 0.00592971
DecisionTreeRegressor -0.470263 -0.136112 83.4229 0.00749898
GaussianProcessRegressor -0.769174 -0.367089 91.5109 0.0770502
MLPRegressor -1.86772 -1.21597 116.508 0.235267
KernelRidge -5.03822 -3.6659 169.061 0.0243919
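
The Adjusted R-Squared column is the standard small-sample correction of R-Squared. With the split used above (442 diabetes rows, a 90% train cut leaves 45 test rows, 10 features), the LinearRegression row can be reproduced exactly:

```python
def adjusted_r2(r2, n_samples, n_features):
    # Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_features - 1)

# LinearRegression above: R-Squared 0.506806 on 45 test rows, 10 features
print(round(adjusted_r2(0.506806, 45, 10), 6))  # 0.361749
```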

Time Series Forecasting

LazyForecaster benchmarks 20+ forecasting models on your time series in a single call:

import numpy as np
from lazypredict.TimeSeriesForecasting import LazyForecaster

# Generate sample data (or use your own)
np.random.seed(42)
t = np.arange(200)
y = 10 + 0.05 * t + 3 * np.sin(2 * np.pi * t / 12) + np.random.normal(0, 1, 200)

y_train, y_test = y[:180], y[180:]

fcst = LazyForecaster(verbose=0, ignore_warnings=True)
scores, predictions = fcst.fit(y_train, y_test)
print(scores)

Example output (truncated):

Model MAE RMSE MAPE SMAPE MASE R-Squared Time Taken
Holt 0.8532 1.0285 6.3241 6.1758 0.6993 0.7218 0.03
SARIMAX 0.8791 1.0601 6.5012 6.3414 0.7205 0.7045 0.12
Ridge_TS 0.9124 1.0843 6.7523 6.5721 0.7478 0.6912 0.01
... ... ... ... ... ... ... ...
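
SMAPE, one of the metrics above, is the symmetric mean absolute percentage error. A minimal reference implementation (lazypredict's exact variant, e.g. its handling of zero denominators, may differ):

```python
import numpy as np

def smape(y_true, y_pred):
    # Symmetric MAPE, in percent
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    denom = (np.abs(y_true) + np.abs(y_pred)) / 2
    return float(np.mean(np.abs(y_true - y_pred) / denom) * 100)

print(round(smape([100, 100], [110, 90]), 2))  # 10.03
```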

With Exogenous Variables

# Optional exogenous features
X_train = np.column_stack([np.sin(t[:180]), np.cos(t[:180])])
X_test = np.column_stack([np.sin(t[180:]), np.cos(t[180:])])

scores, predictions = fcst.fit(y_train, y_test, X_train, X_test)

Advanced Options

fcst = LazyForecaster(
    verbose=1,                          # Show progress
    ignore_warnings=True,               # Suppress model errors
    predictions=True,                   # Return forecast values
    seasonal_period=12,                 # Override auto-detection
    cv=3,                               # Time series cross-validation
    timeout=30,                         # Max seconds per model
    sort_by="RMSE",                     # Sort metric (MAE, MAPE, SMAPE, MASE, R-Squared)
    forecasters="all",                  # Or list: ["Holt", "AutoARIMA", "LSTM_TS"]
    max_models=10,                      # Limit number of models
    use_gpu=True,                       # GPU acceleration for supported models
    foundation_model_path="/path/to/timesfm-weights",  # Local model weights (offline)
)
scores, predictions = fcst.fit(y_train, y_test)

Parameters:

  • verbose (int): 0 for silent, 1 for progress display
  • ignore_warnings (bool): Suppress per-model exceptions
  • predictions (bool): Return a second DataFrame of forecasted values
  • seasonal_period (int/None): Seasonal cycle length; None auto-detects via ACF
  • cv (int/None): Number of TimeSeriesSplit folds for cross-validation
  • timeout (int/float/None): Maximum training seconds per model
  • sort_by (str): Metric to sort by ("RMSE", "MAE", "MAPE", "SMAPE", "MASE", "R-Squared")
  • forecasters (str/list): "all" or a list of model names
  • n_lags (int): Number of lag features for ML/DL models (default 10)
  • n_rolling (tuple): Rolling-window sizes for feature engineering (default (3, 7))
  • max_models (int/None): Limit total models to train
  • custom_metric (callable): Additional metric f(y_true, y_pred) -> float
  • use_gpu (bool): Enable GPU acceleration for supported models (default False)
  • foundation_model_path (str): Local path to pre-downloaded foundation model weights (e.g. TimesFM)
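
Setting seasonal_period=None falls back to the ACF-based auto-detection mentioned under Features. A rough sketch of the idea (not lazypredict's exact implementation):

```python
import numpy as np

def detect_seasonal_period(y, max_lag=24):
    # Pick the lag (>= 2) with the largest autocorrelation peak
    y = np.asarray(y, dtype=float)
    y = y - y.mean()
    acf = np.correlate(y, y, mode="full")[len(y) - 1:]
    acf = acf / acf[0]
    return int(np.argmax(acf[2:max_lag + 1]) + 2)

t = np.arange(240)
print(detect_seasonal_period(np.sin(2 * np.pi * t / 12)))  # 12
```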

Available model categories:

  • Baselines: Naive, SeasonalNaive
  • Statistical (statsmodels): SimpleExpSmoothing, Holt, HoltWinters_Add, HoltWinters_Mul, Theta, SARIMAX
  • Statistical (pmdarima): AutoARIMA
  • ML (sklearn): LinearRegression_TS, Ridge_TS, Lasso_TS, ElasticNet_TS, KNeighborsRegressor_TS, DecisionTreeRegressor_TS, RandomForestRegressor_TS, GradientBoostingRegressor_TS, AdaBoostRegressor_TS, ExtraTreesRegressor_TS, BaggingRegressor_TS, SVR_TS, XGBRegressor_TS, LGBMRegressor_TS, CatBoostRegressor_TS
  • Deep Learning (torch): LSTM_TS, GRU_TS
  • Foundation (timesfm): TimesFM

GPU Acceleration

Enable GPU acceleration for supported models with use_gpu=True:

from lazypredict.Supervised import LazyClassifier, LazyRegressor

# Classification with GPU
clf = LazyClassifier(use_gpu=True, verbose=0, ignore_warnings=True)
models, predictions = clf.fit(X_train, X_test, y_train, y_test)

# Regression with GPU
reg = LazyRegressor(use_gpu=True, verbose=0, ignore_warnings=True)
models, predictions = reg.fit(X_train, X_test, y_train, y_test)

# Time Series with GPU
from lazypredict.TimeSeriesForecasting import LazyForecaster
fcst = LazyForecaster(use_gpu=True, verbose=0, ignore_warnings=True)
scores, predictions = fcst.fit(y_train, y_test)

Supported GPU backends:

  • XGBoost — device="cuda"
  • LightGBM — device="gpu"
  • CatBoost — task_type="GPU"
  • cuML (RAPIDS) — GPU-native scikit-learn replacements (auto-discovered when installed)
  • LSTM / GRU — PyTorch CUDA
  • TimesFM — PyTorch CUDA

Falls back to CPU automatically if no CUDA GPU is available.

Categorical Encoding

Lazy Predict supports multiple categorical encoding strategies:

from lazypredict.Supervised import LazyClassifier
import pandas as pd
from sklearn.model_selection import train_test_split

# Example with categorical features
df = pd.read_csv('data_with_categories.csv')
X = df.drop('target', axis=1)
y = df['target']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)

# Try different encoders
for encoder in ['onehot', 'ordinal', 'target', 'binary']:
    clf = LazyClassifier(
        categorical_encoder=encoder,
        verbose=0,
        ignore_warnings=True
    )
    models, predictions = clf.fit(X_train, X_test, y_train, y_test)
    print(f"\n{encoder.upper()} Encoding Results:")
    print(models.head())

Note: Target and binary encoders require the category-encoders package:

pip install category-encoders

Intel Extension Acceleration

For improved performance on Intel CPUs, install Intel Extension for Scikit-learn:

pip install scikit-learn-intelex

Lazy Predict will automatically detect and use it for acceleration.

MLflow Integration

Lazy Predict includes built-in MLflow integration. Enable it by setting the MLflow tracking URI:

import os
os.environ['MLFLOW_TRACKING_URI'] = 'sqlite:///mlflow.db'

# MLflow tracking will be automatically enabled
reg = LazyRegressor(verbose=0, ignore_warnings=True)
models, predictions = reg.fit(X_train, X_test, y_train, y_test)

Automatically tracks:

  • Model metrics (R-squared, RMSE, etc.)
  • Training time
  • Model parameters
  • Model artifacts

Project details


Download files

Download the file for your platform.

Source Distribution

lazypredict-0.3.0.tar.gz (98.4 kB)


Built Distribution


lazypredict-0.3.0-py3-none-any.whl (71.0 kB)


File details

Details for the file lazypredict-0.3.0.tar.gz.

File metadata

  • Download URL: lazypredict-0.3.0.tar.gz
  • Upload date:
  • Size: 98.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for lazypredict-0.3.0.tar.gz
Algorithm Hash digest
SHA256 41466cb337d59563296ca56b262c8037add022283c79c4ebb6ce7a449b979880
MD5 3c8d152b4e3fcbfb9ece8946c8a4bf5d
BLAKE2b-256 d29faf6b02ccd845869bb01aefd962035932370e90305b978503ae5a8a2a6611


Provenance

The following attestation bundles were made for lazypredict-0.3.0.tar.gz:

Publisher: publish.yml on shankarpandala/lazypredict

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file lazypredict-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: lazypredict-0.3.0-py3-none-any.whl
  • Upload date:
  • Size: 71.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for lazypredict-0.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 b408d5a770ad28c0816e34c336dca5dc4dc614318ddd451a6b8829fc8e8fdf35
MD5 badf9e427b48f3948e47c250e73ab5c8
BLAKE2b-256 0790f170254565c2a6588248f08aade71aadbbceaa6f24cdea1604ffc4a947d8


Provenance

The following attestation bundles were made for lazypredict-0.3.0-py3-none-any.whl:

Publisher: publish.yml on shankarpandala/lazypredict

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
