ModelSelector automates ensemble creation; AutoTuner has been added.

Project description

ModelSelector

The ModelSelector class simplifies model selection in machine learning. It automates the creation of an ensemble pipeline containing a chosen number of models and tunes their hyperparameters to maximize prediction scores.

Input Parameters

When initializing the ModelSelector class, you can provide the following input parameters:

  • data (pandas DataFrame): The input dataset to be used for model selection.

  • target (str): The name of the target column in the dataset.

  • eda_pipe (sklearn Pipeline, optional): An optional pipeline for Exploratory Data Analysis (EDA). If provided, EDA will be performed on the dataset before model selection. Defaults to None.

  • task (str, optional): The task to be performed, which can be either 'classification' or 'regression'. Defaults to 'classification'.

  • i (int, optional): The number of top-performing models to include in the final ensemble. The maximum value is 6. Selecting more models increases computation time. Defaults to 2.

  • precision (float, optional): A value from 0.1 to 1 that sets the granularity of model evaluation. A higher value gives more precise model selection but takes longer to run. Defaults to 0.2.
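
The documented ranges can be checked before a long run is started. The following is a minimal sketch, not part of the package: `check_selector_args` and `VALID_TASKS` are hypothetical names, and the lower bound of 1 for `i` is an assumption (the description above only states the maximum of 6).

```python
VALID_TASKS = ("classification", "regression")


def check_selector_args(task="classification", i=2, precision=0.2):
    """Hypothetical pre-flight check of the documented parameter ranges."""
    if task not in VALID_TASKS:
        raise ValueError(f"task must be one of {VALID_TASKS}, got {task!r}")
    # bool is a subclass of int, so reject it explicitly.
    # Lower bound of 1 is an assumption; only the maximum of 6 is documented.
    if isinstance(i, bool) or not isinstance(i, int) or not 1 <= i <= 6:
        raise ValueError(f"i must be an integer from 1 to 6, got {i!r}")
    if not 0.1 <= precision <= 1:
        raise ValueError(f"precision must be between 0.1 and 1, got {precision!r}")
    return {"task": task, "i": i, "precision": precision}


# The documented defaults pass the check
kwargs = check_selector_args()
print(kwargs)
```

The returned dictionary uses the parameter names listed above, so it could be unpacked into the constructor call alongside data and target.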

Getting the Final Pipeline

After initializing the ModelSelector class with your desired parameters, you can obtain the final machine learning pipeline by calling the get_pipeline() method. This pipeline will consist of the selected ensemble of models, fine-tuned with the best hyperparameters.

Here's an example of how to use the ModelSelector class:

# Import the ModelSelector class
from model_selector import ModelSelector

# Initialize the ModelSelector with your data (a pandas DataFrame) and target column
selector = ModelSelector(data=my_data, target='target_column')

# Get the final machine learning pipeline
final_pipeline = selector.get_pipeline()
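
To make the idea of keeping the i best models concrete, here is a small standalone sketch of top-i ranking. It does not reproduce how ModelSelector ranks models internally, which this page does not describe. The `top_models` helper, the model names, and the scores are all hypothetical placeholders, not measured results.

```python
def top_models(scores, i=2):
    """Return the names of the i highest-scoring models, best first."""
    if not 1 <= i <= len(scores):
        raise ValueError("i must be between 1 and the number of candidates")
    # Sort model names by their score, highest first
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:i]


# Placeholder scores for illustration only
candidate_scores = {"model_a": 0.81, "model_b": 0.86, "model_c": 0.79}
best = top_models(candidate_scores, i=2)
print(best)  # ['model_b', 'model_a']
```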

Features

  • Automated Ensemble Creation: The ModelSelector class automatically generates an ensemble pipeline with a specified number of models, each contributing to the final predictions.
  • Hyperparameter Optimization: Combines model selection with hyperparameter tuning to return the best-performing models and their hyperparameters.
  • Ensemble Creation: Call start() to run ensemble creation.
  • Scoring: Call evaluate() after ensemble creation to get your scores.
  • Supports Classification and Regression: Adaptable for both classification and regression tasks, providing flexibility in application.
  • Easy Retrieval of Best Pipeline: Use the get_pipeline() function to retrieve the optimized pipeline with the best-performing models.
  • AutoTuner Class: Fine-tunes an existing pipeline with a specific model on a given dataset using auto_tune(). Takes X and Y (for example, X_train and y_train).
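
The methods named in the list above suggest a call order of start(), then evaluate(), then get_pipeline(). The sketch below wraps that order in a hypothetical helper, `build_and_score`, which works on any object exposing those three method names. Two things are assumptions: that evaluate() returns its scores (this page does not say whether it returns or prints them), and that start() should be called before get_pipeline(). `StandInSelector` is a placeholder used only to make the example runnable without the package.

```python
def build_and_score(selector):
    """Run the three documented steps in order on a ModelSelector-like object."""
    selector.start()                    # ensemble creation
    scores = selector.evaluate()        # assumed to return the scores
    pipeline = selector.get_pipeline()  # optimized pipeline
    return pipeline, scores


class StandInSelector:
    """Placeholder with the same method names; not the real ModelSelector."""

    def __init__(self):
        self.calls = []

    def start(self):
        self.calls.append("start")

    def evaluate(self):
        self.calls.append("evaluate")
        return None  # the real return value is not documented here

    def get_pipeline(self):
        self.calls.append("get_pipeline")
        return "pipeline-placeholder"


stand_in = StandInSelector()
pipeline, scores = build_and_score(stand_in)
print(stand_in.calls)
```

With the real package, an initialized ModelSelector would be passed in place of the stand-in.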

Installation

You can install the package using pip:

pip install yctmodel

Download files

Source Distribution

yctmodel-5.4.0.tar.gz (8.3 kB)

Uploaded Source

Built Distribution

yctmodel-5.4.0-py3-none-any.whl (7.5 kB)

Uploaded Python 3

File details

Details for the file yctmodel-5.4.0.tar.gz.

File metadata

  • Download URL: yctmodel-5.4.0.tar.gz
  • Upload date:
  • Size: 8.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.5

File hashes

Hashes for yctmodel-5.4.0.tar.gz
Algorithm Hash digest
SHA256 2135ecf7911d360c14412c97d98cd63a48f01a9e1ee0cbf02ae021f9f3488a33
MD5 bf7e9bc6ff4b48f2b3aaad8246643848
BLAKE2b-256 63958a715680a22ee7a92cc7697084994971ee42d2ab54ff8116b953b5cbb08e

File details

Details for the file yctmodel-5.4.0-py3-none-any.whl.

File metadata

  • Download URL: yctmodel-5.4.0-py3-none-any.whl
  • Upload date:
  • Size: 7.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.5

File hashes

Hashes for yctmodel-5.4.0-py3-none-any.whl
Algorithm Hash digest
SHA256 3318b594a0ffe308233d803dbfba5c7ac4e2a38368dfd4ea8d995dd619c337fe
MD5 7e54e281c03d183c068986417f63d8ef
BLAKE2b-256 8a086b75df6aa0c5caa06ddbe5cd29cc8dfe56da737a4c1bf9061d7c743e56d6
