This package streamlines machine learning model training, evaluation, and customization with support for regression and classification tasks, automated workflow, diverse model integration, EDA, serialization, and parameter tuning.

These details have not been verified by PyPI

Project links

Homepage

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

fluid_domain.art import KMEngine

class KMEngine(data, column, test_ratio=0.2)

KMEngine (Karthik's Model Engine) is a versatile tool designed to streamline machine learning model training, evaluation, and customization. With support for both regression and classification tasks, Kmodel_engine offers an automated workflow that includes data preprocessing, feature selection, exploratory data analysis (EDA), model training, and metric evaluation. This class integrates an array of popular machine learning models from libraries like Scikit-Learn, XGBoost, and LightGBM. Additionally, it facilitates model parameter customization, empowering users to efficiently build, assess, and refine machine learning models tailored to their specific datasets and tasks. As of now this class cannot remove outliers, to remove them use O_sieve or IsolationForest

Features

Supports both regression and classification tasks
Integrates a wide range of machine learning models from popular libraries such as Scikit-Learn, XGBoost, LightGBM, and more
Performs automated data preprocessing including missing value imputation, scaling, and one-hot encoding
Allows for exploratory data analysis (EDA) through YData Profiling
Provides various evaluation metrics including accuracy, F1-score, precision, recall, R-squared, adjusted R-squared, mean absolute error (MAE), and root mean squared error (RMSE)
Supports custom parameter tuning for models

Parameters:

data: dataframe
- The data on which the evalution of models should occur.
column: str
- Target column, the dependent variable.
test_ratio: float, default=0.2
- The conventional training and testing split ratio.

Installation

pip install KMEngine

Usage

EDA (EXtrapolatory Data Analysis)

import pandas as pd
from fluid_domain.art import KMEngine
# Reading a dataset using pandas.
df=pd.read_csv('tested.csv')
engine=KMEngine(df,'Survived')
eda=engine.EDA()
# EDA done. Check your working directory for the html file. 
# Produces an html file with the name 'KME_data_report.html' in the working directory.

Classification:

import pandas as pd
from fluid_domain.art import KMEngine
# Reading a dataset using pandas.
df=pd.read_csv('tested.csv')
engine=KMEngine(df,'Survived')
result=engine.super_learning()
print(result)

# Engine Summoned.
# Loaded Data Successfully with 418 rows and 12 columns.
# Building models and Training them. This might take a while...
# Engine Encountered Discrete Data. Hence, proceeding with Classification

# Writing models to respective keys: ['Logistic Regression', 'Random Forest Classifier', 'Decision Tree Classifier', 'Xtreme Gradient Boosting Classifier', 'Stochastic Gradient Descent Classifier', 'Gradient Boosting Classifier', 'Adaptive Boost Classifier', 'Light Gradient Boosting Classifier', 'Extra Trees Classifier', 'Support Vector Classification', 'K Nearest Neighbors Classifier', 'Ridge Classifier', 'MLP Classifier', 'Quadratic Discriminant Analysis', 'Linear Discriminant Analysis', 'Naive Bayes Classifier']

# Currently Running : Logistic Regression
# Currently Running : Random Forest Classifier
# Currently Running : Decision Tree Classifier
# Currently Running : Xtreme Gradient Boosting Classifier
# Currently Running : Stochastic Gradient Descent Classifier
# Currently Running : Gradient Boosting Classifier
# Currently Running : Adaptive Boost Classifier
# Currently Running : Light Gradient Boosting Classifier
# Currently Running : Extra Trees Classifier
# Currently Running : Support Vector Classification
# Currently Running : K Nearest Neighbors Classifier
# Currently Running : Ridge Classifier
# Currently Running : MLP Classifier
# Currently Running : Quadratic Discriminant Analysis
# Currently Running : Linear Discriminant Analysis
# Currently Running : Naive Bayes Classifier

# All models evaluations:
# +----------------------------------------+----------+----------+-----------+--------+---------+
# |                 Model                  | Accuracy | F1-Score | Precision | Recall | ROC AUC |
# +----------------------------------------+----------+----------+-----------+--------+---------+
# |          Logistic Regression           |   1.0    |   1.0    |    1.0    |  1.0   |   1.0   |
# |        Random Forest Classifier        |   1.0    |   1.0    |    1.0    |  1.0   |   1.0   |
# |        Decision Tree Classifier        |   1.0    |   1.0    |    1.0    |  1.0   |   1.0   |
# |  Xtreme Gradient Boosting Classifier   |   1.0    |   1.0    |    1.0    |  1.0   |   1.0   |
# | Stochastic Gradient Descent Classifier |   1.0    |   1.0    |    1.0    |  1.0   |   1.0   |
# |      Gradient Boosting Classifier      |   1.0    |   1.0    |    1.0    |  1.0   |   1.0   |
# |       Adaptive Boost Classifier        |   1.0    |   1.0    |    1.0    |  1.0   |   1.0   |
# |   Light Gradient Boosting Classifier   |   1.0    |   1.0    |    1.0    |  1.0   |   1.0   |
# |         Extra Trees Classifier         |   1.0    |   1.0    |    1.0    |  1.0   |   1.0   |
# |     Support Vector Classification      |   0.98   |   0.97   |    0.97   |  0.97  |   0.97  |
# |     K Nearest Neighbors Classifier     |   0.99   |   0.98   |    0.97   |  1.0   |   0.99  |
# |            Ridge Classifier            |   1.0    |   1.0    |    1.0    |  1.0   |   1.0   |
# |             MLP Classifier             |   1.0    |   1.0    |    1.0    |  1.0   |   1.0   |
# |    Quadratic Discriminant Analysis     |   1.0    |   1.0    |    1.0    |  1.0   |   1.0   |
# |      Linear Discriminant Analysis      |   0.63   |   0.37   |    0.45   |  0.31  |   0.56  |
# |         Naive Bayes Classifier         |   1.0    |   1.0    |    1.0    |  1.0   |   1.0   |
# +----------------------------------------+----------+----------+-----------+--------+---------+
# Time Eaten :3.7687642574310303 secs

Regression

import pandas as pd
from fluid_domain.art import KMEngine
# Reading a dataset using pandas.
df=pd.read_csv('co2.csv')
engine=KMEngine(df,'CO2 Emissions(g/km)')
result=engine.super_learning()
print(result)

# Engine Summoned.
# Loaded Data Successfully with 7385 rows and 12 columns.
# Building models and Training them. This might take a while...
# Engine Encountered Continuous Data. Hence, proceeding with Regression

# Writing models to respective keys: ['Linear Regression', 'Random Forest Regression', 'Light Gradient Boosting Regressor', 'Xtreme Gradient Boosting', 'Decison Tree Regressor', 'Gradient Boosting Regressor', 'Adaptive Boosting Regressor', 'Stochastic Gradient Descent Regressor', 'Support Vector Regression', 'Extra Trees Regressor', 'Ridge Regression', 'Gamma Regressor', 'Huber Regressor', 'Poisson Regressor', 'Lasso Regressor', 'Elastic Net Regressor', 'K Nearest Neighbors Regressor', 'MLP Regressor']

# Currently Running : Linear Regression
# Currently Running : Random Forest Regression
# Currently Running : Light Gradient Boosting Regressor
# Currently Running : Xtreme Gradient Boosting
# Currently Running : Decison Tree Regressor
# Currently Running : Gradient Boosting Regressor
# Currently Running : Adaptive Boosting Regressor
# Currently Running : Stochastic Gradient Descent Regressor
# Currently Running : Support Vector Regression
# Currently Running : Extra Trees Regressor
# Currently Running : Ridge Regression
# Currently Running : Gamma Regressor
# Currently Running : Huber Regressor
# Currently Running : Poisson Regressor
# Currently Running : Lasso Regressor
# Currently Running : Elastic Net Regressor
# Currently Running : K Nearest Neighbors Regressor
# Currently Running : MLP Regressor

# All models evaluations:
# +---------------------------------------+----------------+-------------------+------------+------------+
# |                 Model                 |    R2 Score    | Adjusted R2 Score |    MAE     |    RMSE    |
# +---------------------------------------+----------------+-------------------+------------+------------+
# |           Linear Regression           |      0.89      |        0.89       |   12.25    |   19.71    |
# |        Random Forest Regression       |      0.99      |        0.99       |    2.57    |    6.3     |
# |   Light Gradient Boosting Regressor   |      0.97      |        0.97       |    4.46    |   10.14    |
# |        Xtreme Gradient Boosting       |      0.99      |        0.99       |    2.91    |    6.95    |
# |         Decison Tree Regressor        |      0.98      |        0.98       |    2.48    |    8.39    |
# |      Gradient Boosting Regressor      |      0.97      |        0.97       |    5.53    |   10.53    |
# |      Adaptive Boosting Regressor      |      0.88      |        0.88       |   15.38    |   20.93    |
# | Stochastic Gradient Descent Regressor | -1490880552.75 |   -1493104842.2   | 1606127.27 | 2247205.34 |
# |       Support Vector Regression       |      0.89      |        0.89       |    9.52    |   19.32    |
# |         Extra Trees Regressor         |      0.99      |        0.99       |    2.16    |    6.18    |
# |            Ridge Regression           |      0.89      |        0.89       |   11.74    |   18.88    |
# |            Gamma Regressor            |      0.89      |        0.89       |   12.05    |   20.65    |
# |            Huber Regressor            |      0.82      |        0.82       |    8.17    |   25.38    |
# |           Poisson Regressor           |      0.92      |        0.92       |   10.35    |    17.1    |
# |            Lasso Regressor            |      0.9       |        0.9        |   12.04    |   18.94    |
# |         Elastic Net Regressor         |      0.89      |        0.89       |   11.73    |   19.41    |
# |     K Nearest Neighbors Regressor     |      0.98      |        0.98       |    3.49    |    8.15    |
# |             MLP Regressor             |      0.91      |        0.91       |    9.01    |   17.89    |
# +---------------------------------------+----------------+-------------------+------------+------------+
# Time Eaten :23.698055028915405 secs

Setting custom parameters

import pandas as pd
from fluid_domain.art import KMEngine
# Reading a dataset using pandas.
df=pd.read_csv('tested.csv')
engine=KMEngine(df,'Survived')
custom_model = engine.set_custom_params('Random Forest Classifier', n_estimators=500, max_depth=10, random_state=666)
print(custom_model)

# Engine Summoned.
# Loaded Data Successfully with 418 rows and 12 columns.
# Custom parameters applied for Random Forest Classifier:
# {'bootstrap': True, 'ccp_alpha': 0.0, 'class_weight': None, 'criterion': 'gini', 'max_depth': 10, 'max_features': 'sqrt', 'max_leaf_nodes': None, 'max_samples': None, 'min_impurity_decrease': 0.0, 'min_samples_leaf': 1, 'min_samples_split': 2, 'min_weight_fraction_leaf': 0.0, 'n_estimators': 500, 'n_jobs': None, 'oob_score': False, 'random_state': 666, 'verbose': 0, 'warm_start': False}
# Currently Running : Random Forest Classifier
# Classification Metrics for RandomForestClassifier(max_depth=10, n_estimators=500, random_state=666)
# Accuracy: 1.0
# F1_Score: 1.0
# Precision: 1.0
# Recall: 1.0
# ROC AUC: 1.0

Model saving (General Model)

import pandas as pd
from fluid_domain.art import KMEngine
# Reading a dataset using pandas.
df=pd.read_csv('tested.csv')
engine=KMEngine(df,'Survived')
engine.model_save('Random Forest Classifier')

# This will save the specified model as pickle file in your wroking directory.

Model saving (Custom Model)

The specified custom model will overwrite the existing model in the model dictionary, for space conservation purposes.

import pandas as pd
from fluid_domain.art import KMEngine
# Reading a dataset using pandas.
df=pd.read_csv('tested.csv')
engine=KMEngine(df,'Survived')
custom_model = engine.set_custom_params('Random Forest Classifier', n_estimators=500, max_depth=10, random_state=666)
print(custom_model)
engine.model_save('Random Forest Classifier')

# This will save the updated specified model as pickle file in your wroking directory.

Resuing the saved model.

import pandas as pd
import pickle
saved_model=pickle.load(open('Random Forest Regression.pkl','rb'))
new_data = {
    'Make': ['ACURA','ACURA'],
    'Model': ['MDX 4WD','ILX'],
    'Vehicle Class': ['SUV - SMALL','COMPACT'],
    'Engine Size(L)': [3.5,5],
    'Cylinders': [6,12],
    'Transmission': ['AS6','AM7'],
    'Fuel Type': ['Z','D'],
    'Fuel Consumption City (L/100 km)': [11.2,13.5],
    'Fuel Consumption Hwy (L/100 km)': [10.0, 15.9],
    'Fuel Consumption Comb (L/100 km)': [25.36, 35.8],
    'Fuel Consumption Comb (mpg)': [40, 60]
}

# Convert the dictionary to a pandas DataFrame
new_data_df = pd.DataFrame(new_data)

# Use the saved model to make predictions on the new data
ypred = saved_model.predict(new_data_df)

print(ypred)

Project details

These details have not been verified by PyPI

Project links

Homepage

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Release history Release notifications | RSS feed

This version

1.1.2

Feb 7, 2024

1.0.0

Aug 15, 2023

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

KMEngine-1.1.2.tar.gz (9.1 kB view details)

Uploaded Feb 7, 2024 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

KMEngine-1.1.2-py3-none-any.whl (9.3 kB view details)

Uploaded Feb 7, 2024 Python 3

File details

Details for the file KMEngine-1.1.2.tar.gz.

File metadata

Download URL: KMEngine-1.1.2.tar.gz
Upload date: Feb 7, 2024
Size: 9.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.6.0 importlib_metadata/4.8.2 pkginfo/1.8.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.66.1 CPython/3.10.0

File hashes

Hashes for KMEngine-1.1.2.tar.gz
Algorithm	Hash digest
SHA256	`032a4da7609897e3f3c840208aa08f7aa8963be330017628a56c028ab054027b`
MD5	`bbe52b56506dde73e6a4d631720f2586`
BLAKE2b-256	`e3343169d02611cc916000e957038a5bef9e97e25cfcd81798ecb8077fd5827f`

See more details on using hashes here.

File details

Details for the file KMEngine-1.1.2-py3-none-any.whl.

File metadata

Download URL: KMEngine-1.1.2-py3-none-any.whl
Upload date: Feb 7, 2024
Size: 9.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.6.0 importlib_metadata/4.8.2 pkginfo/1.8.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.66.1 CPython/3.10.0

File hashes

Hashes for KMEngine-1.1.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`89c373d8a60ee21615b48cd250b4b1ae10c240d806aa2f6c9599aeefd196baa9`
MD5	`f98a3889e65d2a98b550d821f4930a81`
BLAKE2b-256	`0d468b3d3b3bba3c3bcd2cf7c5af695b2362563d6ce58eecccc1f401a692646a`

See more details on using hashes here.

KMEngine 1.1.2

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

fluid_domain.art import KMEngine

class KMEngine(data, column, test_ratio=0.2)

Features

Parameters:

Installation

Usage

EDA (EXtrapolatory Data Analysis)

Classification:

Regression

Setting custom parameters

Model saving (General Model)

Model saving (Custom Model)

Resuing the saved model.

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes