SmplML is a user-friendly Python module for streamlined machine learning classification and regression. It offers intuitive functionality for data preprocessing, model training, and evaluation. Ideal for beginners and experts alike, SmplML simplifies ML tasks, enabling you to gain valuable insights from your data with ease.

Development Status
- 5 - Production/Stable
Intended Audience
- Science/Research
License
- OSI Approved :: MIT License
Operating System
- Microsoft :: Windows :: Windows 10
Programming Language
- Python :: 3

Project description

SmplML / SimpleML: Simplified Machine Learning for Classification and Regression

Features

Data preprocessing: Easily handle encoding categorical variables and data partitioning.
Model training: Train various classification and regression models with just a few lines of code.
Model evaluation: Evaluate model performance using common metrics.
Scikit-learn compatibility: Works with scikit-learn models and with other models that follow an sklearn-like format.
Multi-model training: Train several models in one run to compare them during experimentation.

Installation

You can install SmplML using pip:

pip install SmplML

Usage

The TrainModel class is designed to handle both classification and regression tasks. It determines the task type based on the target parameter. If the target has a float data type, the class automatically redirects the procedures to regression; otherwise, it assumes a classification task.
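The dtype rule described above can be illustrated with a minimal sketch. The helper name infer_task and the toy data are hypothetical and are not part of SmplML; the sketch only assumes that "float data type" means a pandas float dtype.

```python
import pandas as pd
from pandas.api.types import is_float_dtype


def infer_task(df: pd.DataFrame, target: str) -> str:
    """Return 'regression' for a float target column, else 'classification'.

    Hypothetical helper that mirrors the rule stated in the text;
    it is not SmplML's own code.
    """
    return "regression" if is_float_dtype(df[target]) else "classification"


# Toy frame standing in for the penguins data used later on this page.
toy = pd.DataFrame({
    "sex": ["Male", "Female", "Female"],
    "bill_length_mm": [39.1, 39.5, 40.3],
    "year": [2007, 2008, 2009],
})

print(infer_task(toy, "sex"))             # string column
print(infer_task(toy, "bill_length_mm"))  # float column
print(infer_task(toy, "year"))            # integer column
```

Under this rule an integer-valued target is treated as a classification target, so a continuous target stored as integers would likely need casting to float first.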

Data Preparation

Data preparation, such as splitting the data and converting categorical columns to numerical values, is executed automatically when the fit() method is called.

import seaborn as sns
import pandas as pd
from smpl_ml.smpl_ml import TrainModel
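For readers who want to see what this kind of automatic preparation involves, here is a rough equivalent written with pandas and scikit-learn directly. It is a sketch under assumptions: this page does not say which encoding SmplML applies, so the one-hot encoding, the toy data, and the test_size value below are illustrative choices only.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Small stand-in for a cleaned dataset with mixed column types.
data = pd.DataFrame({
    "species": ["Adelie", "Gentoo", "Chinstrap", "Adelie", "Gentoo", "Adelie"],
    "body_mass_g": [3750.0, 5000.0, 3800.0, 3450.0, 5200.0, 3650.0],
    "sex": ["Male", "Female", "Female", "Female", "Male", "Male"],
})

target = "sex"
features = [c for c in data.columns if c != target]

# One-hot encode categorical feature columns; numeric columns pass through.
# (Assumption: SmplML may encode differently.)
X = pd.get_dummies(data[features])
y = data[target]

# Hold out part of the rows for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=0
)

print(X.columns.tolist())
print(len(X_train), len(X_test))
```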

Classification Task

from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier

df = sns.load_dataset('penguins')
df.head()

	species	island	bill_length_mm	bill_depth_mm	flipper_length_mm	body_mass_g	sex
0	Adelie	Torgersen	39.1	18.7	181.0	3750.0	Male
1	Adelie	Torgersen	39.5	17.4	186.0	3800.0	Female
2	Adelie	Torgersen	40.3	18.0	195.0	3250.0	Female
3	Adelie	Torgersen	NaN	NaN	NaN	NaN	NaN
4	Adelie	Torgersen	36.7	19.3	193.0	3450.0	Female

clf_target = 'sex'
clf_features = df.iloc[:, df.columns != clf_target].columns

print(f"Class: {clf_target}")
print(f"Features: {clf_features}")

Class: sex
Features: Index(['species', 'island', 'bill_length_mm', 'bill_depth_mm',
       'flipper_length_mm', 'body_mass_g'],
      dtype='object')

Single Classification Model Training

# Initialize TrainModel object
clf_trainer = TrainModel(df.dropna(), target=clf_target, features=clf_features, models=LogisticRegression(C=0.01, max_iter=10_000))

# Fit the object
clf_trainer.fit()

	Recall	Specificity	Precision	F1-Score	Accuracy
Male	0.85	0.82	0.83	0.84	0.84
Female	0.82	0.85	0.84	0.83	0.84

The dataframe displayed when calling the fit() method contains the training results. This output can be suppressed by setting verbose=False.

# Evaluate the model
clf_trainer.evaluate()

	Recall	Specificity	Precision	F1-Score	Accuracy
Male	0.73	0.86	0.83	0.78	0.8
Female	0.86	0.73	0.77	0.81	0.8

The dataframe displayed when calling the evaluate() method contains the testing results. This output can be suppressed by setting verbose=False.
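The columns in these tables follow the standard confusion-matrix definitions. The sketch below shows those definitions for a two-class problem, using made-up counts rather than the penguins results; the function name binary_metrics is hypothetical and this is not SmplML's implementation.

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard metrics with one label treated as the positive class."""
    recall = tp / (tp + fn)           # share of actual positives found
    specificity = tn / (tn + fp)      # share of actual negatives found
    precision = tp / (tp + fp)        # share of predicted positives that are right
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "Recall": recall,
        "Specificity": specificity,
        "Precision": precision,
        "F1-Score": f1,
        "Accuracy": accuracy,
    }


# Made-up counts with "Male" as the positive class.
male = binary_metrics(tp=30, fp=5, fn=10, tn=35)

# Treating "Female" as positive swaps the roles of the same four counts.
female = binary_metrics(tp=35, fp=10, fn=5, tn=30)

for label, row in (("Male", male), ("Female", female)):
    print(label, {k: round(v, 2) for k, v in row.items()})
```

This also explains a pattern visible in the tables: with two classes, the recall of one class equals the specificity of the other, and accuracy is identical in both rows.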

# Access the fitted model
clf_trainer.fitted_models_dict

{'LogisticRegression': LogisticRegression(C=0.01, max_iter=10000)}

Multiple Classification Model Training

# Initialize TrainModel object
clfs = [LogisticRegression(), DecisionTreeClassifier(), RandomForestClassifier(), SVC(), KNeighborsClassifier()]

clf_trainer = TrainModel(df.dropna(), target=clf_target, features=clf_features, models=clfs, test_size=0.4)

# Fit the object
clf_trainer.fit(verbose=False)

# Evaluate the model
clf_trainer.evaluate(verbose=True)

	Recall	Specificity	Precision	F1-Score	Accuracy
Male	0.76	0.81	0.82	0.79	0.78
Female	0.81	0.76	0.75	0.78	0.78

	Recall	Specificity	Precision	F1-Score	Accuracy
Male	0.86	0.83	0.85	0.85	0.84
Female	0.83	0.86	0.84	0.83	0.84

	Recall	Specificity	Precision	F1-Score	Accuracy
Male	0.84	0.86	0.87	0.85	0.85
Female	0.86	0.84	0.83	0.84	0.85

	Recall	Specificity	Precision	F1-Score	Accuracy
Male	0.49	0.73	0.67	0.57	0.6
Female	0.73	0.49	0.56	0.63	0.6

	Recall	Specificity	Precision	F1-Score	Accuracy
Male	0.74	0.78	0.79	0.76	0.76
Female	0.78	0.74	0.73	0.75	0.76

Results

clf_trainer.results_df

	Model	Accuracy
0	RandomForestClassifier	0.85
1	DecisionTreeClassifier	0.84
2	LogisticRegression	0.78
3	KNeighborsClassifier	0.76
4	SVC	0.60

clf_trainer.fitted_models_dict

{'LogisticRegression': LogisticRegression(),
 'DecisionTreeClassifier': DecisionTreeClassifier(),
 'RandomForestClassifier': RandomForestClassifier(),
 'SVC': SVC(),
 'KNeighborsClassifier': KNeighborsClassifier()}

Accuracy results and the fitted models can be accessed through the results_df and fitted_models_dict attributes.
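To make the shape of these two attributes concrete, the following sketch builds a similar ranking table and a name-to-model dictionary using scikit-learn alone. It runs on synthetic data, the variable names fitted and results are hypothetical, and it is only an approximation of the behaviour shown above, not SmplML's internals.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data so the sketch runs without downloading anything.
X, y = make_classification(n_samples=200, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=0
)

models = [LogisticRegression(max_iter=1000), DecisionTreeClassifier(random_state=0)]

fitted = {}  # class name -> fitted estimator
rows = []    # one accuracy record per model
for model in models:
    name = type(model).__name__
    fitted[name] = model.fit(X_train, y_train)
    acc = accuracy_score(y_test, model.predict(X_test))
    rows.append({"Model": name, "Accuracy": round(acc, 2)})

# Best model first, mirroring the ordering of the table above.
results = (
    pd.DataFrame(rows)
    .sort_values("Accuracy", ascending=False)
    .reset_index(drop=True)
)
print(results)
print(fitted)
```

One consequence of keying the dictionary by class name, at least in this sketch, is that two instances of the same model class would overwrite each other.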

Regression Task

from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.svm import SVR
from sklearn.ensemble import GradientBoostingRegressor

df = sns.load_dataset('penguins')
df.head()

	species	island	bill_length_mm	bill_depth_mm	flipper_length_mm	body_mass_g	sex
0	Adelie	Torgersen	39.1	18.7	181.0	3750.0	Male
1	Adelie	Torgersen	39.5	17.4	186.0	3800.0	Female
2	Adelie	Torgersen	40.3	18.0	195.0	3250.0	Female
3	Adelie	Torgersen	NaN	NaN	NaN	NaN	NaN
4	Adelie	Torgersen	36.7	19.3	193.0	3450.0	Female

reg_target = 'bill_length_mm'
reg_features = df.iloc[:, df.columns != reg_target].columns

print(f"Class: {reg_target}")
print(f"Features: {reg_features}")

Class: bill_length_mm
Features: Index(['species', 'island', 'bill_depth_mm', 'flipper_length_mm',
       'body_mass_g', 'sex'],
      dtype='object')

Single Regression Model Training

# Initialize TrainModel object
reg_trainer = TrainModel(df.dropna(), 
                         target=reg_target, 
                         features=reg_features,
                         models=LinearRegression())

# Fit the object
reg_trainer.fit(verbose=False)

# Evaluate the model
reg_trainer.evaluate()

	MSE	RMSE	MAE	R-squared
Metrics	6.3	2.51	1.91	0.81

# Access the model
reg_trainer.fitted_models_dict

{'LinearRegression': LinearRegression()}
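The four regression columns reported by evaluate() (MSE, RMSE, MAE, R-squared) have standard definitions, sketched below in plain Python on made-up numbers. The function name regression_metrics is hypothetical, and this is an illustration of the formulas rather than SmplML's own code.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute MSE, RMSE, MAE and R-squared from paired values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    ss_res = sum(e * e for e in errors)               # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares

    mse = ss_res / n
    return {
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": sum(abs(e) for e in errors) / n,
        "R-squared": 1 - ss_res / ss_tot,
    }


# Made-up values, not taken from the penguins dataset.
metrics = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.5, 5.5, 7.0, 8.0])
print({k: round(v, 3) for k, v in metrics.items()})
```

RMSE is simply the square root of MSE, which is consistent with the table above: the square root of 6.3 is about 2.51.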

Multiple Regression Model Training

# Initialize TrainModel object
regs = [LinearRegression(), DecisionTreeRegressor(), RandomForestRegressor(), SVR(), GradientBoostingRegressor()]

reg_trainer = TrainModel(df.dropna(), target=reg_target, features=reg_features, models=regs, test_size=0.4)

# Fit the object
reg_trainer.fit(verbose=False)

# Evaluate the model
reg_trainer.evaluate(verbose=False)

Results

reg_trainer.results_df

	Model	MSE	RMSE	MAE	R-squared
0	RandomForestRegressor	5.74	2.40	1.87	0.81
1	GradientBoostingRegressor	6.58	2.57	1.94	0.79
2	DecisionTreeRegressor	6.98	2.64	2.06	0.77
3	LinearRegression	7.63	2.76	2.11	0.75
4	SVR	21.51	4.64	3.63	0.31

reg_trainer.fitted_models_dict

{'LinearRegression': LinearRegression(),
 'DecisionTreeRegressor': DecisionTreeRegressor(),
 'RandomForestRegressor': RandomForestRegressor(),
 'SVR': SVR(),
 'GradientBoostingRegressor': GradientBoostingRegressor()}

Change Log

1.0.6 (06/13/2023)

Added regression
Modified docstrings
Added pre-defined function
Fixed local issues
Added training feature for training multiple models.

Release history

1.0.6 (Jun 13, 2023)
1.0.5 (Jun 12, 2023)
1.0.4 (Jun 12, 2023)
1.0.3 (Jun 12, 2023)
1.0.2 (Jun 12, 2023)
1.0.1 (Jun 12, 2023)
1.0.0 (Jun 12, 2023)

Download files

Source Distribution

SmplML-1.0.6.tar.gz (8.7 kB)

Uploaded Jun 13, 2023 Source

Built Distribution

SmplML-1.0.6-py3-none-any.whl (10.2 kB)

Uploaded Jun 13, 2023 Python 3

File details

Details for the file SmplML-1.0.6.tar.gz.

File metadata

Download URL: SmplML-1.0.6.tar.gz
Upload date: Jun 13, 2023
Size: 8.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.9.0

File hashes

Hashes for SmplML-1.0.6.tar.gz
Algorithm	Hash digest
SHA256	`c8a8e5909026cce70e7680aaacf81cbc270c0f3e576b344fbe114e49904d5ef7`
MD5	`9c51f2393997b2d016b46288bbd92a0e`
BLAKE2b-256	`42e6c5a3a776288194371bd91d09b794eb05e8dc012e575daca6c8a45241950c`

File details

Details for the file SmplML-1.0.6-py3-none-any.whl.

File metadata

Download URL: SmplML-1.0.6-py3-none-any.whl
Upload date: Jun 13, 2023
Size: 10.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.9.0

File hashes

Hashes for SmplML-1.0.6-py3-none-any.whl
Algorithm	Hash digest
SHA256	`e89b005fd2511f511d2c56a14298e29b0d78294f32d852c6f0935c5fbe92bf16`
MD5	`76f66558f702360050315ecaf1cf8a6c`
BLAKE2b-256	`a4367df0bbf240cb6fcecc9be142deed48205797da2817206e08b75f7ffaf005`
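
The published digests can be used to check a downloaded file before installing it. The sketch below uses only Python's standard hashlib module; the helper names sha256_of and matches_published are hypothetical, and the file path assumes the wheel was saved to the current directory.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()


# Assumed local path; the digest is the SHA256 value listed above for the wheel.
wheel = Path("SmplML-1.0.6-py3-none-any.whl")
expected = "e89b005fd2511f511d2c56a14298e29b0d78294f32d852c6f0935c5fbe92bf16"

if wheel.exists():
    print("digest matches:", matches_published(wheel, expected))
else:
    print("wheel not found; download it first")
```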
