Automated Machine Learning for Supervised tasks

Project description

mljar-supervised

The new standard in Machine Learning!

Thanks to Automated Machine Learning, you don't need to worry about different machine learning interfaces, and you don't need to know all the algorithms and their hyper-parameters. With AutoML, model tuning and training are painless.

The current version supports only binary classification, with the LogLoss metric as the optimization target.
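For readers unfamiliar with the metric, here is a minimal sketch of binary LogLoss in plain Python. The function name `log_loss` and the clipping constant `eps` are illustrative choices, not taken from the package; the package's own implementation may differ.

```python
import math

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary log loss; y_true holds 0/1 labels, y_prob the predicted P(y=1)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        # Clip probabilities so that log(0) is never evaluated.
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# Confident, correct predictions give a lower loss than uninformative ones.
confident = log_loss([1, 0, 1], [0.9, 0.1, 0.8])
uncertain = log_loss([1, 0, 1], [0.5, 0.5, 0.5])
```

Lower values are better; a model that always predicts 0.5 scores ln(2), which is a useful baseline to compare against.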

Quick example

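import pandas as pd
from supervised.automl import AutoML

# Load the Adult dataset; skipinitialspace drops blanks that follow the delimiter
df = pd.read_csv("https://raw.githubusercontent.com/pplonski/datasets-for-start/master/adult/data.csv", skipinitialspace=True)

# Features: every column except the last one; target: the "income" column
X = df[df.columns[:-1]]
y = df["income"]

# Search for the best model with the default settings
automl = AutoML()
automl.fit(X, y)

# Predict on the same data that was used for training
predictions = automl.predict(X)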

The tuning algorithm

The tuning algorithm was created and developed by Piotr Płoński. It is a heuristic algorithm that combines:

a not-so-random approach
hill climbing

The approach is not-so-random because each algorithm has a predefined set of hyper-parameter values that usually work well. In the first step, an initial set of models is drawn from these not-so-random parameters. Hill climbing is then used to pick the best-performing models and tune them.
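The two phases can be illustrated with a small toy sketch. Everything in it is an assumption made for illustration: the `GRID` values, the `toy_loss` stand-in for a validation score, and the neighbour rule (one grid step in one hyper-parameter). It is not the package's implementation, only one plausible reading of the description above.

```python
import random

# Hypothetical curated grids: values that "usually work" for one algorithm.
GRID = {
    "max_depth": [2, 3, 4, 5, 6, 7, 8],
    "learning_rate": [0.01, 0.025, 0.05, 0.1, 0.2],
}

def toy_loss(params):
    """Stand-in for a validation LogLoss; its minimum is at depth 5, rate 0.05."""
    return abs(params["max_depth"] - 5) * 0.02 + abs(params["learning_rate"] - 0.05) + 0.3

def draw_not_so_random(rng):
    """Sample each hyper-parameter from its curated grid, not from a wide range."""
    return {name: rng.choice(values) for name, values in GRID.items()}

def neighbours(params):
    """Yield parameter sets that differ by one grid step in one hyper-parameter."""
    for name, values in GRID.items():
        i = values.index(params[name])
        for j in (i - 1, i + 1):
            if 0 <= j < len(values):
                yield {**params, name: values[j]}

def tune(start_random_models=10, hill_climbing_steps=3, top_models_to_improve=5, seed=0):
    rng = random.Random(seed)
    # Phase 1: draw the initial models from the curated grids.
    models = [draw_not_so_random(rng) for _ in range(start_random_models)]
    # Phase 2: repeatedly explore the neighbourhood of the best models so far.
    for _ in range(hill_climbing_steps):
        best = sorted(models, key=toy_loss)[:top_models_to_improve]
        for params in best:
            for candidate in neighbours(params):
                if candidate not in models:
                    models.append(candidate)
    return min(models, key=toy_loss)

best = tune()
```

Because the initial models always stay in the pool, the result can never be worse than the best not-so-random draw; hill climbing can only refine it.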

Early stopping is applied to each algorithm used in AutoML.
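As a rough illustration of the idea, the sketch below stops iterating once the validation loss has not improved for a given number of rounds. The function name and the `patience` parameter are hypothetical; how each algorithm in the package configures early stopping is not described here.

```python
def train_with_early_stopping(validation_losses, patience=3):
    """Return (best_iteration, best_loss), stopping after `patience` rounds without improvement."""
    best_iter, best_loss = -1, float("inf")
    for i, loss in enumerate(validation_losses):
        if loss < best_loss:
            best_iter, best_loss = i, loss
        elif i - best_iter >= patience:
            break  # no improvement for `patience` consecutive iterations
    return best_iter, best_loss

# Simulated per-iteration validation losses (illustrative numbers only).
losses = [0.70, 0.60, 0.55, 0.56, 0.57, 0.58, 0.40]
result = train_with_early_stopping(losses, patience=3)
```

In this run training halts at the sixth value, so the later 0.40 is never reached. That is the trade-off of early stopping: it saves time but can miss a late improvement.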

The ensemble algorithm is based on the paper by Caruana et al. on ensemble selection.
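Ensemble selection in that line of work is usually described as greedy forward selection with replacement: models are added one at a time to an averaged ensemble as long as the validation loss improves. The sketch below follows that description under stated assumptions; the function names and toy predictions are illustrative and do not come from the package.

```python
import math

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary log loss with clipped probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

def greedy_ensemble(y_true, model_preds, max_steps=10):
    """Greedy forward selection with replacement over per-model predictions."""
    selected = []                  # indices of chosen models, repeats allowed
    total = [0.0] * len(y_true)    # running sum of the selected predictions
    best_loss = float("inf")
    for _ in range(max_steps):
        step_best = None
        n = len(selected) + 1
        for idx, preds in enumerate(model_preds):
            # Loss of the ensemble if this model were added to the average.
            averaged = [(s + p) / n for s, p in zip(total, preds)]
            loss = log_loss(y_true, averaged)
            if loss < best_loss:
                best_loss, step_best = loss, idx
        if step_best is None:
            break  # no candidate improves the ensemble
        selected.append(step_best)
        total = [s + p for s, p in zip(total, model_preds[step_best])]
    return selected, best_loss

# Toy validation labels and predictions from three hypothetical models.
y_true = [1, 0, 1, 0]
model_preds = [
    [0.9, 0.4, 0.6, 0.1],
    [0.6, 0.1, 0.9, 0.4],
    [0.5, 0.5, 0.5, 0.5],
]
selected, ensemble_loss = greedy_ensemble(y_true, model_preds)
```

Here the first two models make complementary errors, so their average scores better than either model alone, while the uninformative third model is never picked.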

Installation

From the PyPI repository:

pip install mljar-supervised

From source code:

git clone https://github.com/mljar/mljar-supervised.git
cd mljar-supervised
python setup.py install

Python 3.6 is required.

Usage

This is an Automated Machine Learning package, so all the hard tasks are done for you. The interface is simple, but when necessary it lets you control the training process.

Train and predict

automl = AutoML()
automl.fit(X, y)
predictions = automl.predict(X)

By default, training should finish in less than 1 hour, and the following ML algorithms will be checked:

Random Forest
Xgboost
CatBoost
LightGBM
Neural Network
Ensemble

The parameters that you can use to control the training process are:

total_time_limit - the total time, in seconds, that AutoML can spend searching for the best ML model. Default is 3600 seconds.
learner_time_limit - the time limit, in seconds, for training a single model. With k-fold cross validation, the time spent on training is k*learner_time_limit. This parameter is considered only when total_time_limit is set to None. Default is 120 seconds.
algorithms - the list of algorithms that will be checked. Default is ["CatBoost", "Xgboost", "RF", "LightGBM", "NN"].
start_random_models - the number of models to check with the not-so-random algorithm. Default is 10.
hill_climbing_steps - the number of hill climbing steps used in model tuning. Default is 3.
top_models_to_improve - the number of models considered for improvement in each hill climbing step. Default is 5.
train_ensemble - whether an ensemble model is trained at the end of the AutoML fit procedure. Default is True.
verbose - controls printouts. Default is True.
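A short sketch of how these parameters might be passed, assuming they are accepted as keyword arguments of the AutoML constructor in this version. The non-default values are illustrative, and `build_automl`, `max_training_seconds` and `k_folds` are hypothetical helper names introduced only for this example.

```python
# Constructor arguments named in the list above; the values are illustrative.
params = {
    "total_time_limit": 600,                      # 10 minutes instead of the default hour
    "algorithms": ["Xgboost", "RF", "LightGBM"],  # a subset of the default list
    "start_random_models": 5,
    "hill_climbing_steps": 2,
    "top_models_to_improve": 3,
    "train_ensemble": True,
    "verbose": False,
}

def build_automl(**kwargs):
    """Create the AutoML object; requires mljar-supervised to be installed."""
    # Imported lazily so that this sketch loads even without the package.
    from supervised.automl import AutoML
    return AutoML(**kwargs)

def max_training_seconds(learner_time_limit, k_folds):
    """Training time for one model under k-fold cross validation: k * learner_time_limit."""
    return k_folds * learner_time_limit

# With the default learner_time_limit of 120 seconds and 5 folds:
per_model = max_training_seconds(learner_time_limit=120, k_folds=5)
```

With the package installed, `automl = build_automl(**params)` followed by `automl.fit(X, y)` would start the search. Note that learner_time_limit only takes effect when total_time_limit is None, so the per-model figure applies to that configuration.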

Development

Installation

git clone https://github.com/mljar/mljar-supervised.git
virtualenv venv --python=python3.6
source venv/bin/activate
pip install -r requirements.txt

Testing

cd supervised
python -m tests.run_all

Roadmap

The package is under active development! Please expect a lot of changes! A graphical interface for this package will be provided soon (also open source!). Stay tuned.

To be added:

train a single decision tree
create a text report from trained models (maybe with plots from learning)
compute a threshold for model predictions and predict discrete output (labels)
add model/prediction explanations
add support for multiclass classification
add support for regression

Release history

1.1.18

Jul 7, 2025

1.1.17

Apr 1, 2025

1.1.16

Mar 25, 2025

1.1.15

Jan 14, 2025

1.1.14

Nov 12, 2024

1.1.13

Nov 8, 2024

1.1.12

Oct 9, 2024

1.1.11

Sep 10, 2024

1.1.10

Sep 10, 2024

1.1.9

Jun 3, 2024

1.1.8

May 22, 2024

1.1.7

Apr 10, 2024

1.1.6

Mar 8, 2024

1.1.5

Mar 4, 2024

1.1.4

Mar 4, 2024

1.1.3

Jan 22, 2024

1.1.2

Jan 8, 2024

1.1.1

Sep 26, 2023

1.1.0

Sep 21, 2023

1.0.2

Jul 6, 2023

1.0.1

Jul 6, 2023

1.0.0

Jun 26, 2023

0.11.5

Dec 30, 2022

0.11.4

Dec 14, 2022

0.11.3

Aug 16, 2022

0.11.2

Mar 2, 2022

0.11.1

Oct 1, 2021

0.11.0

Sep 6, 2021

0.10.6

Jun 8, 2021

0.10.5

Jun 8, 2021

0.10.4

May 14, 2021

0.10.3

Apr 1, 2021

0.10.2

Mar 17, 2021

0.10.1

Mar 16, 2021

0.10.0

Mar 16, 2021

0.9.1

Mar 2, 2021

0.9.0

Feb 27, 2021

0.8.9

Feb 5, 2021

0.8.8

Jan 30, 2021

0.8.7

Jan 29, 2021

0.8.6

Jan 29, 2021

0.8.5

Jan 29, 2021

0.8.4

Jan 29, 2021

0.8.3

Jan 27, 2021

0.8.2

Jan 27, 2021

0.8.1

Jan 25, 2021

0.8.0

Jan 22, 2021

0.7.20

Jan 14, 2021

0.7.19

Jan 12, 2021

0.7.18

Jan 11, 2021

0.7.17

Jan 11, 2021

0.7.16

Jan 10, 2021

0.7.15

Dec 17, 2020

0.7.14

Dec 16, 2020

0.7.13

Dec 11, 2020

0.7.12

Dec 8, 2020

0.7.11

Dec 3, 2020

0.7.10

Dec 1, 2020

0.7.9

Nov 30, 2020

0.7.8

Nov 27, 2020

0.7.7

Nov 26, 2020

0.7.6

Nov 24, 2020

0.7.5

Nov 23, 2020

0.7.4

Nov 23, 2020

0.7.3

Sep 21, 2020

0.7.2

Sep 15, 2020

0.7.1

Sep 9, 2020

0.7.0

Sep 9, 2020

0.6.1

Aug 28, 2020

0.6.0

Jul 31, 2020

0.5.5

Jul 22, 2020

0.5.4

Jul 21, 2020

0.5.3

Jul 14, 2020

0.5.2

Jul 10, 2020

0.5.1

Jul 9, 2020

0.5.0

Jul 9, 2020

0.4.1

Jul 2, 2020

0.4.0

Jul 2, 2020

0.3.5

May 12, 2020

0.3.4

May 6, 2020

0.3.3

May 6, 2020

0.3.2

May 6, 2020

0.3.1

May 5, 2020

0.3.0

May 5, 2020

0.2.8

Apr 22, 2020

0.2.7

Apr 22, 2020

0.2.6

Apr 21, 2020

0.2.5

Apr 20, 2020

0.2.4

Apr 18, 2020

0.2.3

Apr 17, 2020

0.2.2

Apr 17, 2020

0.2.1

Apr 17, 2020

0.2.0

Apr 16, 2020

0.1.7

Apr 25, 2019

0.1.6

Apr 24, 2019

0.1.5

Apr 23, 2019

This version

0.1.4

Apr 23, 2019

0.1.3

Apr 23, 2019

0.1.2

Apr 13, 2019

0.1.1

Apr 9, 2019

0.1.0

Apr 9, 2019

Download files

Source Distribution

mljar-supervised-0.1.4.tar.gz (25.4 kB)

Uploaded Apr 23, 2019 Source

File details

Details for the file mljar-supervised-0.1.4.tar.gz.

File metadata

Download URL: mljar-supervised-0.1.4.tar.gz
Upload date: Apr 23, 2019
Size: 25.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.6.7

File hashes

Hashes for mljar-supervised-0.1.4.tar.gz
Algorithm	Hash digest
SHA256	`b6e3ed6827fdc65ee077af3e015ac4382aaaf31751f34f2ff234cf8902268ffa`
MD5	`1e585b97130245ea7da220f508403ff3`
BLAKE2b-256	`60ae781a53e7dcd5aeea6445c352d5bffc64e95e9030d9d358b77a4ca2170533`

