Library to compare machine learning methods across datasets

mlgauge

A simple library to benchmark the performance of machine learning methods across different datasets. mlgauge is also a wrapper around PMLB and OpenML, which provide benchmark datasets for machine learning.

mlgauge can help you if

  • You are developing a machine learning method or an AutoML system and want to compare and analyze how it performs against other methods.
  • You are learning different machine learning methods and would like to understand how different methods behave under different conditions.

Check out the documentation to learn more.

Installation

pip install mlgauge

Usage

This is the workflow for setting up and running a comparison benchmark with mlgauge:

  1. Set up your methods by defining a Method class. If your method follows the sklearn API, you can use SklearnMethod directly; it provides a typical sklearn workflow for estimators.
  2. Set up the experiments with the Analysis class.
  3. Collect the results for further comparative analysis.
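The three steps above can be pictured with a dependency-free toy harness. This is only a sketch of the general benchmarking pattern, not mlgauge's implementation or API: the classes `MajorityClass` and `Threshold`, the `accuracy` helper, and the toy datasets are all hypothetical stand-ins chosen for illustration.

```python
from collections import Counter


class MajorityClass:
    """Hypothetical baseline: always predicts the most common training label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


class Threshold:
    """Hypothetical method: predicts 1 when the single feature exceeds the training mean."""

    def fit(self, X, y):
        self.cut_ = sum(x[0] for x in X) / len(X)
        return self

    def predict(self, X):
        return [int(x[0] > self.cut_) for x in X]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Step 1: name the methods to compare.
methods = [("majority", MajorityClass()), ("threshold", Threshold())]

# Toy datasets as (X_train, y_train, X_test, y_test); purely illustrative.
datasets = {
    "separable": ([[0], [1], [2], [8], [9]], [0, 0, 0, 1, 1], [[3], [7]], [0, 1]),
    "imbalanced": ([[0], [1], [2], [9]], [0, 0, 0, 1], [[1], [2]], [0, 0]),
}

# Step 2: run every method on every dataset.
results = {}
for ds_name, (X_tr, y_tr, X_te, y_te) in datasets.items():
    results[ds_name] = {
        name: accuracy(y_te, model.fit(X_tr, y_tr).predict(X_te))
        for name, model in methods
    }

# Step 3: collect the scores, one row per dataset and one column per method.
for ds_name, scores in results.items():
    print(ds_name, scores)
```

The nested dictionary mirrors the shape of a dataset-by-method result table; a library such as mlgauge takes care of dataset loading, splitting, and metric bookkeeping that this sketch hard-codes.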

Example

from mlgauge import Analysis, SklearnMethod
from xgboost import XGBClassifier
from lightgbm import LGBMClassifier
from catboost import CatBoostClassifier
from sklearn.ensemble import GradientBoostingClassifier
import matplotlib.pyplot as plt

SEED = 42

# Each entry pairs a display name with a wrapped sklearn-style estimator
# and the sklearn scorer names used to evaluate it.
# Note: XGBoost's logging parameter is `verbosity`, not `verbose`.
methods = [
    ("xgboost", SklearnMethod(XGBClassifier(n_jobs=-1, verbosity=0), ["accuracy", "f1_micro"])),
    ("lightgbm", SklearnMethod(LGBMClassifier(n_jobs=-1, verbose=0), ["accuracy", "f1_micro"])),
    ("catboost", SklearnMethod(CatBoostClassifier(thread_count=-1, verbose=0), ["accuracy", "f1_micro"])),
    ("gbm", SklearnMethod(GradientBoostingClassifier(verbose=0), ["accuracy", "f1_micro"])),
]

# Benchmark all methods on 10 classification datasets.
# metric_names labels the two metrics above, in the same order.
an = Analysis(
    methods=methods,
    metric_names=["accuracy", "f1 score"],
    datasets="classification",
    n_datasets=10,
    random_state=SEED,
)
an.run()

print(an.get_result_as_df("f1 score"))

                          xgboost  lightgbm  catboost       gbm
datasets
mfeat_morphological      0.674000  0.682000  0.698000  0.700000
labor                    0.800000  0.733333  0.866667  0.800000
analcatdata_aids         0.769231  0.384615  0.538462  0.692308
mofn_3_7_10              1.000000  0.990937  1.000000  1.000000
flags                    0.444444  0.377778  0.355556  0.400000
analcatdata_creditscore  1.000000  1.000000  1.000000  1.000000
mfeat_morphological      0.674000  0.682000  0.698000  0.700000
penguins                 0.988095  0.976190  0.988095  0.988095
glass                    0.730769  0.673077  0.692308  0.711538
iris                     0.973684  0.973684  0.973684  0.973684

an.plot_results("f1 score")

[Figure: boosting plot]
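As one possible follow-up analysis of such a result table, the sketch below summarizes a few of the f1 scores printed above by mean score and by number of datasets won. It deliberately uses plain Python with values copied from the table rather than the DataFrame returned by `get_result_as_df`; the variable names (`scores`, `mean_score`, `wins`) are illustrative and not part of mlgauge.

```python
from statistics import mean

methods = ["xgboost", "lightgbm", "catboost", "gbm"]

# Four rows copied from the printed f1 score table (one value per method).
scores = {
    "labor": [0.800000, 0.733333, 0.866667, 0.800000],
    "analcatdata_aids": [0.769231, 0.384615, 0.538462, 0.692308],
    "flags": [0.444444, 0.377778, 0.355556, 0.400000],
    "glass": [0.730769, 0.673077, 0.692308, 0.711538],
}

# Mean score per method across the selected datasets.
mean_score = {
    m: round(mean(row[i] for row in scores.values()), 4)
    for i, m in enumerate(methods)
}

# Number of datasets on which each method reaches the best score
# (a tie counts as a win for every tied method).
wins = {m: 0 for m in methods}
for row in scores.values():
    best = max(row)
    for m, s in zip(methods, row):
        if s == best:
            wins[m] += 1

print(mean_score)
print(wins)
```

On these four rows, xgboost has the best score on three datasets and catboost on one. With only a handful of small datasets, such tallies are descriptive rather than statistically conclusive.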

More examples are available in the documentation.

Credits

Logo designed by the talented Neha Balasundaram.
