Library to compare machine learning methods across datasets
Project description
A simple library to benchmark the performance of machine learning methods across different datasets. mlgauge also wraps PMLB and OpenML, which provide benchmark datasets for machine learning.
mlgauge can help you if
- You are developing a machine learning method or an AutoML system and want to compare and analyze how it performs against other methods.
- You are learning different machine learning methods and would like to understand how they behave under different conditions.
Check out the documentation to learn more.
Installation
```
pip install mlgauge
```
Usage
This is the workflow for setting up and running a comparison benchmark with mlgauge:
- Set up your methods by defining a Method class. If your method follows the sklearn API, you can directly use the SklearnMethod class, which provides a typical sklearn workflow for estimators.
- Set up the experiments with the Analysis class.
- Collect the results for further comparative analysis.
Example
```python
from mlgauge import Analysis, SklearnMethod
from xgboost import XGBClassifier
from lightgbm import LGBMClassifier
from catboost import CatBoostClassifier
from sklearn.ensemble import GradientBoostingClassifier
import matplotlib.pyplot as plt

SEED = 42

# Wrap each estimator in a SklearnMethod together with the sklearn metrics to compute.
methods = [
    ("xgboost", SklearnMethod(XGBClassifier(n_jobs=-1, verbose=0), ["accuracy", "f1_micro"])),
    ("lightgbm", SklearnMethod(LGBMClassifier(n_jobs=-1, verbose=0), ["accuracy", "f1_micro"])),
    ("catboost", SklearnMethod(CatBoostClassifier(thread_count=-1, verbose=0), ["accuracy", "f1_micro"])),
    ("gbm", SklearnMethod(GradientBoostingClassifier(verbose=0), ["accuracy", "f1_micro"])),
]

# Compare all methods on 10 classification datasets, sampled reproducibly via random_state.
an = Analysis(
    methods=methods,
    metric_names=["accuracy", "f1 score"],
    datasets="classification",
    n_datasets=10,
    random_state=SEED,
)
an.run()

print(an.get_result_as_df("f1 score"))
```
```
                          xgboost  lightgbm  catboost       gbm
datasets
mfeat_morphological      0.674000  0.682000  0.698000  0.700000
labor                    0.800000  0.733333  0.866667  0.800000
analcatdata_aids         0.769231  0.384615  0.538462  0.692308
mofn_3_7_10              1.000000  0.990937  1.000000  1.000000
flags                    0.444444  0.377778  0.355556  0.400000
analcatdata_creditscore  1.000000  1.000000  1.000000  1.000000
mfeat_morphological      0.674000  0.682000  0.698000  0.700000
penguins                 0.988095  0.976190  0.988095  0.988095
glass                    0.730769  0.673077  0.692308  0.711538
iris                     0.973684  0.973684  0.973684  0.973684
```
```python
an.plot_results("f1 score")
plt.show()  # display the comparison plot
```
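Since get_result_as_df returns a pandas DataFrame, further comparative analysis needs nothing beyond pandas itself. The following is a minimal sketch, assuming the datasets-by-methods layout shown in the output above; it ranks the methods within each dataset and averages the ranks to get a rough overall ordering:

```python
# f1 scores per dataset (rows) and per method (columns), as printed above
scores = an.get_result_as_df("f1 score")

# Rank methods within each dataset (1 = best f1 score),
# then average the ranks across datasets.
mean_ranks = scores.rank(axis=1, ascending=False).mean().sort_values()
print(mean_ranks)
```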
More examples are available in the documentation.
Credits
Logo designed by the talented Neha Balasundaram.
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
mlgauge-0.3.3.tar.gz (12.1 kB)
Built Distribution
mlgauge-0.3.3-py3-none-any.whl (12.0 kB)
File details
Details for the file mlgauge-0.3.3.tar.gz.
File metadata
- Download URL: mlgauge-0.3.3.tar.gz
- Upload date:
- Size: 12.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/4.0.1 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.60.0 CPython/3.9.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | 1a7f85d0e68f196e76603955ff1e795793cce287421188a7acb44b2cfaad0fa0
MD5 | 47eba9a71d61b1d6af9e3c1b9d576707
BLAKE2b-256 | 5a738b09d3cd2ccd110debd921172315eccbd7b64c307148e32a08916162703e
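If you download the sdist manually, you can check it against the SHA256 digest listed above. Here is a minimal sketch using Python's standard hashlib module; the local file path is an assumption, so adjust it to wherever you saved the file:

```python
import hashlib

# Path to the locally downloaded sdist (assumed; adjust as needed).
path = "mlgauge-0.3.3.tar.gz"

with open(path, "rb") as f:
    digest = hashlib.sha256(f.read()).hexdigest()

# SHA256 digest listed in the table above.
expected = "1a7f85d0e68f196e76603955ff1e795793cce287421188a7acb44b2cfaad0fa0"
print("OK" if digest == expected else "MISMATCH")
```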
File details
Details for the file mlgauge-0.3.3-py3-none-any.whl.
File metadata
- Download URL: mlgauge-0.3.3-py3-none-any.whl
- Upload date:
- Size: 12.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/4.0.1 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.60.0 CPython/3.9.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | 35f004a3af74c1b017f83ec79c3764bc7db844d8b9e3079871eaf7632d766349
MD5 | 352d9a91f1cd8a8900c74e30af766268
BLAKE2b-256 | b2990f551dd310c249440635f7928aece40cbe2c198e71b64b6dd510832512b9