Calculate a Machine Learning (ML) performance metric called MLcps: ML Cumulative Performance Score.
MLcps
MLcps (Machine Learning cumulative performance score) is a performance metric that combines multiple performance metrics into a single cumulative score, enabling researchers to compare ML models using one metric. It helps identify the best-performing ML model on a given dataset.
Requirements
Note: If you want to use MLcps without installing it on your local machine, please see the "Binder environment for MLcps" section.
- Python >=3.8
- R >=4.0. R should be accessible through terminal/command prompt.
- The radarchart, tibble, and dplyr R packages. MLcps can install these packages at first import if they are unavailable, but we highly recommend installing them before using MLcps. Run the following R code in the R environment to install them:
## Install the unavailable packages
install.packages(c('radarchart','tibble','dplyr'),dependencies = TRUE,repos="https://cloud.r-project.org")
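Because R must be reachable from the terminal, it can help to confirm that before the first import. The following is a minimal sketch, not part of MLcps: the helper name `r_available` is hypothetical, and the choice of `Rscript` as the executable to look for is an assumption about how R is invoked.

```python
import shutil
import subprocess


def r_available(executable="Rscript"):
    """Return True if the given executable is found on PATH."""
    return shutil.which(executable) is not None


if r_available():
    # Older R releases print the version to stderr, newer ones to stdout,
    # so fall back to stderr when stdout is empty.
    result = subprocess.run(["Rscript", "--version"], capture_output=True, text=True)
    print((result.stdout or result.stderr).strip())
else:
    print("Rscript not found on PATH; install R >= 4.0 or fix PATH first.")
```

If the check fails even though R is installed, the R installation directory is probably missing from the PATH environment variable.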
Installation
pip install MLcps
Binder environment for MLcps
As an alternative, we have built a binder computational environment where all the requirements are pre-installed for MLcps. It allows the user to use MLcps without any installation.
To launch the example Jupyter notebook in the binder environment, please click here. It may take a while to launch!
Usage
Quick Start
#import MLcps
from MLcps import getCPS
#calculate Machine Learning cumulative performance score
cps=getCPS.calculate(object)
- object: Either a pandas DataFrame in which rows are metric scores and columns are ML models, or a GridSearchCV object.
- cps: For a DataFrame input, a pandas DataFrame with the model names and their corresponding MLcps. For a GridSearchCV input, an updated GridSearchCV object.
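To make the expected DataFrame layout concrete, here is a small sketch that builds a table with metrics as rows and models as columns. The model names, metric names, and scores are invented for illustration; only the row/column orientation comes from the description above.

```python
import pandas as pd

# Hypothetical scores. Outer keys become columns (ML models),
# inner keys become rows (metrics).
scores = {
    "LogisticRegression": {"Accuracy": 0.86, "F1": 0.81, "Precision": 0.84, "Recall": 0.79},
    "RandomForest": {"Accuracy": 0.90, "F1": 0.87, "Precision": 0.88, "Recall": 0.86},
    "SVM": {"Accuracy": 0.88, "F1": 0.83, "Precision": 0.89, "Recall": 0.78},
}

metrics = pd.DataFrame(scores)
print(metrics)

# A table in this layout is what would be passed on, e.g.:
# cps = getCPS.calculate(metrics)
```

If your scores are stored with models as rows instead, `metrics.T` transposes the table into the layout described above.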
Example 1
Calculate MLcps for a pandas DataFrame where rows are metric scores and columns are ML models.
#import MLcps
from MLcps import getCPS
#read input data (a dataframe) or load the example data
metrics=getCPS.sample_metrics()
#calculate Machine Learning cumulative performance score
cpsScore=getCPS.calculate(metrics)
#########################################################
#plot MLcps
import plotly.express as px
from plotly.offline import plot
#labels must be a dict that maps a column name to its display label
fig = px.bar(cpsScore, x='Score', y='Algorithms', color='Score', labels={'Score': 'MLcps Score'},
width=700, height=1000, text_auto=True)
fig.update_xaxes(title_text="MLcps")
plot(fig)
Example 2
Calculate MLcps using the mean test score of every metric available in the given GridSearchCV object and return an updated GridSearchCV object. The returned object contains mean_test_MLcps and rank_test_MLcps arrays, which can be used to rank the models in the same way as any other metric.
#import MLcps
from MLcps import getCPS
#load your own GridSearchCV object or the example object shipped with the package
gsObj=getCPS.sample_GridSearch_Object()
#calculate Machine Learning cumulative performance score
gsObj_updated=getCPS.calculate(gsObj)
#########################################################
#access MLcps
gsObj_updated.cv_results_["mean_test_MLcps"]
#access rank array based on MLcps
gsObj_updated.cv_results_["rank_test_MLcps"]
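As a sketch of how the two arrays might be used together, the snippet below picks the best candidate from a `cv_results_`-style dictionary. The dictionary is a hand-written stand-in with invented values, `best_candidate` is a hypothetical helper, and the code assumes that rank_test_MLcps follows the usual scikit-learn convention in which rank 1 is the best candidate.

```python
# Hypothetical stand-in for gsObj_updated.cv_results_; the values are made up.
cv_results = {
    "params": [{"C": 0.1}, {"C": 1}, {"C": 10}],
    "mean_test_MLcps": [0.42, 0.57, 0.51],
    "rank_test_MLcps": [3, 1, 2],
}


def best_candidate(results, metric="MLcps"):
    """Return (params, mean score) of the candidate with the lowest rank."""
    ranks = list(results[f"rank_test_{metric}"])
    best = ranks.index(min(ranks))  # assumes rank 1 marks the best candidate
    return results["params"][best], results[f"mean_test_{metric}"][best]


params, score = best_candidate(cv_results)
print(params, score)
```

The same helper works for any other metric stored under `mean_test_<name>` and `rank_test_<name>` keys by changing the `metric` argument.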
Example 3
In some cases, certain metrics matter more than others. For example, if the dataset is imbalanced, a high F1 score might be preferred over high accuracy. In such a scenario, a user can provide weights for the metrics of interest when calculating MLcps. Weights should be a dictionary whose keys are metric names and whose values are the corresponding weights. The dictionary is passed as a parameter to the getCPS.calculate() function.
- 3.a)
#import MLcps
from MLcps import getCPS
#read input data (a dataframe) or load the example data
metrics=getCPS.sample_metrics()
#define weights
weights={"Accuracy":0.75,"F1": 1.25}
#calculate Machine Learning cumulative performance score
cpsScore=getCPS.calculate(metrics,weights)
- 3.b)
#import MLcps
from MLcps import getCPS
#########################################################
#load your own GridSearchCV object or the example object shipped with the package
gsObj=getCPS.sample_GridSearch_Object()
#define weights
weights={"accuracy":0.75,"f1": 1.25}
#calculate Machine Learning cumulative performance score
gsObj_updated=getCPS.calculate(gsObj,weights)
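Note that the weight keys are capitalized in 3.a and lowercase in 3.b, which suggests they need to match the metric names as they appear in the input. The sketch below checks a weights dictionary against a list of metric names before it is used. The helper `unknown_weight_keys` is hypothetical and not part of MLcps, and exact, case-sensitive matching is an assumption rather than documented behavior.

```python
def unknown_weight_keys(weights, metric_names):
    """Return the weight keys that do not match any available metric name."""
    available = set(metric_names)
    return sorted(key for key in weights if key not in available)


weights = {"Accuracy": 0.75, "F1": 1.25}

# Metric names as they might appear as DataFrame row labels.
print(unknown_weight_keys(weights, ["Accuracy", "F1", "Precision"]))  # []

# Lowercase names, as in a GridSearchCV scoring setup.
print(unknown_weight_keys(weights, ["accuracy", "f1", "precision"]))  # ['Accuracy', 'F1']
```

For a DataFrame input, `list(metrics.index)` would supply the metric names to check against.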
Links
- MLcps source code and a Jupyter notebook with sample analyses are available on the MLcps GitHub repository and Binder.
- Please use the MLcps GitHub repository to report all the issues.
Citation Information
If MLcps helps you in any way in your research work, please cite the MLcps bioRxiv preprint or the final publication.