SlickML: Slick Machine Learning in Python

SlickML is an open-source machine learning library written in Python that aims to shorten the experimentation time for ML applications. Data scientists' tasks are often repetitive, such as feature selection, model tuning, or evaluating metrics for classification and regression problems. SlickML provides data scientists with a toolbox of utility functions for quickly prototyping solutions to a given problem with minimal code.

Installation

First, install Python version >=3.6 from https://www.python.org, and then run:

pip install slickml

Note: to avoid potential conflicts with other Python packages, it's recommended to install SlickML in a virtual environment, e.g. a Python 3 virtualenv or a Conda environment; see their documentation for details.
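
As one way to follow the note above, a virtual environment can also be created from Python itself with the standard-library venv module. This is only a minimal sketch: the helper name create_env and the ".venv" directory used in the usage note are assumptions, not part of SlickML.

```python
import venv
from pathlib import Path


def create_env(env_dir: str, with_pip: bool = True) -> Path:
    """Create a virtual environment at env_dir and return its path.

    create_env is a hypothetical helper for illustration only.
    """
    path = Path(env_dir)
    # venv.create builds the directory layout and writes pyvenv.cfg;
    # with_pip=True also bootstraps pip into the new environment.
    venv.create(path, with_pip=with_pip)
    return path
```

Calling create_env(".venv"), activating the environment, and then running pip install slickml inside it keeps SlickML's dependencies separate from other projects.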

Quick Start

Here is an example using SlickML to quickly run a feature selection pipeline:

# run feature selection using loaded data
from slickml.feature_selection import XGBoostFeatureSelector
xfs = XGBoostFeatureSelector()
xfs.fit(X, y)

# plot cross-validation results
xfs.plot_cv_results()

# plot feature frequency after feature selection
xfs.plot_frequency()
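
The snippets in this Quick Start assume that the data (X, y) and a train/test split (X_train, X_test, y_train, y_test) are already loaded. For readers who want to see what such a split amounts to, here is a minimal standard-library sketch of a shuffled hold-out split. The function holdout_split and its defaults are hypothetical; the resulting lists would likely need converting to a NumPy array or pandas DataFrame before being passed to SlickML, and that conversion is not shown.

```python
import random
from typing import List, Sequence, Tuple


def holdout_split(
    X: Sequence[Sequence[float]],
    y: Sequence[int],
    test_size: float = 0.25,
    seed: int = 42,
) -> Tuple[List[List[float]], List[List[float]], List[int], List[int]]:
    """Shuffle row indices and split rows into train and test parts.

    holdout_split is a hypothetical helper, not part of SlickML.
    """
    if len(X) != len(y):
        raise ValueError("X and y must have the same number of rows")
    indices = list(range(len(X)))
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(indices)
    n_test = max(1, int(round(len(indices) * test_size)))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    # Features and labels are selected with the same indices, so rows stay aligned.
    X_train = [list(X[i]) for i in train_idx]
    X_test = [list(X[i]) for i in test_idx]
    y_train = [y[i] for i in train_idx]
    y_test = [y[i] for i in test_idx]
    return X_train, X_test, y_train, y_test
```

In practice a library routine for splitting data is usually the better choice; the sketch only shows which four objects the later snippets expect.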

Here is an example of using SlickML to tune hyper-parameters with Bayesian Optimization:

# apply BayesianOpt to tune parameters of classifier using loaded train/test data
from slickml.optimization import XGBoostClassifierBayesianOpt
xbo = XGBoostClassifierBayesianOpt()
xbo.fit(X_train, y_train)

# best parameters
best_params = xbo.get_best_params()
best_params

{'colsample_bytree': 0.8213916662259918,
 'gamma': 1.0,
 'learning_rate': 0.23148232373451072,
 'max_depth': 4,
 'min_child_weight': 5.632602921054691,
 'reg_alpha': 1.0,
 'reg_lambda': 0.39468801734425263,
 'subsample': 1.0}

Here is an example of using SlickML to train and validate an XGBoostCV classifier:

# train a classifier using loaded train/test data and best params
from slickml.classification import XGBoostCVClassifier
clf = XGBoostCVClassifier(params=best_params)
clf.fit(X_train, y_train)
y_pred_proba = clf.predict_proba(X_test)

# plot cross-validation results
clf.plot_cv_results()

# plot feature importance
clf.plot_feature_importance()

# plot SHAP summary violin plot
clf.plot_shap_summary(plot_type="violin")

# plot SHAP summary layered violin plot
clf.plot_shap_summary(plot_type="layered_violin", layered_violin_max_num_bins=5)

# plot SHAP waterfall plot
clf.plot_shap_waterfall()

Here is an example of using SlickML to train and validate a GLMNetCV classifier:

# train a classifier using loaded train/test data and your choice of params
from slickml.classification import GLMNetCVClassifier
clf = GLMNetCVClassifier(alpha=0.3, n_splits=4, metric="roc_auc")
clf.fit(X_train, y_train)
y_pred_proba = clf.predict_proba(X_test)

# plot cross-validation results
clf.plot_cv_results()

# plot coefficient paths
clf.plot_coeff_path()
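
For context on the alpha=0.3 argument above: assuming GLMNetCVClassifier follows the usual glmnet convention (an assumption worth checking against the class docstring), alpha is the elastic-net mixing parameter, with alpha = 0 giving a pure ridge penalty and alpha = 1 a pure lasso penalty. Under that assumption, the penalized logistic objective in glmnet's notation would be:

```latex
\min_{\beta_0,\,\beta}\;
-\frac{1}{N}\sum_{i=1}^{N}
\Big[\, y_i\,(\beta_0 + x_i^{\top}\beta)
      - \log\!\big(1 + e^{\beta_0 + x_i^{\top}\beta}\big) \Big]
\;+\;
\lambda\Big[\, \tfrac{1-\alpha}{2}\,\lVert\beta\rVert_2^2
             + \alpha\,\lVert\beta\rVert_1 \Big]
```

With alpha = 0.3 the bracketed penalty becomes 0.35 times the squared L2 norm plus 0.3 times the L1 norm, so the fit leans toward ridge while still allowing some coefficients to shrink to zero. The coefficient paths plot traces each coefficient as the regularization strength lambda varies.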

Here is an example using SlickML to quickly visualize the binary classification metrics based on multiple calculated thresholds:

# plot binary metrics
from slickml.metrics import BinaryClassificationMetrics
clf_metrics = BinaryClassificationMetrics(y_test, y_pred_proba)
clf_metrics.plot()
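
To make "metrics based on multiple calculated thresholds" concrete, here is a minimal pure-Python sketch that turns predicted probabilities into labels at several cut-offs and reports accuracy, precision, and recall for each. It does not reproduce how BinaryClassificationMetrics is implemented; the function names metrics_at_threshold and sweep_thresholds are hypothetical.

```python
from typing import Dict, List, Sequence


def metrics_at_threshold(
    y_true: Sequence[int],
    y_pred_proba: Sequence[float],
    threshold: float,
) -> Dict[str, float]:
    """Compute accuracy, precision, and recall for one probability cut-off."""
    tp = fp = tn = fn = 0
    for label, proba in zip(y_true, y_pred_proba):
        # A sample is predicted positive when its probability reaches the cut-off.
        predicted = 1 if proba >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            # Remaining case for 0/1 labels: predicted negative, actually positive.
            fn += 1
    total = tp + fp + tn + fn
    return {
        "threshold": threshold,
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


def sweep_thresholds(
    y_true: Sequence[int],
    y_pred_proba: Sequence[float],
    thresholds: Sequence[float],
) -> List[Dict[str, float]]:
    """Evaluate the same predictions at every cut-off in thresholds."""
    return [metrics_at_threshold(y_true, y_pred_proba, t) for t in thresholds]
```

Lowering the threshold typically raises recall at the cost of precision, which is why comparing several thresholds side by side helps when choosing an operating point.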

Here is an example using SlickML to quickly visualize the regression metrics:

# plot regression metrics (y_pred holds predictions from a fitted regression model, not shown here)
from slickml.metrics import RegressionMetrics
reg_metrics = RegressionMetrics(y_test, y_pred)
reg_metrics.plot()
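
As a rough illustration of the kind of quantities a regression report usually covers, the sketch below computes MAE, RMSE, and R-squared in plain Python. Which metrics RegressionMetrics actually reports is not stated here, so this selection is an assumption, and regression_summary is a hypothetical name.

```python
import math
from typing import Dict, Sequence


def regression_summary(
    y_true: Sequence[float],
    y_pred: Sequence[float],
) -> Dict[str, float]:
    """Compute MAE, RMSE, and R-squared for paired targets and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "mae": sum(abs(r) for r in residuals) / n,
        "rmse": math.sqrt(ss_res / n),
        # R-squared is undefined for a constant target; report 0.0 in that case.
        "r2": 1.0 - ss_res / ss_tot if ss_tot else 0.0,
    }
```

MAE and RMSE are expressed in the units of the target, while R-squared compares the model against always predicting the mean of y_true.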

Contributing

Please read the Contributing document to understand the requirements for submitting pull requests. Before starting any major new feature work, please open an issue describing what you plan to work on. This ensures that interested parties can give valuable feedback on the feature and lets others know that you are working on it. Whether your contribution consists of adding new features, optimizing code, or assisting with the documentation, we welcome new contributors of all experience levels. The SlickML community's goals are to be helpful and effective.

Citing SlickML

If you use SlickML in academic work, please consider citing it.

Bibtex Entry:

@software{slickml2020,
  title={SlickML: Slick Machine Learning in Python},
  author={Tahmassebi, Amirhessam and Smith, Trace},
  url={https://github.com/slickml/slick-ml},
  version={0.1.3},
  year={2021},
}
