Constrained optimization for gradient boosting models with non-decomposable constraints
Constrained Gradient Boosting
Constrained optimization of gradient boosting models, written on top of scikit-learn gradient boosting.
Explore the docs »
About The Project
This is the companion code for the master thesis entitled "". The thesis was carried out at the Bosch Center for AI, and the code is licensed under the GNU Affero General Public License. The library lets users impose a constraint on one type of error, such as the false negative rate, in order to do safe classification with gradient boosting. The results reported in the thesis can be reproduced with the provided examples.
The library also enables users to define their own constraints and apply them to gradient boosting; a rough sketch follows below. To see how this is done in the library itself, visit here.
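As a rough illustration only (not the library's actual interface, which is documented in the link above), a custom constraint can be thought of as a callable that reports how far the model currently violates a target rate. The class name and call signature below are hypothetical:

import numpy as np

# Hypothetical sketch of a user-defined constraint: the real base class and
# method names in constrained_gb may differ. The idea is a callable that
# returns a positive value when the constraint is violated.
class FalsePositiveRateConstraint:
    def __init__(self, threshold):
        self.threshold = threshold  # maximum tolerated false positive rate

    def __call__(self, y_true, y_pred):
        fp = np.sum((y_pred == 1) & (y_true == 0))
        negatives = max(np.sum(y_true == 0), 1)
        return fp / negatives - self.threshold  # <= 0 means the constraint holds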
Built With
The project is written in Python and built on top of scikit-learn gradient boosting. For hyper-parameter optimization, GPyOpt Bayesian optimization is used.
Getting Started
To get a local copy up and running, follow these simple steps.
Prerequisites
To use the constrained_gb library, you need scikit-learn>=0.22.0 installed, which is likely already available if you work with machine learning in Python. To run hyper-parameter optimization with .optimize(), you also need GPyOpt installed.
To install GPyOpt, simply run
pip install gpyopt
If you have problems installing GPyOpt, visit here.
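For orientation, a minimal stand-alone GPyOpt run looks roughly like the sketch below. It is independent of constrained_gb and only illustrates the Bayesian optimization loop that .optimize() relies on; the objective function and search domain here are made up for the example:

import GPyOpt
import numpy as np

# Toy objective to minimize; GPyOpt passes a 2-D array of candidate points.
def objective(x):
    return np.sum((x - 2.0) ** 2, axis=1, keepdims=True)

# Search domain for a single continuous parameter.
domain = [{'name': 'x', 'type': 'continuous', 'domain': (-5, 5)}]

optimizer = GPyOpt.methods.BayesianOptimization(f=objective, domain=domain)
optimizer.run_optimization(max_iter=20)
print(optimizer.x_opt, optimizer.fx_opt)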
Installation
- Clone the repo
git clone https://github.com/maryami66/constrained_gb.git
- Install
pip install constrained_gb
Usage
In this example, we train a classifier on the breast cancer dataset while constraining the false negative rate to at most 0.001.
import constrained_gb as gbmco
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score, recall_score

# Load the data and hold out half of it for evaluation.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=2)

# Constrain the false negative rate to at most 0.001.
constraints = [gbmco.FalseNegativeRate(0.001)]

params = {'constraints': constraints,
          'multiplier_stepsize': 0.01,
          'learning_rate': 0.1,
          'min_samples_split': 99,
          'min_samples_leaf': 19,
          'max_depth': 8,
          'max_leaf_nodes': None,
          'min_weight_fraction_leaf': 0.0,
          'n_estimators': 300,
          'max_features': 'sqrt',
          'subsample': 0.7,
          'random_state': 2
          }

# Fit the constrained classifier and evaluate on the held-out split.
clf = gbmco.ConstrainedClassifier(**params)
clf.fit(X_train, y_train)
test_predictions = clf.predict(X_test)

print("Test F1 Measure: {} \n".format(f1_score(y_test, test_predictions)))
print("Test FNR: {} \n".format(1 - recall_score(y_test, test_predictions)))
License
Distributed under the GNU Affero General Public License v3 or later (AGPLv3+). See LICENSE for more information.
Contact
Maryam Bahrami - maryami_66@yahoo.com
Project Link: https://github.com/maryami66/constrained_gb
Acknowledgements
- My master thesis supervisor at the Bosch Center for AI, Andreas Steimer
- My master thesis supervisor at Hildesheim University, Lukas Brinkmeyer
- My professor at Hildesheim University, Prof. Lars Schmidt-Thieme
- My friend at BCAI, Damir Shakirov, who guided me on hyper-parameter optimization with Bayesian optimization.
- Img Shields
- Best-README-Template