RolexBoost

Unofficial implementation of D. Yang, H. Lee and D. Lim, "RolexBoost: A Rotation-Based Boosting Algorithm With Adaptive Loss Functions," in IEEE Access, vol. 8, pp. 41037-41044, 2020, doi: 10.1109/ACCESS.2020.2976822.

This is the course project of Fundamentals of Machine Learning, Tsinghua University, 2020 Autumn.

Installation

pip install rolexboost

API reference

We provide a scikit-learn-like API for the RolexBoost algorithm proposed in the paper, together with RotationForest and FlexBoost, the two algorithms from which RolexBoost draws its ideas.

Note that

  1. Only classifiers are provided. We did not implement regressors because they are not mentioned in the paper, and this project is intended as a reproduction.
  2. We only ensure that the fit and predict APIs work well. Some others, such as score, may be functional thanks to the scikit-learn BaseEstimator and ClassifierMixin base classes, but others, such as fit_predict or predict_proba, are currently unavailable.

We may implement the missing pieces in the future if someone is interested in this project.

Basic Example

>>> import pandas as pd
>>> import numpy as np
>>> from rolexboost import RolexBoostClassifier, FlexBoostClassifier, RotationForestClassifier

>>> clf = RolexBoostClassifier() # Or the other two classifiers

>>> df = pd.DataFrame({"A": [2,6,5,7,1,8], "B":[8,5,2,3,4,6], "C": [3,9,5,4,6,1], "Label": [0,1,1,0,0,1]})
>>> df
   A  B  C  Label
0  2  8  3      0
1  6  5  9      1
2  5  2  5      1
3  7  3  4      0
4  1  4  6      0
5  8  6  1      1

>>> X = df[["A", "B", "C"]]
>>> y = df["Label"]

>>> clf.fit(X, y)
RolexBoostClassifier()
>>> clf.predict(X)
array([0, 1, 1, 0, 0, 1], dtype=int64)

>>> test_X = np.array([
...     [3,1,2],
...     [2,5,1],
...     [5,1,7]
... ])

>>> clf.predict(test_X)
array([1, 0, 1], dtype=int64)
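
As noted above, score is not officially supported, but it may work through scikit-learn's ClassifierMixin, whose default implementation calls predict and returns the mean accuracy (here 1.0, since the training data above was predicted perfectly):

>>> clf.score(X, y)  # may work via ClassifierMixin; not guaranteed
1.0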

Rotation Forest

RotationForestClassifier(
    n_estimators=100,
    n_features_per_subset=3,
    bootstrap_rate=0.75,
    criterion="gini",
    splitter="best",
    max_depth=None,
    min_samples_split=2,
    min_samples_leaf=1,
    min_weight_fraction_leaf=0.0,
    max_features=None,
    random_state=None,
    max_leaf_nodes=None,
    min_impurity_decrease=0.0,
    min_impurity_split=None,
    class_weight=None,
    presort="deprecated",
    ccp_alpha=0.0,
)
  • n_estimators: number of base estimators
  • n_features_per_subset: number of features in each subset
  • bootstrap_rate: ratio of samples bootstrapped from the original dataset

All other parameters are passed to the DecisionTreeClassifier of scikit-learn. Please refer to their documentation for details.
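
For example (an illustrative configuration, not a recommendation), tree-specific arguments such as max_depth or criterion are forwarded unchanged to each base tree:

>>> clf = RotationForestClassifier(n_estimators=50, n_features_per_subset=3, max_depth=4, criterion="entropy")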

Note:

In the paper's algorithm description, a parameter controls the number of subsets, and the number of features per subset is derived from it. However, the validation part of the paper says that "the number of features in each subset was set to three". Our implementation therefore exposes n_features_per_subset directly, which makes the benchmark evaluation easier; the sketch below illustrates the difference.
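
A sketch of the bookkeeping only (not the library's internal code): fixing the subset size means the number of subsets falls out of the feature count, rather than the other way around.

import numpy as np

# Sketch only: partition a random permutation of feature indices into
# groups of n_features_per_subset; the number of groups is derived from
# the feature count (the last group may be smaller).
n_features = 7
n_features_per_subset = 3
indices = np.random.permutation(n_features)
subsets = [indices[i:i + n_features_per_subset]
           for i in range(0, n_features, n_features_per_subset)]
# e.g. subsets of sizes 3, 3 and 1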

FlexBoost

FlexBoostClassifier(
    n_estimators=100,
    K=0.5,
    criterion="gini",
    splitter="best",
    max_depth=1,
    min_samples_split=2,
    min_samples_leaf=1,
    min_weight_fraction_leaf=0.0,
    max_features=None,
    random_state=None,
    max_leaf_nodes=None,
    min_impurity_decrease=0.0,
    min_impurity_split=None,
    class_weight=None,
    presort="deprecated",
    ccp_alpha=0.0,
)
  • n_estimators: number of base estimators
  • K: controls the balance between "aggressive" and "conservative" choices in the adaptive loss function selection process. It should be a number between 0 and 1.

All other parameters are passed to the DecisionTreeClassifier of scikit-learn. Please refer to their documentation for details.

The default max_depth is 1 because FlexBoost is a modification of AdaBoost, and the two should converge to the same result when K=1. In scikit-learn's implementation of AdaBoost, the default max_depth of the base DecisionTreeClassifier is also 1.
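
As a conceptual sketch of the adaptive loss choice (our paraphrase of the paper, not the library's internal code): each round, FlexBoost compares the exponential loss exp(-k·m) on the margins m = y·f(x) for a conservative (k=K), standard (k=1), and aggressive (k=1/K) setting and keeps the most favorable one, which is why all choices coincide with AdaBoost's loss when K=1.

import numpy as np

# Conceptual sketch (paraphrase, not library code): among a conservative
# (K), standard (1) and aggressive (1/K) exponential loss, pick the one
# with the smallest training loss on the current margins.
def exp_loss(margins, k):
    return np.mean(np.exp(-k * margins))

K = 0.5
margins = np.array([0.8, -0.2, 0.5, 1.1])  # toy values of y * f(x)
candidates = {"conservative": K, "standard": 1.0, "aggressive": 1.0 / K}
best = min(candidates, key=lambda name: exp_loss(margins, candidates[name]))
# With K = 1 all candidates coincide, recovering AdaBoost's loss.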

RolexBoost

RolexBoostClassifier(
    n_estimators=100,
    n_features_per_subset=3,
    bootstrap_rate=0.75,
    K=0.5,
    criterion="gini",
    splitter="best",
    max_depth=1,
    min_samples_split=2,
    min_samples_leaf=1,
    min_weight_fraction_leaf=0.0,
    max_features=None,
    random_state=None,
    max_leaf_nodes=None,
    min_impurity_decrease=0.0,
    min_impurity_split=None,
    class_weight=None,
    presort="deprecated",
    ccp_alpha=0.0
)
  • n_estimators: number of base estimators
  • n_features_per_subset: number of features in each subset
  • bootstrap_rate: ratio of samples bootstrapped from the original dataset
  • K: controls the balance between "aggressive" and "conservative" choices in the adaptive loss function selection process. It should be a number between 0 and 1.

All other parameters are passed to the DecisionTreeClassifier of scikit-learn. Please refer to their documentation for details.

Note:

As with RotationForestClassifier, the paper's algorithm description derives the number of features per subset from a parameter controlling the number of subsets, while the validation part of the paper fixes "the number of features in each subset" to three. We therefore expose n_features_per_subset directly; see the Rotation Forest note above.

The default max_depth is 1 because RolexBoost integrates the idea of FlexBoost; see the FlexBoost section above for why its default max_depth is 1.
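
An illustrative (untuned) configuration combining the rotation parameters with the adaptive-loss parameter K:

>>> clf = RolexBoostClassifier(n_estimators=50, n_features_per_subset=3, bootstrap_rate=0.75, K=0.3)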

Performance Benchmarks

We have tested the three algorithms on 13 datasets mentioned in the paper.

Here is the result:

algorithm        accuracy   benchmark   ratio
RotationForest   0.7898     0.7947      0.9938
FlexBoost        0.7976     0.8095      0.9853
RolexBoost       0.7775     0.8167      0.9520
  • accuracy refers to the average accuracy of our implementation
  • benchmark refers to the average accuracy reported in the paper
  • ratio is accuracy/benchmark

For per-dataset results for each algorithm, please run tests/accuracy-test.py. The test may take about an hour to finish.

Some datasets reported in the paper are not involved in the benchmark testing for the following two reasons:

  1. We could not find the corresponding datasets in the UCI Machine Learning Repository.
  2. In the paper, each 3-class problem is divided into three 2-class problems, and we are not sure how this division was done.
