Learning with Subset Stacking

Learning with Subset Stacking (LESS)

LESS is a supervised learning algorithm that trains many local estimators on subsets of a given dataset and then passes their predictions to a global estimator. You can find the details about LESS in our manuscript.
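To make the idea concrete, here is a minimal sketch of subset stacking for regression. It is not the less-learn implementation. It assumes nearest-neighbour subsets around random anchor points, local linear models, RBF weights with a hypothetical `gamma` parameter, and a least-squares global estimator. All function names are illustrative.

```python
import numpy as np

def _local_features(model, X):
    # Predictions of every local linear model, shape (n_samples, n_subsets).
    preds = np.c_[X, np.ones(len(X))] @ model["local_coefs"].T
    # Squared Euclidean distance from each sample to each subset centre.
    d2 = ((X[:, None, :] - model["centers"][None, :, :]) ** 2).sum(axis=2)
    # RBF weights; subtracting the row minimum avoids underflow to all zeros.
    w = np.exp(-model["gamma"] * (d2 - d2.min(axis=1, keepdims=True)))
    w /= w.sum(axis=1, keepdims=True)
    return preds * w

def fit_subset_stacking(X, y, n_subsets=10, n_neighbors=30, gamma=1.0, seed=0):
    """Illustrative subset stacking for regression (assumed design, see lead-in)."""
    rng = np.random.default_rng(seed)
    # Each subset consists of the nearest neighbours of a random anchor sample.
    anchors = X[rng.choice(len(X), size=n_subsets, replace=False)]
    centers, local_coefs = [], []
    for anchor in anchors:
        idx = np.argsort(np.linalg.norm(X - anchor, axis=1))[:n_neighbors]
        A = np.c_[X[idx], np.ones(len(idx))]  # local linear model with intercept
        coef, *_ = np.linalg.lstsq(A, y[idx], rcond=None)
        centers.append(X[idx].mean(axis=0))
        local_coefs.append(coef)
    model = {"centers": np.array(centers),
             "local_coefs": np.array(local_coefs),
             "gamma": gamma}
    # Global estimator: least squares on the weighted local predictions.
    Z = np.c_[_local_features(model, X), np.ones(len(X))]
    model["global_coef"], *_ = np.linalg.lstsq(Z, y, rcond=None)
    return model

def predict_subset_stacking(model, X):
    Z = np.c_[_local_features(model, X), np.ones(len(X))]
    return Z @ model["global_coef"]

# Small synthetic example
rng = np.random.default_rng(42)
X = rng.normal(size=(300, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2
model = fit_subset_stacking(X, y)
y_hat = predict_subset_stacking(model, X)
print("Training MSE: {0:.4f}".format(np.mean((y - y_hat) ** 2)))
```

The weighting step is what ties each local prediction to its own region of the input space: a local model contributes most for samples close to its subset.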

Installation

pip install less-learn

or

conda install -c conda-forge less-learn

(see also the conda-smithy repository)

Testing

Here is how you can use LESS:

from sklearn.datasets import make_regression, make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, accuracy_score
from less import LESSRegressor, LESSClassifier

### CLASSIFICATION ###

X, y = make_classification(n_samples=1000, n_features=20, n_classes=3,
                           n_clusters_per_class=2, n_informative=10, random_state=42)

# Train and test split
X_train, X_test, y_train, y_test = \
    train_test_split(X, y, test_size=0.3, random_state=42)

# LESS fit() & predict()
LESS_model = LESSClassifier(random_state=42)
LESS_model.fit(X_train, y_train)
y_pred = LESS_model.predict(X_test)
print('Test accuracy of LESS: {0:.2f}'.format(accuracy_score(y_test, y_pred)))


### REGRESSION ###

X, y = make_regression(n_samples=1000, n_features=20, random_state=42)

# Train and test split
X_train, X_test, y_train, y_test = \
    train_test_split(X, y, test_size=0.3, random_state=42)

# LESS fit() & predict()
LESS_model = LESSRegressor(random_state=42)
LESS_model.fit(X_train, y_train)
y_pred = LESS_model.predict(X_test)
print('Test error of LESS: {0:.2f}'.format(mean_squared_error(y_test, y_pred)))

Tutorials

Our two-part tutorial on Colab aims to familiarize you with LESS regression. To run the tutorials on your own computer, you also need to install the following additional packages: pandas, matplotlib, and seaborn.

Recommendation

The default implementation of LESS uses Euclidean distances with a radial basis function. Therefore, it is a good idea to scale the input data before fitting. This can be done by setting the parameter scaling in LESSRegressor or LESSClassifier to True (this is the default value) or by preprocessing the data as follows:

from sklearn.preprocessing import StandardScaler

SC = StandardScaler()
X_train = SC.fit_transform(X_train)
X_test = SC.transform(X_test)
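The effect of scaling on distance-based weights can be seen in a small standalone example. The sketch below uses only the standard library. The kernel form exp(-gamma * d^2), the value of `gamma`, and the toy data are assumptions made for illustration and are not taken from the LESS code.

```python
import math

def rbf_weight(a, b, gamma=0.5):
    """Radial basis function weight exp(-gamma * squared Euclidean distance)."""
    d2 = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * d2)

def standardize(rows):
    """Scale each column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

# Two features on very different scales (toy data)
data = [
    [1000.0, 0.1],
    [1010.0, 0.9],
    [2000.0, 0.1],
    [2010.0, 0.9],
]

# Weights between the first sample and two others, before scaling
raw_a = rbf_weight(data[0], data[1])
raw_b = rbf_weight(data[0], data[2])

# The same weights after standardizing the columns
scaled = standardize(data)
scaled_a = rbf_weight(scaled[0], scaled[1])
scaled_b = rbf_weight(scaled[0], scaled[2])

print("raw weights:   ", raw_a, raw_b)
print("scaled weights:", scaled_a, scaled_b)
```

On the raw data the first feature dominates the distance and both weights are numerically negligible, so the kernel carries no usable information. After standardization both features contribute and the weights fall in a comparable, non-degenerate range.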

Citation

Our software can be cited as:

  @misc{LESS,
    author = "Ilker Birbil",
    title = "LESS: LEarning with Subset Stacking",
    year = 2021,
    url = "https://github.com/sibirbil/LESS/"
  }

Parallel Version

An Open MPI implementation of LESS is also available in a separate repository.

Changes in v.0.2.0

  • Classification is added (LESSClassifier)
  • Scaling is now done automatically by default (scaling = True)
  • The default global estimator for regression is now DecisionTreeRegressor instead of LinearRegression (global_estimator=DecisionTreeRegressor)
  • Warnings can be turned on or off with a flag (warnings = True)

Changes in v.0.3.0

  • Typos are corrected
  • The hidden class for the binary classifier is now separate
  • Local subsets with a single class are handled (the case of ConstantPredictor)
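The single-class case in the last item can be illustrated with a short sketch. A classifier generally cannot be trained on a subset that contains only one label, so a constant model can take its place. The class `ConstantLabelModel` and the helper `fit_local_model` below are hypothetical stand-ins written for this illustration; they are not the ConstantPredictor used in the package.

```python
class ConstantLabelModel:
    """Hypothetical stand-in: always predicts the single label seen in fit()."""

    def fit(self, X, y):
        labels = set(y)
        if len(labels) != 1:
            raise ValueError("ConstantLabelModel expects exactly one class")
        self.label_ = next(iter(labels))
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def fit_local_model(X_subset, y_subset, make_classifier):
    """Fit the regular classifier unless the subset holds only one class."""
    if len(set(y_subset)) == 1:
        return ConstantLabelModel().fit(X_subset, y_subset)
    return make_classifier().fit(X_subset, y_subset)


# A local subset in which every sample has the same label
X_subset = [[0.1, 0.2], [0.3, 0.1], [0.2, 0.4]]
y_subset = [1, 1, 1]

# make_classifier is never called for a single-class subset, so None suffices here
local_model = fit_local_model(X_subset, y_subset, make_classifier=None)
predictions = local_model.predict([[0.5, 0.5], [0.0, 0.0]])
print(predictions)
```

In practice `make_classifier` would be any callable that returns an unfitted estimator with `fit` and `predict` methods.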

Acknowledgments

We thank Oguz Albayrak for his help with structuring our Python scripts.
