Learning with Subset Stacking (LESS)

LESS is a supervised learning algorithm that is based on training many local estimators on subsets of a given dataset, and then passing their predictions to a global estimator. You can find the details about LESS in our manuscript.
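The idea of "local estimators on subsets, then a global estimator" can be illustrated with a rough sketch. This is not the library's implementation; the manuscript describes the actual method. The sketch assumes subsets built from the nearest neighbors of randomly chosen training points, local linear fits, radial basis function weights on Euclidean distances, and a linear least-squares global model chosen only to keep the example dependent on NumPy alone. The names `fit_sketch`, `predict_sketch`, `stacked_features`, `n_subsets`, `n_neighbors`, and `gamma` are hypothetical and are not part of the `less` API.

```python
import numpy as np


def stacked_features(X, centers, local_coefs, gamma):
    """Weight each local model's prediction by an RBF of the distance to its center."""
    A = np.column_stack([np.ones(len(X)), X])
    local_preds = A @ local_coefs.T  # shape: (n_samples, n_subsets)
    sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    # Subtract the row minimum before exponentiating to avoid underflow;
    # the shift cancels out after normalization.
    weights = np.exp(-gamma * (sq_dist - sq_dist.min(axis=1, keepdims=True)))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights * local_preds


def fit_sketch(X, y, n_subsets=8, n_neighbors=40, gamma=1.0, seed=0):
    """Fit local linear models on neighborhoods, then a global linear model on top."""
    rng = np.random.default_rng(seed)
    # Assumption: subset centers are random training points.
    centers = X[rng.choice(len(X), size=n_subsets, replace=False)]
    local_coefs = []
    for center in centers:
        # Each subset holds the points closest to its center.
        nearest = np.argsort(np.linalg.norm(X - center, axis=1))[:n_neighbors]
        A = np.column_stack([np.ones(len(nearest)), X[nearest]])
        coef, *_ = np.linalg.lstsq(A, y[nearest], rcond=None)
        local_coefs.append(coef)
    local_coefs = np.array(local_coefs)
    # The global model is trained on the weighted local predictions.
    Z = stacked_features(X, centers, local_coefs, gamma)
    G = np.column_stack([np.ones(len(Z)), Z])
    global_coef, *_ = np.linalg.lstsq(G, y, rcond=None)
    return {"centers": centers, "local_coefs": local_coefs,
            "global_coef": global_coef, "gamma": gamma}


def predict_sketch(model, X):
    """Predict by passing weighted local predictions through the global model."""
    Z = stacked_features(X, model["centers"], model["local_coefs"], model["gamma"])
    return model["global_coef"][0] + Z @ model["global_coef"][1:]


# Synthetic, nonlinear regression data (illustrative only)
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 2))
y = np.sin(2 * X[:, 0]) + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=200)

model = fit_sketch(X, y)
y_hat = predict_sketch(model, X)
mse = float(np.mean((y - y_hat) ** 2))
print(f"Training MSE of the sketch: {mse:.3f} (variance of y: {np.var(y):.3f})")
```

For real work, use LESSRegressor or LESSClassifier as shown in the usage example; the sketch only conveys how predictions from local models can feed a global model.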

Installation

pip install less-learn

or

conda install -c conda-forge less-learn

(see also conda-smithy repository)

Testing

Here is how you can use LESS:

from sklearn.datasets import make_regression, make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, accuracy_score
from less import LESSRegressor, LESSClassifier

### CLASSIFICATION ###

X, y = make_classification(n_samples=1000, n_features=20, n_classes=3,
                           n_clusters_per_class=2, n_informative=10, random_state=42)

# Train and test split
X_train, X_test, y_train, y_test = \
    train_test_split(X, y, test_size=0.3, random_state=42)

# LESS fit() & predict()
LESS_model = LESSClassifier(random_state=42)
LESS_model.fit(X_train, y_train)
y_pred = LESS_model.predict(X_test)
print('Test accuracy of LESS: {0:.2f}'.format(accuracy_score(y_test, y_pred)))


### REGRESSION ###

X, y = make_regression(n_samples=1000, n_features=20, random_state=42)

# Train and test split
X_train, X_test, y_train, y_test = \
    train_test_split(X, y, test_size=0.3, random_state=42)

# LESS fit() & predict()
LESS_model = LESSRegressor(random_state=42)
LESS_model.fit(X_train, y_train)
y_pred = LESS_model.predict(X_test)
print('Test error of LESS: {0:.2f}'.format(mean_squared_error(y_test, y_pred)))

Tutorials

Our two-part tutorial on Colab aims to familiarize you with LESS regression. To run the tutorials on your own computer, you also need to install the following additional packages: pandas, matplotlib, and seaborn.

Recommendation

The default implementation of LESS uses Euclidean distances with a radial basis function. Because Euclidean distances are dominated by features with large numeric ranges, it is a good idea to scale the input data before fitting. This can be done by setting the parameter scaling to True (the default value) or by preprocessing the data as follows:

from sklearn.preprocessing import StandardScaler

SC = StandardScaler()
X_train = SC.fit_transform(X_train)
X_test = SC.transform(X_test)
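To see why scaling matters for a distance-based method, the following sketch measures how much each feature contributes to squared Euclidean distances before and after standardization. The two-feature dataset is made up for illustration, and the manual standardization (zero mean, unit variance per feature) stands in for what StandardScaler does; the helper name `feature_share` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: two features on very different numeric scales
X_raw = np.column_stack([
    rng.normal(40, 10, 500),          # e.g. a feature measured in tens
    rng.normal(50_000, 15_000, 500),  # e.g. a feature measured in tens of thousands
])


def feature_share(X):
    """Fraction of the total squared pairwise distance contributed by each feature."""
    diff = X[:, None, :] - X[None, :, :]
    per_feature = (diff ** 2).sum(axis=(0, 1))
    return per_feature / per_feature.sum()


# Standardize each feature to zero mean and unit variance
X_scaled = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0)

raw_share = feature_share(X_raw)
scaled_share = feature_share(X_scaled)
print("Share of squared distance per feature (raw):   ", raw_share)
print("Share of squared distance per feature (scaled):", scaled_share)
```

Without scaling, the large-range feature accounts for nearly all of the distance, so any distance-based weighting would effectively ignore the other feature; after standardization both features contribute equally.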

Citation

Our software can be cited as:

  @misc{LESS,
    author = "Ilker Birbil",
    title = "LESS: LEarning with Subset Stacking",
    year = 2021,
    url = "https://github.com/sibirbil/LESS/"
  }

Parallel Version

NOTE: The parallel version of LESS has not been updated yet. Soon...

An openmpi implementation of LESS is also available in another repository.

Changes in v.0.2.0

  • Classification is added (LESSClassifier)
  • Scaling is done automatically by default (scaling = True)
  • The default global estimator for regression is now DecisionTreeRegressor instead of LinearRegression (global_estimator=DecisionTreeRegressor)
  • Warnings can be turned on or off with a flag (warnings = True)

Acknowledgments

We thank Oguz Albayrak for his help with structuring our Python scripts.
