Package for ensembling models together
Ensemblizer
ensemblizer is a small package that provides several ensemble methods.
ModelCollection is a simple way to aggregate models so that they can be trained together.
CatEnsemble takes a ModelCollection object (or the same input used to build one) plus an additional ensemble model. The ensemble model trains on the predicted probabilities (or just the predictions) of the ModelCollection and makes its own prediction. The output of each model can be weighted differently, and the original data can be stacked with the predictions when training the ensemble model. You can also choose between pretraining the ModelCollection object before training the ensemble model and training the ModelCollection object together with the ensemble model. This flexibility allows for more efficient hyperparameter tuning, albeit with much longer training and tuning times.
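To make the weighting and stacking idea concrete, here is a minimal NumPy sketch of how the input to an ensemble model could be assembled. It is not ensemblizer's implementation: the helper name build_meta_features is hypothetical, and it assumes that a weight simply scales a model's predicted probabilities and that the original data is placed in front of the predictions.

```python
import numpy as np


def build_meta_features(probas, weights=None, X=None):
    """Combine per-model class probabilities into one input matrix.

    Hypothetical helper for illustration only (not part of ensemblizer).
    probas  : dict mapping model name -> array of shape (n_samples, n_classes)
    weights : dict mapping model name -> scalar weight (missing names use 1.0)
    X       : optional original features to stack next to the predictions
    """
    weights = weights or {}
    # Assumption: a weight multiplies that model's probability columns.
    blocks = [
        np.asarray(p, dtype=float) * weights.get(name, 1.0)
        for name, p in probas.items()
    ]
    if X is not None:
        # Assumption: the original data goes in front of the predictions.
        blocks.insert(0, np.asarray(X, dtype=float))
    return np.hstack(blocks)


# Made-up probabilities for two samples from two models named as in the usage example
probas = {
    "log": np.array([[0.9, 0.1], [0.4, 0.6]]),
    "nb": np.array([[0.7, 0.3], [0.2, 0.8]]),
}
X = np.array([[1, 2, 3], [4, 5, 6]])

meta = build_meta_features(probas, weights={"log": 3}, X=X)
print(meta.shape)  # (2, 7): 3 original features + 2 columns per model
```

The resulting matrix is what a second-stage model (for example a KNeighborsClassifier) would be fit on in this sketch. Scaling the columns matters most for distance-based or regularized ensemble models.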
Hyperparameters can be tuned as with any other scikit-learn-style model. When setting parameters on the ensemble, a parameter of the form name__param sets the hyperparameter param of the model called name, whether that model is in the collection or is the ensemble model (the default name for the ensemble model is "ensemble"). A parameter of the form __name updates the weight of the model called name in the collection. This allows the weighting scheme of the ensemble to be tuned together with all other parameters using any scikit-learn tuning package.
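The naming convention can be illustrated with a short, dependency-free sketch. The function route_params below is hypothetical and only mirrors the rules described above; it is not how ensemblizer parses parameters internally.

```python
def route_params(params):
    """Split a flat parameter dict into per-model hyperparameters and weights.

    Hypothetical helper for illustration only (not part of ensemblizer).
    """
    hyperparams = {}  # model name -> {hyperparameter name: value}
    weights = {}      # model name -> weight
    for key, value in params.items():
        if key.startswith("__"):
            # '__name' sets the weight of the model called 'name'
            weights[key[2:]] = value
        elif "__" in key:
            # 'name__param' sets 'param' on the model called 'name'
            name, param = key.split("__", 1)
            hyperparams.setdefault(name, {})[param] = value
        else:
            raise ValueError(f"Unrecognized parameter key: {key!r}")
    return hyperparams, weights


hyperparams, weights = route_params(
    {"log__C": 15, "nb__alpha": 1, "ensemble__n_neighbors": 10, "__log": 3}
)
print(hyperparams)  # {'log': {'C': 15}, 'nb': {'alpha': 1}, 'ensemble': {'n_neighbors': 10}}
print(weights)      # {'log': 3}
```

Because weights and hyperparameters share one flat namespace, a single search space can cover both.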
Current Version is v0.04
This package is in its early stages, and further work is planned.
Installation
pip install ensemblizer
Usage
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score
from ensemblizer import ModelCollection, CatEnsemble
np.random.seed(0)
x_train = np.random.randint(0, 10, (80, 3))
x_test = np.random.randint(0, 10, (20, 3))
y_train = np.random.randint(0, 2, 80)
y_test = np.random.randint(0, 2, 20)
models = ModelCollection([('log', LogisticRegression(random_state=0)),('nb', MultinomialNB())])
models.fit(x_train, y_train)
ensemble = CatEnsemble(models, KNeighborsClassifier())
ensemble.fit(x_train, y_train)
test_preds = ensemble.predict(x_test)
print(f"Accuracy on test set is {accuracy_score(y_test, test_preds)}")
# change the C param of the 'log' model to 15, the alpha param of the 'nb' model to 1,
# the n_neighbors param of the ensemble model to 10, and the weight of the 'log' model to 3
ensemble.set_params({'log__C': 15, 'nb__alpha': 1, 'ensemble__n_neighbors': 10, '__log': 3})
ensemble.fit(x_train, y_train)
test_preds = ensemble.predict(x_test)
print(f"Accuracy on test set is {accuracy_score(y_test, test_preds)}")
Future Plans
The next step is to create an ensemble regressor model.
License
Lol