A library for stacking, based on scikit-learn

These details have not been verified by PyPI

Project links

Homepage

Intended Audience
- Developers
License
- OSI Approved :: MIT License
Natural Language
- English
Programming Language

Project description

SKNet

Introduction

SKNet is a new type of neural network that is simple in structure but complex in its neurons. Each neuron is a traditional estimator such as an SVM or a random forest (RF).

Features

We think that such a network has many applicable scenarios. For example:

We don't have enough samples to train neural networks.
We hope to improve the accuracy of the model through ensembling.
We hope to learn some new features.
We want to save a lot of parameter-tuning time while getting a stable and good model.

Installation

pip install sknet

Example

Computation Graph

Code

from sknet.sequential import Layer, Sequential, SKNeuron

from sklearn.ensemble import RandomForestRegressor
from sklearn.ensemble import AdaBoostRegressor
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.svm import LinearSVR
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.neighbors import KNeighborsRegressor


from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split


# Load a binary classification dataset
data = load_breast_cancer()
features = data.data
target = data.target

X_train, X_test, y_train, y_test = train_test_split(features, target, test_size=0.2, random_state=42)


# First level: five different regressors, each wrapped in an SKNeuron
layer1 = Layer([
    SKNeuron(RandomForestRegressor, params={"random_state": 0}),
    SKNeuron(GradientBoostingRegressor, params={"random_state": 0}),
    SKNeuron(AdaBoostRegressor, params={"random_state": 0}),
    SKNeuron(KNeighborsRegressor),
    SKNeuron(ExtraTreesRegressor, params={"random_state": 0}),
])

# Second level: two regressors
layer2 = Layer([
    SKNeuron(AdaBoostRegressor, params={"random_state": 0}),
    SKNeuron(LinearSVR, params={"random_state": 0}),
])

# Final level: a single classifier
layer3 = Layer([
    SKNeuron(LogisticRegression, params={"random_state": 0}),
])


# Stack the three layers, then fit on the training set and predict the test set
model = Sequential([layer1, layer2, layer3], n_splits=5)
y_pred = model.fit_predict(X_train, y_train, X_test)
print(model.score(y_test, y_pred))


# acc = 0.9736842105263158
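
The `n_splits` argument suggests that each layer is trained with K-fold cross-validation, as is common in stacking. The sketch below is not SKNet's implementation. It is a minimal illustration of out-of-fold stacking written with scikit-learn only, under that assumption. The helper name `oof_layer` and the choice of two first-level estimators are hypothetical.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, train_test_split
from sklearn.neighbors import KNeighborsRegressor


def oof_layer(estimators, X_train, y_train, X_test, n_splits=5):
    """Hypothetical helper: build meta-features for one stacking layer.

    Train rows get out-of-fold predictions; test rows get the average
    prediction of the models fitted on each fold.
    """
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=0)
    train_meta = np.zeros((X_train.shape[0], len(estimators)))
    test_meta = np.zeros((X_test.shape[0], len(estimators)))
    for j, est in enumerate(estimators):
        for fit_idx, hold_idx in kf.split(X_train):
            model = clone(est).fit(X_train[fit_idx], y_train[fit_idx])
            # Predict only rows the model has not seen during fitting
            train_meta[hold_idx, j] = model.predict(X_train[hold_idx])
            test_meta[:, j] += model.predict(X_test) / n_splits
    return train_meta, test_meta


X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

first_level = [
    RandomForestRegressor(n_estimators=50, random_state=0),
    KNeighborsRegressor(),
]
train_meta, test_meta = oof_layer(first_level, X_train, y_train, X_test)

# The final estimator is trained on the meta-features only
final = LogisticRegression().fit(train_meta, y_train)
accuracy = final.score(test_meta, y_test)
print(accuracy)
```

Using out-of-fold predictions keeps the next level from training on predictions that a model made for rows it was fitted on.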

How to construct an SKNet

General Considerations (these need not be followed strictly)

Introduce more information at the first level
Use simpler models at subsequent levels

How to introduce more information?

Use different estimators
Use the same estimator with different parameters (such as the random seed, ...)
Sample the training rows
Sample the features
Use different feature engineering techniques (one-hot, embedding, ...)
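
As a rough illustration of the second, third, and fourth points, the sketch below fits the same estimator three times, each with a different random seed, a different row sample, and a different feature subset. It uses scikit-learn and NumPy directly rather than SKNet, and the 80% row fraction and half-of-the-columns subset are arbitrary assumptions.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import ExtraTreesRegressor

X, y = load_breast_cancer(return_X_y=True)
rng = np.random.default_rng(0)

variants = []
for seed in range(3):
    # Sample 80% of the rows and half of the columns, without replacement
    rows = rng.choice(X.shape[0], size=int(0.8 * X.shape[0]), replace=False)
    cols = rng.choice(X.shape[1], size=X.shape[1] // 2, replace=False)
    model = ExtraTreesRegressor(n_estimators=20, random_state=seed)
    model.fit(X[np.ix_(rows, cols)], y[rows])
    # Remember the column subset so prediction uses the same features
    variants.append((model, cols))

# One column of predictions per variant
preds = np.column_stack([m.predict(X[:, cols]) for m, cols in variants])
print(preds.shape)
```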

First-level tips

Diversity based on algorithm
- 2-3 gradient-boosted tree models (XGBoost, H2O, CatBoost)
- 2-3 neural nets (Keras, PyTorch)
- 1-2 ExtraTrees / random forest models (sklearn)
- 1-2 linear models such as logistic/ridge regression or linear SVM (sklearn)
- 1-2 KNN models (sklearn)
- 1 factorization machine (libFM)
- 1 SVM with a nonlinear kernel, if size/memory allows (sklearn)
Diversity based on input data
- Categorical features: one-hot, label encoding, target encoding, frequency encoding
- Numerical features: outliers, binning, derivatives, percentiles, scaling
- Interactions: col1 {*, /, +, -} col2, group-by, unsupervised
- For a classification target, regression models can be used in the middle levels
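
The four categorical encodings named above can be sketched with pandas as follows. The toy DataFrame and the column names `city` and `target` are made up for illustration. The target encoding here is the naive in-sample version; in practice it is usually computed out of fold to avoid leaking the target.

```python
import pandas as pd

df = pd.DataFrame({
    "city": ["a", "b", "a", "c", "b", "a"],
    "target": [1, 0, 1, 0, 1, 0],
})

# One-hot: one indicator column per category
one_hot = pd.get_dummies(df["city"], prefix="city")

# Label encoding: an integer code per category (alphabetical order here)
label = df["city"].astype("category").cat.codes

# Frequency encoding: the share of rows belonging to each category
frequency = df["city"].map(df["city"].value_counts(normalize=True))

# Naive target encoding: the mean target per category
target_mean = df["city"].map(df.groupby("city")["target"].mean())

print(pd.concat(
    [one_hot, label.rename("label"), frequency.rename("freq"),
     target_mean.rename("target_mean")],
    axis=1,
))
```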

Subsequent-level tips

Simpler (or shallower) algorithms
- Gradient-boosted trees with small depth (2 or 3)
- Linear models with strong regularization
- Extra Trees
- Shallow networks
- KNN with Bray-Curtis distance
- Brute-force search for the best linear weights based on CV
Feature engineering
- Pairwise differences between meta-features
- Row-wise statistics such as averages or standard deviations
- Standard feature selection techniques
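
Two of these tips are easy to sketch with NumPy: expanding a meta-feature matrix with pairwise differences and row-wise statistics, and a brute-force grid search for blending weights. The function names, the toy data, and the 0.25 grid step are assumptions made for illustration. The weight search is only CV-based if the meta-features passed in are out-of-fold predictions.

```python
from itertools import combinations, product

import numpy as np


def expand_meta_features(meta):
    """Append pairwise differences and row-wise mean/std to the meta-features."""
    meta = np.asarray(meta, dtype=float)
    pairs = combinations(range(meta.shape[1]), 2)
    diffs = [meta[:, i] - meta[:, j] for i, j in pairs]
    stats = [meta.mean(axis=1), meta.std(axis=1)]
    return np.column_stack([meta] + diffs + stats)


def search_blend_weights(meta, y, step=0.25):
    """Try every non-negative weight vector on the grid that sums to 1
    and keep the one with the lowest mean squared error."""
    grid = np.arange(0.0, 1.0 + 1e-9, step)
    best_w, best_err = None, np.inf
    for w in product(grid, repeat=meta.shape[1]):
        if abs(sum(w) - 1.0) > 1e-9:
            continue
        err = np.mean((meta @ np.array(w) - y) ** 2)
        if err < best_err:
            best_w, best_err = np.array(w), err
    return best_w, best_err


# Toy meta-features: three models' predictions for three samples
meta = np.array([
    [0.9, 0.7, 0.8],
    [0.2, 0.4, 0.1],
    [0.6, 0.5, 0.9],
])
y = np.array([1.0, 0.0, 1.0])

expanded = expand_meta_features(meta)
weights, error = search_blend_weights(meta, y)
print(expanded.shape, weights, error)
```

The grid grows exponentially with the number of models, so this kind of search is only practical for a handful of meta-features.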

Todo

Two or three level stacking
multi-processing
features proxy

Project details

Release history

This version

0.0.2

Nov 13, 2019

Download files

Source Distribution

DeerNet-0.0.2.tar.gz (4.3 kB view details)

Uploaded Nov 13, 2019 Source

File details

Details for the file DeerNet-0.0.2.tar.gz.

File metadata

Download URL: DeerNet-0.0.2.tar.gz
Upload date: Nov 13, 2019
Size: 4.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/41.4.0 requests-toolbelt/0.9.1 tqdm/4.34.0 CPython/3.7.3

File hashes

Hashes for DeerNet-0.0.2.tar.gz
Algorithm	Hash digest
SHA256	`a66a676c5a7bba02a021cde4d0f3b93b0d752237a2cdfad5141973b47810dfcc`
MD5	`3599d52a3fdcd9a0f2993bd341b4aadf`
BLAKE2b-256	`2bb7cd51c4098e3fc31e345824f599414aad133817dbfb441cee69a6c8fdc2cd`

DeerNet 0.0.2
