linear-tree
A python library to build Model Trees with Linear Models at the leaves.
Overview
Linear Trees combine the learning ability of Decision Trees with the predictive and explanatory power of Linear Models. As in tree-based algorithms, the data are split according to simple decision rules, but the goodness of a split is evaluated in terms of the gain obtained by fitting Linear Models in the resulting nodes. As a consequence, the models in the leaves are linear functions rather than the constant approximations used in classical Decision Trees.
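To make the difference concrete, here is a minimal sketch (not the library's internals) that compares the two kinds of leaves on piecewise-linear data: each child of a single split is fitted with either a least-squares line or a constant mean, and their squared errors are compared.

```python
import numpy as np

# Illustrative sketch: evaluate one split of a 1-D dataset by fitting a
# linear model in each child node, versus the constant (mean) prediction
# a classical decision tree would use.
rng = np.random.default_rng(0)
X = np.linspace(0, 10, 200)
y = np.where(X < 5, 2.0 * X, -1.0 * X + 15.0) + rng.normal(0, 0.1, X.size)

def linear_sse(x, t):
    """Sum of squared errors of a least-squares line fitted on (x, t)."""
    A = np.column_stack([x, np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(A, t, rcond=None)
    return float(np.sum((A @ coef - t) ** 2))

def constant_sse(t):
    """Sum of squared errors of the mean (constant) prediction."""
    return float(np.sum((t - t.mean()) ** 2))

split = 5.0
left, right = X < split, X >= split
sse_linear = linear_sse(X[left], y[left]) + linear_sse(X[right], y[right])
sse_const = constant_sse(y[left]) + constant_sse(y[right])

# Linear leaves fit the piecewise-linear data far better than constant leaves.
print(sse_linear < sse_const)  # True
```

On data like this, a classical tree would need many constant leaves to approximate each linear segment, while two linear leaves suffice.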
Linear Boosting, also available in the linear-tree package, is a two-stage learning process. First, a linear model is trained on the initial dataset to obtain predictions. Second, the residuals of the previous step are modeled with a decision tree using all the available features. The tree identifies the path leading to the highest error (i.e. the worst leaf), and the leaf contributing most to the error is used to generate a new binary feature for the first stage. The iterations continue until a stopping criterion is met.
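One iteration of this two-stage idea can be sketched with plain numpy (hypothetical helper names, not the library's API): fit a linear model, model its residuals with a depth-1 tree (a single threshold split), and turn the worst leaf into a new binary feature for the next linear fit.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(300, 1))
y = 3.0 * X[:, 0] + np.where(X[:, 0] > 6, 10.0, 0.0)  # linear trend + step

def fit_linear(X, y):
    """Least-squares fit with an intercept; returns (coefficients, predictions)."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef, A @ coef

# Stage 1: linear model on the original features.
_, pred = fit_linear(X, y)
resid = y - pred

# Stage 2: depth-1 tree on the residuals -- pick the threshold whose split
# minimises the summed squared error of per-leaf mean predictions.
def split_sse(t):
    return sum(((resid[m] - resid[m].mean()) ** 2).sum()
               for m in (X[:, 0] <= t, X[:, 0] > t))

thresholds = np.quantile(X[:, 0], np.linspace(0.05, 0.95, 19))
best = min(thresholds, key=split_sse)

# The leaf with the larger total squared error is the "worst" leaf; its
# indicator becomes a new binary feature appended for the next linear fit.
leaves = [X[:, 0] <= best, X[:, 0] > best]
worst = max(leaves, key=lambda m: ((resid[m] - resid[m].mean()) ** 2).sum())
X2 = np.column_stack([X, worst.astype(float)])

_, pred2 = fit_linear(X2, y)
print(np.sum((y - pred2) ** 2) < np.sum((y - pred) ** 2))  # True
```

The new binary feature lets the linear model absorb the step that no single hyperplane could capture, which is why the second fit has lower error.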
linear-tree is designed to be fully integrable with scikit-learn. LinearTreeRegressor and LinearTreeClassifier are provided as scikit-learn BaseEstimators: they are wrappers that build a decision tree on the data while fitting a linear estimator from sklearn.linear_model in the nodes. LinearBoostRegressor and LinearBoostClassifier are also available as TransformerMixins, so they can be integrated into any pipeline for automated feature engineering. Any model available in sklearn.linear_model can be used as the base learner.
Installation
pip install --upgrade linear-tree
The module depends on NumPy, SciPy and scikit-learn (>=0.23.0). Python 3.6 or above is supported.
Media
- Linear Tree: the perfect mix of Linear Model and Decision Tree
- Model Tree: handle Data Shifts mixing Linear Model and Decision Tree
- Explainable AI with Linear Trees
Usage
Linear Tree Regression
from sklearn.linear_model import LinearRegression
from lineartree import LinearTreeRegressor
from sklearn.datasets import make_regression
X, y = make_regression(n_samples=100, n_features=4,
                       n_informative=2, n_targets=1,
                       random_state=0, shuffle=False)
regr = LinearTreeRegressor(base_estimator=LinearRegression())
regr.fit(X, y)
Linear Tree Classification
from sklearn.linear_model import RidgeClassifier
from lineartree import LinearTreeClassifier
from sklearn.datasets import make_classification
X, y = make_classification(n_samples=100, n_features=4,
                           n_informative=2, n_redundant=0,
                           random_state=0, shuffle=False)
clf = LinearTreeClassifier(base_estimator=RidgeClassifier())
clf.fit(X, y)
Linear Boosting Regression
from sklearn.linear_model import LinearRegression
from lineartree import LinearBoostRegressor
from sklearn.datasets import make_regression
X, y = make_regression(n_samples=100, n_features=4,
                       n_informative=2, n_targets=1,
                       random_state=0, shuffle=False)
regr = LinearBoostRegressor(base_estimator=LinearRegression())
regr.fit(X, y)
Linear Boosting Classification
from sklearn.linear_model import RidgeClassifier
from lineartree import LinearBoostClassifier
from sklearn.datasets import make_classification
X, y = make_classification(n_samples=100, n_features=4,
                           n_informative=2, n_redundant=0,
                           random_state=0, shuffle=False)
clf = LinearBoostClassifier(base_estimator=RidgeClassifier())
clf.fit(X, y)
More examples are available in the notebooks folder.
Check the API Reference to see the parameter configurations and the available methods.
Examples
Show the linear tree learning path:
Linear Tree Regressor at work:
Linear Tree Classifier at work:
Extract and examine coefficients at the leaves:
Impact of the features automatically generated with linear boosting:
Hashes for linear_tree-0.2.0-py3-none-any.whl

Algorithm | Hash digest
---|---
SHA256 | c8cec8ced87c9aae4d8df7efc5f180e222e65f0c845821996302127ea9011417
MD5 | 819ed8106d22068f2c3cd9ef2af7b697
BLAKE2b-256 | 87d2a201411daa5d176bdd80f8e412cc84e7bbe0d37a4ed5bba188225521c69a