Ceteris Paribus python package
pyCeterisParibus
Python library for Ceteris Paribus Plots. See original R package: https://github.com/pbiecek/ceterisParibus
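For readers new to the idea, a Ceteris Paribus profile shows how a model's prediction for one observation changes when a single variable is varied and all other variables are held fixed. The sketch below illustrates that computation using only the standard library; `toy_model`, `ceteris_paribus_profile`, and the example values are hypothetical and are not part of this package's API.

```python
def toy_model(row):
    """Hypothetical stand-in for a fitted model: takes [age, bmi, children]."""
    age, bmi, children = row
    return 250.0 * age + 300.0 * bmi + 500.0 * children


def ceteris_paribus_profile(predict, observation, index, grid):
    """Return (value, prediction) pairs obtained by varying one variable.

    Only the variable at position `index` changes; all other values of
    `observation` stay fixed.
    """
    profile = []
    for value in grid:
        modified = list(observation)  # copy so the original is untouched
        modified[index] = value
        profile.append((value, predict(modified)))
    return profile


observation = [30, 25.0, 1]
bmi_grid = [20.0, 25.0, 30.0]
profile = ceteris_paribus_profile(toy_model, observation, index=1, grid=bmi_grid)
```

Plotting the pairs in `profile` against the grid gives the kind of curve this library draws for a real model.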
Setup
Tested on Python 3.5+
PyCeterisParibus is on PyPI. Simply run:
pip install pyCeterisParibus
or install the newest version from GitHub by executing:
pip install git+https://github.com/ModelOriented/pyCeterisParibus
or download the sources, enter the main directory and perform:
git clone https://github.com/ModelOriented/pyCeterisParibus.git
cd pyCeterisParibus
python setup.py install # (alternatively use pip install .)
Docs
Latest documentation is hosted here:
https://pyceterisparibus.readthedocs.io
To build the documentation locally:
cd docs
make html
and open _build/html/index.html
How to use Ceteris Paribus profiles?
Prepare data
import pandas as pd

df = pd.read_csv('../datasets/insurance.csv')
df = df[['age', 'bmi', 'children', 'charges']]
x = df.drop(['charges'], inplace=False, axis=1)
y = df['charges']
var_names = list(x.columns)
x = x.values
y = y.values
Train models
from sklearn import ensemble, svm
from sklearn.linear_model import LinearRegression

def linear_regression_model():
    linear_model = LinearRegression()
    linear_model.fit(x, y)
    # model, data, labels, variable_names
    return linear_model, x, y, var_names

def gradient_boosting_model():
    gb_model = ensemble.GradientBoostingRegressor(n_estimators=1000, random_state=42)
    gb_model.fit(x, y)
    return gb_model, x, y, var_names

def supported_vector_machines_model():
    svm_model = svm.SVR(C=0.01, gamma='scale', kernel='poly')
    svm_model.fit(x, y)
    return svm_model, x, y, var_names
Wrap models into explainer objects
from ceteris_paribus.explainer import explain

(linear_model, data, labels, variable_names) = linear_regression_model()
(gb_model, _, _, _) = gradient_boosting_model()
(svm_model, _, _, _) = supported_vector_machines_model()
explainer_linear = explain(linear_model, variable_names, data, y)
explainer_gb = explain(gb_model, variable_names, data, y)
explainer_svm = explain(svm_model, variable_names, data, y)
Single variable response
from ceteris_paribus.profiles import individual_variable_profile
from ceteris_paribus.plots.plots import plot
cp = individual_variable_profile(explainer_gb, x[0], y[0], variables={'bmi'})
plot(cp, show_residuals=True)
Local fit
from ceteris_paribus.select_data import select_neighbours
neighbours_x, neighbours_y = select_neighbours(x, x[0], y=y, n=15)
cp_2 = individual_variable_profile(explainer_gb,
                                   neighbours_x, neighbours_y)
plot(cp_2, show_residuals=True, selected_variables=["bmi"])
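Local fit works on the observations closest to the point of interest. As a rough illustration of that idea only, the sketch below picks the n nearest rows by Euclidean distance using the standard library. `nearest_rows` is a hypothetical helper, and the distance measure used by `select_neighbours` may differ.

```python
import math


def nearest_rows(rows, labels, target, n):
    """Return the n rows (and their labels) closest to `target`.

    Assumption: plain Euclidean distance on raw, unscaled values.
    """
    def distance(row):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(row, target)))

    # Indices of all rows, ordered from nearest to farthest.
    order = sorted(range(len(rows)), key=lambda i: distance(rows[i]))[:n]
    return [rows[i] for i in order], [labels[i] for i in order]


rows = [[30, 25.0], [31, 26.0], [60, 40.0], [29, 24.5]]
labels = [100.0, 110.0, 400.0, 95.0]
near_x, near_y = nearest_rows(rows, labels, target=[30, 25.0], n=2)
```

In practice the variables would usually be scaled first so that no single one dominates the distance.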
Average response
plot(cp_2, aggregate_profiles="mean", selected_variables=["age"])
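With `aggregate_profiles="mean"`, the individual profiles are summarised by an average curve. Conceptually this is a pointwise mean over profiles evaluated on a shared grid. The sketch below shows that calculation with made-up numbers and a hypothetical helper, `mean_profile`; it is not the package's implementation.

```python
from statistics import mean


def mean_profile(profiles):
    """Average several profiles point by point.

    Each profile is a list of predictions on the same grid of values.
    """
    return [mean(points) for points in zip(*profiles)]


# Three made-up profiles, each evaluated at the same three grid points.
profiles = [
    [10.0, 12.0, 14.0],
    [20.0, 22.0, 30.0],
    [30.0, 32.0, 34.0],
]
average = mean_profile(profiles)
```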
Many variables
cp_1 = individual_variable_profile(explainer_gb, x[0], y[0])
plot(cp_1, selected_variables=["bmi", "age", "children"])
Many models
cp_svm = individual_variable_profile(explainer_svm, x[0], y[0])
cp_linear = individual_variable_profile(explainer_linear, x[0], y[0])
plot(cp_1, cp_svm, cp_linear)
Model interactions
plot(cp_2, color="bmi")
Multiclass models (classification problem)
Prepare dataset and model
from sklearn import ensemble
from sklearn.datasets import load_iris

iris = load_iris()

def random_forest_classifier():
    rf_model = ensemble.RandomForestClassifier(n_estimators=100, random_state=42)
    rf_model.fit(iris['data'], iris['target'])
    return rf_model, iris['data'], iris['target'], iris['feature_names']
Wrap model into explainers
rf_model, iris_x, iris_y, iris_var_names = random_forest_classifier()
explainer_rf1 = explain(rf_model, iris_var_names, iris_x, iris_y,
                        predict_function=lambda X: rf_model.predict_proba(X)[:, 0], label=iris.target_names[0])
explainer_rf2 = explain(rf_model, iris_var_names, iris_x, iris_y,
                        predict_function=lambda X: rf_model.predict_proba(X)[:, 1], label=iris.target_names[1])
explainer_rf3 = explain(rf_model, iris_var_names, iris_x, iris_y,
                        predict_function=lambda X: rf_model.predict_proba(X)[:, 2], label=iris.target_names[2])
Calculate profiles and plot
cp_rf1 = individual_variable_profile(explainer_rf1, iris_x[0], iris_y[0])
cp_rf2 = individual_variable_profile(explainer_rf2, iris_x[0], iris_y[0])
cp_rf3 = individual_variable_profile(explainer_rf3, iris_x[0], iris_y[0])
plot(cp_rf1, cp_rf2, cp_rf3, selected_variables=['petal length (cm)', 'petal width (cm)', 'sepal length (cm)'])
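The three explainers differ only in the class column taken from `predict_proba`. If such functions are built in a loop, the column index has to be bound when each function is created, otherwise every function ends up using the last index. The sketch below shows one way to do that with a hypothetical stand-in model (`FakeClassifier`) so it runs without scikit-learn; `class_probability` is likewise an illustrative name.

```python
class FakeClassifier:
    """Hypothetical stand-in exposing a predict_proba-like method."""

    def predict_proba(self, rows):
        # Return the same made-up class probabilities for every row.
        return [[0.7, 0.2, 0.1] for _ in rows]


def class_probability(model, class_index):
    """Build a predict function returning the probability of one class."""
    def predict(rows):
        return [probs[class_index] for probs in model.predict_proba(rows)]
    return predict


model = FakeClassifier()
# One function per class; class_index is fixed inside each closure.
predict_functions = [class_probability(model, k) for k in range(3)]
rows = [[5.1, 3.5, 1.4, 0.2], [6.2, 2.9, 4.3, 1.3]]
per_class = [predict(rows) for predict in predict_functions]
```

Each such function could then be passed as `predict_function` to `explain`, in the same way as the lambdas above.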
Acknowledgments
Work on this package was financially supported by the ‘NCN Opus grant 2016/21/B/ST6/0217’.