A Python package implementing the SurvLIME algorithm
SurvLIMEpy
SurvLIMEpy implements the SurvLIME algorithm (Survival Local Interpretable Model-agnostic Explanation), a local interpretability algorithm for survival analysis proposed by Kovalev, Utkin and Kasimov (2020); see References.
The publication in which we introduce this package will soon be available.
Install
SurvLIMEpy can be installed from PyPI:
pip install survlimepy
How to use
from survlimepy import SurvLimeExplainer
from survlimepy.load_datasets import Loader
from sksurv.linear_model import CoxPHSurvivalAnalysis
# Load the dataset
loader = Loader(dataset_name='veterans')
X, events, times = loader.load_data()
# Train a model
train, test = loader.preprocess_datasets(X, events, times)
model = CoxPHSurvivalAnalysis()
model.fit(train[0], train[1])
# Use SurvLimeExplainer class to find the feature importance
training_features = train[0]
training_events = [event for event, _ in train[1]]
training_times = [time for _, time in train[1]]
explainer = SurvLimeExplainer(
    training_features=training_features,
    training_events=training_events,
    training_times=training_times,
    model_output_times=model.event_times_,
)
# explanation variable will have the computed SurvLIME values
explanation = explainer.explain_instance(
    data_row=test[0].iloc[0],
    predict_fn=model.predict_cumulative_hazard_function,
    num_samples=1000,
)
print(explanation)
# Display the weights
explainer.plot_weights()
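The example above separates events and times with list comprehensions over `train[1]`. As a minimal sketch, assuming the labels are stored as a NumPy structured array (the representation scikit-survival uses), the same split can be done by field access. The field names `event` and `time` and the values below are made up for illustration; the actual field names returned by `preprocess_datasets` may differ.

```python
import numpy as np

# Hypothetical survival labels: one (event, time) record per individual.
# The field names "event" and "time" are assumptions made for this sketch.
y = np.array(
    [(True, 72.0), (False, 411.0), (True, 228.0)],
    dtype=[("event", bool), ("time", float)],
)

# Field access returns one array per column, with no Python-level loop.
training_events = y["event"]
training_times = y["time"]

# The list comprehensions from the example give the same values.
events_list = [event for event, _ in y]
times_list = [time for _, time in y]
```

Both forms yield the same values; field access just keeps them as NumPy arrays.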
Model compatibility
SurvLIMEpy can handle many types of survival models, provided that prediction is exposed as a function that takes a vector of size $p$ (the number of features) and returns a vector of size $q \leq m$ (where $m$ is the number of unique event times). Most survival packages comply with this rule. Therefore, in addition to the Cox Proportional Hazards Model, implemented in the sksurv library, SurvLIMEpy also handles more recent algorithms: Random Survival Forest, implemented in sksurv; survival regression with the accelerated failure time model in XGBoost, implemented in xgbse; and DeepHit and DeepSurv, both implemented in pycox.
We chose to ensure that these algorithms integrate with SurvLIMEpy because they are the most predominant in the field. If a new survival package is developed, SurvLIMEpy will support it as long as the output of its predict function is a vector of length $q \leq m$. In this notebook there are several examples with different models.
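The compatibility rule can be illustrated with a toy prediction function written in plain NumPy. This is only a sketch of the input/output shapes described above, not part of SurvLIMEpy: the exponential model, the coefficients `beta`, the time grid `unique_times` and the function name `predict_cumulative_hazard` are all hypothetical. Whether the explainer calls the function with single rows or with batches is also an assumption here, so the sketch accepts both.

```python
import numpy as np

# Hypothetical setup: p = 3 features and m = 4 unique event times.
unique_times = np.array([5.0, 12.0, 30.0, 48.0])
beta = np.array([0.4, -0.2, 0.1])  # made-up coefficients


def predict_cumulative_hazard(X):
    """Toy exponential model: H(t | x) = t * exp(x . beta).

    Accepts one row of shape (p,) or a batch of shape (n, p) and
    returns an array of shape (n, q), here with q = m.
    """
    X = np.atleast_2d(np.asarray(X, dtype=float))
    risk = np.exp(X @ beta)              # shape (n,)
    return np.outer(risk, unique_times)  # shape (n, m)


x = np.array([1.0, 0.5, -2.0])
H = predict_cumulative_hazard(x)
print(H.shape)  # (1, 4)
```

A function with this shape contract would play the role of `predict_fn`, with `unique_times` playing the role of `model_output_times`.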
Citations
If you use this package, please cite us:
@article{pachon2023survlimepy,
  title={SurvLIMEpy: A Python package implementing SurvLIME},
  author={Pach{\'o}n-Garc{\'\i}a, Cristian and Hern{\'a}ndez-P{\'e}rez, Carlos and Delicado, Pedro and Vilaplana, Ver{\'o}nica},
  journal={arXiv preprint arXiv:2302.10571},
  year={2023}
}
References
Algorithms
SurvLIME: Maxim S. Kovalev, Lev V. Utkin, & Ernest M. Kasimov (2020). SurvLIME: A method for explaining machine learning survival models. Knowledge-Based Systems, 203, 106164.
Cox Proportional Hazards Model: Cox, D. R. (1972). Regression models and life-tables. Journal of the Royal Statistical Society: Series B (Methodological), 34(2), 187–202.
Random Survival Forest: Ishwaran, H., Kogalur, U. B., Blackstone, E. H., & Lauer, M. S. (2008). Random survival forests. The Annals of Applied Statistics, 2(3), 841–860. doi:10.1214/08-AOAS169
Survival regression with accelerated failure time model in XGBoost: Barnwal, A., Cho, H., & Hocking, T. (2022). Survival Regression with Accelerated Failure Time Model in XGBoost. Journal of Computational and Graphical Statistics, 0(0), 1–11. doi:10.1080/10618600.2022.2067548
DeepHit: Lee, C., Zame, W., Yoon, J., & van der Schaar, M. (2018). DeepHit: A Deep Learning Approach to Survival Analysis With Competing Risks. Proceedings of the AAAI Conference on Artificial Intelligence, 32(1). doi:10.1609/aaai.v32i1.11842
DeepSurv: Katzman, J., Shaham, U., Cloninger, A., Bates, J., Jiang, T., & Kluger, Y. (2018). DeepSurv: Personalized treatment recommender system using a Cox proportional hazards deep neural network. BMC Medical Research Methodology, 18. doi:10.1186/s12874-018-0482-1
Datasets
PBC: Therneau, T.M., Grambsch, P.M. (2000). Expected Survival. In: Modeling Survival Data: Extending the Cox Model. Statistics for Biology and Health. Springer, New York, NY. https://doi.org/10.1007/978-1-4757-3294-8_10
Lung: Loprinzi CL. Laurie JA. Wieand HS. Krook JE. Novotny PJ. Kugler JW. Bartel J. Law M. Bateman M. Klatt NE. et al. Prospective evaluation of prognostic variables from patient-completed questionnaires. North Central Cancer Treatment Group. Journal of Clinical Oncology. 12(3):601-7, 1994.
UDCA: (1) T. M. Therneau and P. M. Grambsch, Modeling survival data: extending the Cox model. Springer, 2000; (2) K. D. Lindor, E. R. Dickson, W. P. Baldus, R. A. Jorgensen, J. Ludwig, P. A. Murtaugh, J. M. Harrison, R. H. Weisner, M. L. Anderson, S. M. Lange, G. LeSage, S. S. Rossi and A. F. Hofman. Ursodeoxycholic acid in the treatment of primary biliary cirrhosis. Gastroenterology, 106:1284-1290, 1994.
Veterans: J. D. Kalbfleisch and R. L. Prentice (1980), The Statistical Analysis of Failure Time Data. Wiley, New York.
Libraries
sksurv: Sebastian Pölsterl (2020). scikit-survival: A Library for Time-to-Event Analysis Built on Top of scikit-learn. Journal of Machine Learning Research, 21(212), 1-6.
xgbse: Davi Vieira, Gabriel Gimenez, Guilherme Marmerola, & Vitor Estima. (2020). XGBoost Survival Embeddings: improving statistical properties of XGBoost survival analysis implementation.
pycox: Håvard Kvamme, Ørnulf Borgan, and Ida Scheel. Time-to-event prediction with neural networks and Cox regression. Journal of Machine Learning Research, 20(129):1–30, 2019.