A Python package implementing the SurvLIME algorithm

SurvLIMEpy



SurvLIMEpy implements the SurvLIME (Survival Local Interpretable Model-agnostic Explanation) algorithm, a local interpretability method for survival analysis proposed by Kovalev, Utkin, and Kasimov (2020).
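
As a rough sketch of the idea behind SurvLIME (the notation here is introduced for illustration and is simplified with respect to the published method): near a point of interest $x$, the cumulative hazard function $H(t \mid x)$ predicted by the black-box model is approximated by that of a Cox Proportional Hazards Model with baseline cumulative hazard $H_0(t)$ and coefficients $b$:

$$
H_{\text{cox}}(t \mid x) = H_0(t)\,\exp\left(b^{\top} x\right)
\quad\Longleftrightarrow\quad
\ln H_{\text{cox}}(t \mid x) = \ln H_0(t) + b^{\top} x .
$$

The coefficients $b$ are chosen so that, over perturbed neighbours $x_k$ of $x$ (weighted by their proximity $w_k$ to $x$) and over the event times $t_j$, the two log cumulative hazards agree as closely as possible:

$$
\min_{b} \; \sum_{k} w_k \sum_{j} \left( \ln H(t_j \mid x_k) - \ln H_0(t_j) - b^{\top} x_k \right)^2 .
$$

The objective in the original paper includes additional weighting terms that are omitted here. The fitted $b$ is then read as the local importance of each feature.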

The publication in which we introduce this package will soon be available.

Install

SurvLIMEpy can be installed from PyPI:

pip install survlimepy

How to use

from survlimepy import SurvLimeExplainer
from survlimepy.load_datasets import Loader
from sksurv.linear_model import CoxPHSurvivalAnalysis

# Load the dataset
loader = Loader(dataset_name='veterans')
X, events, times = loader.load_data()

# Train a model
train, test = loader.preprocess_datasets(X, events, times)
model = CoxPHSurvivalAnalysis()
model.fit(train[0], train[1])

# Use SurvLimeExplainer class to find the feature importance
training_features = train[0]
training_events = [event for event, _ in train[1]]
training_times = [time for _, time in train[1]]

explainer = SurvLimeExplainer(
    training_features=training_features,
    training_events=training_events,
    training_times=training_times,
    model_output_times=model.event_times_,
)

# explanation variable will have the computed SurvLIME values
explanation = explainer.explain_instance(
    data_row=test[0].iloc[0],
    predict_fn=model.predict_cumulative_hazard_function,
    num_samples=1000,
)
print(explanation)

# Display the weights
explainer.plot_weights()

Model compatibility

SurvLIMEpy can work with many types of survival models, provided that the prediction function takes a vector of size $p$ (the number of features) and returns a vector of size $q \leq m$ (where $m$ is the number of unique event times). Most survival packages comply with this rule. Therefore, besides the Cox Proportional Hazards Model, implemented in the sksurv library, SurvLIMEpy also supports more recent algorithms: Random Survival Forest (sksurv), survival regression with the accelerated failure time model in XGBoost (xgbse), and DeepHit and DeepSurv (both in pycox).

We ensured that these algorithms integrate with SurvLIMEpy because they are among the most widely used in the field. If a new survival package is developed, SurvLIMEpy will support it as long as its prediction function returns a vector of length $q \leq m$. This notebook contains several examples with different models.
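
The compatibility rule can be illustrated without any survival library. The following is a minimal sketch, not SurvLIMEpy's own code: `make_predict_fn`, `satisfies_contract`, and the toy numbers are hypothetical and introduced here only for illustration. It builds a Cox-style prediction function that maps a feature vector of size $p$ to a cumulative-hazard vector of size $q$, and checks that $q \leq m$.

```python
import math
from typing import Callable, List, Sequence


def make_predict_fn(
    coefficients: Sequence[float],
    baseline_cumhaz: Sequence[float],
) -> Callable[[Sequence[float]], List[float]]:
    """Return a toy Cox-style predictor: H(t_j | x) = H0(t_j) * exp(b . x)."""
    p = len(coefficients)

    def predict_fn(x: Sequence[float]) -> List[float]:
        if len(x) != p:
            raise ValueError(f"expected {p} features, got {len(x)}")
        # Relative risk of this individual with respect to the baseline.
        risk = math.exp(sum(b * v for b, v in zip(coefficients, x)))
        # One cumulative-hazard value per time point of the baseline.
        return [h0 * risk for h0 in baseline_cumhaz]

    return predict_fn


def satisfies_contract(predict_fn, x, unique_event_times) -> bool:
    """Check that the output length q is positive and does not exceed m."""
    output = predict_fn(x)
    return 0 < len(output) <= len(unique_event_times)


# Hypothetical toy data: m = 4 unique event times, p = 2 features.
unique_event_times = [5.0, 12.0, 30.0, 48.0]
baseline_cumhaz = [0.1, 0.25, 0.6, 1.0]  # one value per time point, so q = m
predict_fn = make_predict_fn(coefficients=[0.5, -0.2], baseline_cumhaz=baseline_cumhaz)

hazards = predict_fn([1.0, 2.0])
print(len(hazards), satisfies_contract(predict_fn, [1.0, 2.0], unique_event_times))
```

In practice, prediction functions from real libraries usually operate on a batch of rows rather than a single vector, so a thin wrapper of this kind may be needed to adapt a new model to the rule described above.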

Citations

If you use this package, please cite us:

@article{pachon2023survlimepy,
  title={SurvLIMEpy: A Python package implementing SurvLIME},
  author={Pach{\'o}n-Garc{\'\i}a, Cristian and Hern{\'a}ndez-P{\'e}rez, Carlos and Delicado, Pedro and Vilaplana, Ver{\'o}nica},
  journal={arXiv preprint arXiv:2302.10571},
  year={2023}
}

References

Algorithms

SurvLIME: Maxim S. Kovalev, Lev V. Utkin, & Ernest M. Kasimov (2020). SurvLIME: A method for explaining machine learning survival models. Knowledge-Based Systems, 203, 106164.

Cox Proportional Hazards Model: Cox, D. R. (1972). Regression models and life-tables. Journal of the Royal Statistical Society: Series B (Methodological), 34(2), 187–202.

Random Survival Forest: Ishwaran, H., Kogalur, U. B., Blackstone, E. H., & Lauer, M. S. (2008). Random survival forests. The Annals of Applied Statistics, 2(3), 841–860. doi:10.1214/08-AOAS169

Survival regression with accelerated failure time model in XGBoost: Barnwal, A., Cho, H., & Hocking, T. (2022). Survival Regression with Accelerated Failure Time Model in XGBoost. Journal of Computational and Graphical Statistics, 0(0), 1–11. doi:10.1080/10618600.2022.2067548

DeepHit: Lee, C., Zame, W., Yoon, J., & van der Schaar, M. (2018). DeepHit: A Deep Learning Approach to Survival Analysis With Competing Risks. Proceedings of the AAAI Conference on Artificial Intelligence, 32(1). doi:10.1609/aaai.v32i1.11842

DeepSurv: Katzman, J., Shaham, U., Cloninger, A., Bates, J., Jiang, T., & Kluger, Y. (2018). DeepSurv: Personalized treatment recommender system using a Cox proportional hazards deep neural network. BMC Medical Research Methodology, 18. doi:10.1186/s12874-018-0482-1

Datasets

PBC: Therneau, T.M., Grambsch, P.M. (2000). Expected Survival. In: Modeling Survival Data: Extending the Cox Model. Statistics for Biology and Health. Springer, New York, NY. https://doi.org/10.1007/978-1-4757-3294-8_10

Lung: Loprinzi CL. Laurie JA. Wieand HS. Krook JE. Novotny PJ. Kugler JW. Bartel J. Law M. Bateman M. Klatt NE. et al. Prospective evaluation of prognostic variables from patient-completed questionnaires. North Central Cancer Treatment Group. Journal of Clinical Oncology. 12(3):601-7, 1994.

UDCA: (1) T. M. Therneau and P. M. Grambsch, Modeling survival data: extending the Cox model. Springer, 2000; (2) K. D. Lindor, E. R. Dickson, W. P Baldus, R.A. Jorgensen, J. Ludwig, P. A. Murtaugh, J. M. Harrison, R. H. Weisner, M. L. Anderson, S. M. Lange, G. LeSage, S. S. Rossi and A. F. Hofman. Ursodeoxycholic acid in the treatment of primary biliary cirrhosis. Gastroenterology, 106:1284-1290, 1994.

Veterans: Kalbfleisch, J. D., & Prentice, R. L. (1980). The Statistical Analysis of Failure Time Data. Wiley, New York.

Libraries

sksurv: Sebastian Pölsterl (2020). scikit-survival: A Library for Time-to-Event Analysis Built on Top of scikit-learn. Journal of Machine Learning Research, 21(212), 1–6.

xgbse: Davi Vieira, Gabriel Gimenez, Guilherme Marmerola, & Vitor Estima. (2020). XGBoost Survival Embeddings: improving statistical properties of XGBoost survival analysis implementation.

pycox: Håvard Kvamme, Ørnulf Borgan, and Ida Scheel. Time-to-event prediction with neural networks and Cox regression. Journal of Machine Learning Research, 20(129):1–30, 2019.
