Survival analysis built on top of scikit-learn

Project description

scikit-survival is a Python module for survival analysis built on top of scikit-learn. It lets you perform survival analysis while using the tools of scikit-learn, e.g., for pre-processing or cross-validation.

About Survival Analysis

The objective in survival analysis (also referred to as time-to-event or reliability analysis) is to establish a connection between covariates and the time of an event. Survival analysis differs from traditional machine learning in that parts of the training data can only be partially observed: they are censored.

For instance, in a clinical study, patients are often monitored for a particular time period, and events occurring in that period are recorded. If a patient experiences an event, the exact time of the event can be recorded, and the patient’s record is uncensored. In contrast, right-censored records refer to patients who remained event-free during the study period; it is unknown whether an event occurred after the study ended. Consequently, survival analysis demands models that take this characteristic of such datasets into account.
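To make the idea of censoring concrete, the following is a minimal sketch in plain Python, not scikit-survival's own API or implementation. It encodes each patient as an (event observed, observed time) pair and computes a Kaplan-Meier estimate, a standard estimator that accounts for censored records. The function names `make_record` and `kaplan_meier` and the patient data are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple

Record = Tuple[bool, float]  # (event_observed, observed_time)


def make_record(event_time: Optional[float], follow_up: float) -> Record:
    """Build a possibly right-censored record for one patient.

    event_time is the true time of the event (None if it never occurs);
    follow_up is how long the patient was monitored (an assumption of
    this sketch: follow-up may differ between patients).
    """
    if event_time is not None and event_time <= follow_up:
        return (True, event_time)   # uncensored: exact event time is known
    return (False, follow_up)       # right-censored: event-free while monitored


def kaplan_meier(records: List[Record]) -> List[Tuple[float, float]]:
    """Return (time, estimated survival probability) at each event time."""
    at_risk = len(records)
    survival = 1.0
    curve = []
    for time in sorted({t for _, t in records}):
        events = sum(1 for e, t in records if t == time and e)
        censored = sum(1 for e, t in records if t == time and not e)
        if events > 0:
            survival *= 1.0 - events / at_risk
            curve.append((time, survival))
        # Censored patients leave the risk set without counting as events.
        at_risk -= events + censored
    return curve


# Hypothetical patients: (true event time, length of follow-up).
patients = [(2.0, 6.0), (None, 3.0), (4.0, 6.0), (9.0, 5.0), (6.0, 6.0)]
records = [make_record(event, follow_up) for event, follow_up in patients]
curve = kaplan_meier(records)

for time, probability in curve:
    print(f"t={time:.1f}  S(t)={probability:.3f}")
```

Note that the two censored patients still contribute information: they count as at risk until their follow-up ends, which is exactly what a method that ignores censoring would get wrong.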

Requirements

  • Python 3.5 or later

  • cvxpy

  • cvxopt

  • joblib

  • numexpr

  • numpy 1.12 or later

  • osqp

  • pandas 0.21 or later

  • scikit-learn 0.22 or 0.23

  • scipy 1.0 or later

  • C/C++ compiler

Installation

The easiest way to install scikit-survival is to use Anaconda by running:

conda install -c sebp scikit-survival

Alternatively, you can install scikit-survival from source following this guide.

Examples

Examples are available as Jupyter notebooks.

Help and Support

Documentation

Bug reports

  • If you encounter a problem, please submit a bug report.

Questions

  • For general theoretical or methodological questions on survival analysis, please use Cross Validated.

Contributing

New contributors are always welcome. Please have a look at the contributing guidelines to learn how to get started and to make sure your code complies with them.

References

Please cite the following papers if you are using scikit-survival.

1. Pölsterl, S., Navab, N., and Katouzian, A., Fast Training of Support Vector Machines for Survival Analysis. Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2015, Porto, Portugal, Lecture Notes in Computer Science, vol. 9285, pp. 243-259 (2015)

2. Pölsterl, S., Navab, N., and Katouzian, A., An Efficient Training Algorithm for Kernel Survival Support Vector Machines. 4th Workshop on Machine Learning in Life Sciences, 23 September 2016, Riva del Garda, Italy

3. Pölsterl, S., Gupta, P., Wang, L., Conjeti, S., Katouzian, A., and Navab, N., Heterogeneous ensembles for predicting survival of metastatic, castrate-resistant prostate cancer patients. F1000Research, vol. 5, no. 2676 (2016)


Source Distribution

scikit-survival-0.13.0.tar.gz (1.8 MB)
