
A couple of survival evaluation metrics.


Survival Evaluation

What is this?

A Python package implementing the evaluation metrics found in the paper Effective Ways To Build and Evaluate Individual Survival Distributions by Haider et al. Currently the package supports only the L1-Hinge, L1-Margin, One-Calibration, and D-Calibration evaluation metrics. Future iterations will likely include Concordance and the Brier Score. Note that this package is for evaluation only: all models must be fit and all predictions made before using the functions found here. Below is an outline of how to use each evaluation metric. Note that the required input differs from metric to metric.

L1-Hinge and L1-Margin

These evaluation functions exist but aren't necessarily recommended, as reducing an entire survival distribution to a single point throws away a lot of information. For more discussion of these metrics, please refer to Haider et al.

Below we use a dataset from lifelines and build a Cox proportional hazards model. We then use the predict_expectation function to get the expected survival time for use in the L1-Hinge and L1-Margin calculations. Note that the L1-Margin function also requires the training set, because the Kaplan-Meier curve must be built from data that is not part of the evaluation dataset.

from lifelines import CoxPHFitter
from lifelines.datasets import load_rossi

import numpy as np
from survival_evaluation import d_calibration, l1, one_calibration

rossi = load_rossi()

# Mix up the dataframe and split it into train and test
np.random.seed(42)
rossi = rossi.sample(frac=1.0)

train = rossi.iloc[:300,:]
test = rossi.iloc[300:,:]

cph = CoxPHFitter()
cph.fit(train, duration_col='week', event_col='arrest')

# Get the expected survival time (in weeks).
survival_predictions = cph.predict_expectation(test)
print(l1(test.week, test.arrest, survival_predictions, l1_type='hinge'))
# 10.966760923375032

# Margin requires learning the Kaplan-Meier curve from a training set, so we must supply that data here.
print(l1(test.week, test.arrest, survival_predictions, train.week, train.arrest, l1_type='margin'))
# 76.63355163268436
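To make the hinge variant concrete, here is a minimal sketch of the idea as described in Haider et al.: an uncensored observation contributes its absolute error, while a censored observation is penalized only when the prediction falls before the censoring time. The function name `l1_hinge_sketch` is hypothetical, and this is not the package's implementation; details such as how the losses are averaged are assumptions and may differ.

```python
def l1_hinge_sketch(event_times, event_indicators, predictions):
    """Mean hinge-style L1 loss (illustrative only, not the package's code)."""
    losses = []
    for time, event, predicted in zip(event_times, event_indicators, predictions):
        if event:
            # Uncensored: the true event time is known, so use the absolute error.
            losses.append(abs(time - predicted))
        else:
            # Censored: we only know the event happened after `time`,
            # so only a prediction earlier than `time` is penalized.
            losses.append(max(0.0, time - predicted))
    return sum(losses) / len(losses)


example_loss = l1_hinge_sketch(
    event_times=[10, 20, 30],
    event_indicators=[1, 0, 0],
    predictions=[14, 25, 22],
)
print(example_loss)
```

In this toy input the second observation is censored at 20 with a prediction of 25, so it contributes no loss, which is exactly the information the hinge discards.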

One-Calibration

One-Calibration requires the survival probability at a specific time point, so we use the predict_survival_function function from lifelines and specify a single time (here, week 25). The p-value (0.095) suggests there is not enough evidence to conclude that the model is not one-calibrated (a p-value below 0.05 would suggest the model is not one-calibrated). The output also includes the observed and expected probabilities for each bin (see Figure 7 in Haider et al.). Note that there is an error in the paper's text: it claims the plotted values are Oj and njpj, but they are actually Oj/nj and pj.

survival_probabilities = cph.predict_survival_function(test, times=25)
print(one_calibration(test.week, test.arrest, survival_probabilities.iloc[0, :], time=25))
# {
# 'p_value': 0.0952421895263924,
# 'observed': [0.2857142857142857, 0.1428571428571428, 0.23076923076923073, 0.07692307692307687, 0.07692307692307687, 0.0, 0.15384615384615385, 0.07692307692307687, 0.07692307692307687, 0.15384615384615385],
# 'expected': [0.23576428567246505, 0.17272437148984593, 0.14490381662501145, 0.11879320161654464, 0.1033105922422987, 0.09351181767028925, 0.07903632219123935, 0.06483407664319364, 0.042694369100777056, 0.02164098605624182]
# }
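The observed and expected lists can be read as a per-bin comparison. As a rough sketch of that idea (not the package's implementation), the code below sorts observations by predicted event probability, taken here to be one minus the survival probability, then splits them into bins and compares the observed event fraction in each bin with the mean predicted probability. The function name `calibration_bins_sketch` is hypothetical, and the sketch assumes no observation is censored before the time of interest, which sidesteps the censoring handling a real implementation needs.

```python
def calibration_bins_sketch(event_probabilities, had_event_by_t, bins=2):
    """Compare observed and expected event rates per bin (illustrative only).

    Assumes no observation is censored before the time of interest.
    """
    # Sort observations from highest to lowest predicted event probability.
    paired = sorted(zip(event_probabilities, had_event_by_t), reverse=True)
    bin_size = len(paired) // bins
    observed, expected = [], []
    for b in range(bins):
        start = b * bin_size
        # The last bin absorbs any remainder.
        end = len(paired) if b == bins - 1 else start + bin_size
        chunk = paired[start:end]
        expected.append(sum(p for p, _ in chunk) / len(chunk))
        observed.append(sum(e for _, e in chunk) / len(chunk))
    return observed, expected


survival_at_t = [0.875, 0.5, 0.75, 0.25]
event_probs = [1 - s for s in survival_at_t]
events = [0, 1, 0, 1]
observed, expected = calibration_bins_sketch(event_probs, events, bins=2)
print(observed, expected)
```

A well-calibrated model would show observed and expected values that track each other closely across bins.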

D-Calibration

D-Calibration requires the survival probability at the event time (or censoring time) of each observation. To get this we use the predict_survival_function function from lifelines, and it gets a bit ugly. We want the value of each predicted survival curve at the time the observation either had its event or was censored, so we use a list comprehension over the test dataset to grab each prediction. There is probably a much better way to do this, so I encourage you to figure that out and then put in a pull request! The p-value from D-Calibration is 0.989, which suggests there is not enough evidence to conclude that the model is not d-calibrated (a p-value below 0.05 would suggest the model is not d-calibrated).

Additionally, the proportions for each bin are included. Interpreting these, we see the first bin (survival probabilities in [0.0, 0.1)) has a proportion of 0.102 (see Figures 9 and 15 in Haider et al.). This is further broken out into the uncensored and censored contributions; see the D-Calibration section of Haider et al. for details.

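# For each test row, evaluate its predicted survival curve at that row's own event (or censoring) time.
survival_probabilities = [
    cph.predict_survival_function(row, times=row.week).to_numpy()[0][0]
    for _, row in test.iterrows()
]
print(d_calibration(test.arrest, survival_probabilities))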
#{
# 'p_value': 0.9899589161837578,
# 'bin_proportions': array([0.10247572, 0.10247572, 0.10247572, 0.10247572, 0.09956549, 0.10563996, 0.10612984, 0.0952593 , 0.06595531, 0.11754722]),
# 'censored_contributions': array([0.10247572, 0.10247572, 0.10247572, 0.10247572, 0.09956549, 0.09048845, 0.07582681, 0.04980476, 0.02050077, 0.00391086]),
# 'uncensored_contributions': array([0.        , 0.        , 0.        , 0.        , 0.      ,0.01515152, 0.03030303, 0.04545455, 0.04545455, 0.11363636])
# }
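To illustrate how the censored and uncensored contributions combine into the bin proportions, here is a minimal sketch of the binning step as described in Haider et al. An uncensored observation adds its full weight to the bin containing its survival probability. A censored observation with survival probability p spreads its weight over its own bin and every lower bin, in proportion to how much of each bin lies below p. The function name `d_calibration_bins_sketch` is hypothetical, the boundary conventions are assumptions, and the sketch omits the test that produces the p-value, so it is not the package's implementation.

```python
def d_calibration_bins_sketch(event_indicators, survival_probabilities, bins=10):
    """Bin proportions for D-Calibration (illustrative only)."""
    width = 1.0 / bins
    uncensored = [0.0] * bins
    censored = [0.0] * bins
    for event, prob in zip(event_indicators, survival_probabilities):
        # Index of the bin containing prob; a probability of 1.0 goes in the top bin.
        index = min(int(prob * bins), bins - 1)
        if event:
            uncensored[index] += 1.0
        elif prob > 0:
            # Share of the weight falling in the observation's own bin.
            censored[index] += (prob - index * width) / prob
            # Each bin entirely below prob receives an equal share.
            for lower in range(index):
                censored[lower] += width / prob
        else:
            # Censored with zero survival probability: all weight in the lowest bin.
            censored[0] += 1.0
    n = len(event_indicators)
    censored = [c / n for c in censored]
    uncensored = [u / n for u in uncensored]
    return {
        "bin_proportions": [c + u for c, u in zip(censored, uncensored)],
        "censored_contributions": censored,
        "uncensored_contributions": uncensored,
    }


result = d_calibration_bins_sketch([1, 0], [0.875, 0.5], bins=4)
print(result)
```

With four bins, the censored observation at probability 0.5 splits its weight evenly between the two bins below 0.5, while the uncensored observation lands entirely in the top bin.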
