Survival analysis with PyTorch

Development Status
- 2 - Pre-Alpha
Intended Audience
- Developers
License
- OSI Approved :: BSD License
Natural Language
- English
Programming Language
- Python :: 3.6
- Python :: 3.7

Project description

pycox

Time-to-event prediction with PyTorch

Get Started • Methods • Evaluation Criteria • Datasets • Installation • References

pycox is a Python package for survival analysis and time-to-event prediction with PyTorch, built on the torchtuples package for training PyTorch models.

The package contains implementations of various survival models, some useful evaluation metrics, and a collection of event-time datasets. In addition, some useful preprocessing tools are available in the pycox.preprocessing module.

Get Started

To get started you first need to install PyTorch. You can then install pycox with

pip install pycox

We then recommend starting with the introduction notebook, which explains the general usage of the package in terms of preprocessing, creation of neural networks, model training, and evaluation. The notebook uses the LogisticHazard method for illustration, but most of the principles generalize to the other methods.

Alternatively, there are many examples listed in the examples folder.

Methods

The following methods are available in the pycox.methods module.

Continuous-Time Models:

Method	Description	Example
CoxTime	Cox-Time is a relative risk model that extends Cox regression beyond the proportional hazards assumption [1].	notebook
CoxCC	Cox-CC is a proportional version of the Cox-Time model [1].	notebook
CoxPH (DeepSurv)	CoxPH is a Cox proportional hazards model also referred to as DeepSurv [2].	notebook
PCHazard	The Piecewise Constant Hazard (PC-Hazard) model [12] assumes that the continuous-time hazard function is constant in predefined intervals. It is similar to the Piecewise Exponential Models [11] and PEANN [14], but with a softplus activation instead of the exponential function.	notebook
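
As a rough illustration of the PC-Hazard idea described in the table, the sketch below turns unconstrained per-interval scores into positive hazard rates with a softplus and integrates them to obtain a survival probability, S(t) = exp(-H(t)). The function names, the interval boundaries, and the scores are hypothetical, and the parametrization is a simplification that is not taken from the pycox source.

```python
import math

def softplus(x):
    """Numerically stable log(1 + exp(x)); the result is always positive."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def pc_hazard_survival(raw_outputs, cuts, t):
    """Survival S(t) = exp(-H(t)) under a piecewise constant hazard.

    raw_outputs[j] is an unconstrained score for the interval
    (cuts[j], cuts[j + 1]]; softplus maps it to a hazard rate.
    This is an illustrative parametrization, not pycox's own.
    """
    cum_hazard = 0.0
    for j, score in enumerate(raw_outputs):
        start, end = cuts[j], cuts[j + 1]
        if t <= start:
            break
        # Time spent inside this interval before reaching t.
        exposure = min(t, end) - start
        cum_hazard += softplus(score) * exposure
    return math.exp(-cum_hazard)

cuts = [0.0, 1.0, 2.0, 4.0]      # hypothetical interval boundaries
raw_outputs = [-1.0, 0.0, 0.5]   # hypothetical network outputs
surv = [pc_hazard_survival(raw_outputs, cuts, t) for t in (0.0, 0.5, 1.5, 4.0)]
```

Replacing `softplus` with `math.exp` in this sketch would give the exponential link that the table attributes to the piecewise exponential models.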

Discrete-Time Models:

Method	Description	Example
LogisticHazard (Nnet-survival)	The Logistic-Hazard method parametrizes the discrete hazards and optimizes the survival likelihood [12] [7]. It is also called Partial Logistic Regression [13] and Nnet-survival [8].	notebook
PMF	The PMF method parametrizes the probability mass function (PMF) and optimizes the survival likelihood [12]. It is the foundation of methods such as DeepHit and MTLR.	notebook
DeepHit, DeepHitSingle	DeepHit is a PMF method with a loss for improved ranking that can handle competing risks [3].	single competing
MTLR (N-MTLR)	The (Neural) Multi-Task Logistic Regression is a PMF method proposed by [9] and [10].	notebook
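
To make the relation between discrete hazards, the PMF, and the survival likelihood concrete, here is a minimal sketch in plain Python. It assumes one logit per time interval and a single individual whose observed time falls in interval `idx`; all function names are hypothetical and none of this is pycox code.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def hazards_to_survival(hazards):
    """S(t_j) = prod_{k <= j} (1 - h_k)."""
    surv, s = [], 1.0
    for h in hazards:
        s *= 1.0 - h
        surv.append(s)
    return surv

def hazards_to_pmf(hazards):
    """f(t_j) = h_j * S(t_{j-1}), with survival 1 before the first interval."""
    pmf, s_prev = [], 1.0
    for h in hazards:
        pmf.append(h * s_prev)
        s_prev *= 1.0 - h
    return pmf

def logistic_hazard_nll(logits, idx, event):
    """Negative log-likelihood of one individual.

    An event in interval idx contributes f(t_idx); a censored
    observation contributes S(t_idx).
    """
    hazards = [sigmoid(z) for z in logits]
    # The individual survived every interval before idx.
    log_lik = sum(math.log(1.0 - h) for h in hazards[:idx])
    h = hazards[idx]
    log_lik += math.log(h) if event else math.log(1.0 - h)
    return -log_lik

logits = [-2.0, -1.0, 0.0]   # hypothetical network outputs
hazards = [sigmoid(z) for z in logits]
surv = hazards_to_survival(hazards)
pmf = hazards_to_pmf(hazards)
```

The PMF entries plus the final survival probability sum to one, which is why a model can equivalently output hazards (Logistic-Hazard) or the PMF itself.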

Evaluation Criteria

The following evaluation metrics are available with pycox.evaluation.EvalSurv.

Metric	Description
concordance_td	The time-dependent concordance index evaluated at the event times [4].
brier_score	The IPCW Brier score (inverse probability of censoring weighted Brier score) [5][6].
nbll	The IPCW (negative) binomial log-likelihood [5][1]. I.e., this is minus the binomial log-likelihood and should not be confused with the negative binomial distribution.
integrated_brier_score	The integrated IPCW Brier score. Numerical integration of the `brier_score` [5][6].
integrated_nbll	The integrated IPCW (negative) binomial log-likelihood. Numerical integration of the `nbll` [5][1].
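
For intuition about the IPCW Brier score, the sketch below computes it at a single time point in plain Python. The function `ipcw_brier_score` and the `censor_surv` callable (an estimate of the censoring survival function) are hypothetical names for illustration; details such as whether the censoring estimate is evaluated at or just before an event time vary between references, and pycox's implementation may differ.

```python
def ipcw_brier_score(t, surv_at_t, durations, events, censor_surv):
    """IPCW Brier score at time t.

    surv_at_t[i]  : predicted survival probability S(t | x_i)
    durations[i]  : observed time (event or censoring)
    events[i]     : 1 for an event, 0 for censoring
    censor_surv(u): estimated probability of being uncensored beyond u
    """
    total = 0.0
    for s, d, e in zip(surv_at_t, durations, events):
        if d <= t and e == 1:
            # Event happened by t: the true status is 0, error is s^2.
            total += s ** 2 / censor_surv(d)
        elif d > t:
            # Still at risk at t: the true status is 1, error is (1 - s)^2.
            total += (1.0 - s) ** 2 / censor_surv(t)
        # Censored before t: status at t is unknown, so no contribution.
    return total / len(durations)

# Toy data without censoring, so the weights are all 1.
durations = [1.0, 3.0, 5.0]
events = [1, 1, 1]
predicted = [0.2, 0.7, 0.9]
score = ipcw_brier_score(2.0, predicted, durations, events, lambda u: 1.0)
```

With no censoring the weights are all one and the score reduces to the ordinary Brier score; integrating it over a time grid gives the integrated version listed in the table.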

Datasets

A collection of datasets is available through the pycox.datasets module. For example, the following code downloads the metabric dataset and loads it as a pandas dataframe:

from pycox import datasets
df = datasets.metabric.read_df()
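
Event-time data pairs each observed time with an indicator of whether the event was observed or the individual was censored. As a minimal sketch of how such pairs are used, the following plain-Python Kaplan-Meier estimator runs on toy values instead of a downloaded dataset. It assumes the times and indicators have been extracted into two lists (for instance from columns named `duration` and `event`; treat those names as an assumption), and `kaplan_meier` is a hypothetical helper, not part of pycox.

```python
def kaplan_meier(durations, events):
    """Return [(time, S(time))] at each distinct time with an event."""
    pairs = sorted(zip(durations, events))
    n_at_risk = len(pairs)
    surv, curve = 1.0, []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        n_events = n_tied = 0
        # Group all observations sharing the same time.
        while i < len(pairs) and pairs[i][0] == t:
            n_events += pairs[i][1]
            n_tied += 1
            i += 1
        if n_events > 0:
            surv *= 1.0 - n_events / n_at_risk
            curve.append((t, surv))
        # Both events and censored observations leave the risk set.
        n_at_risk -= n_tied
    return curve

durations = [2.0, 3.0, 3.0, 5.0, 8.0]   # toy observed times
events = [1, 0, 1, 1, 0]                # 1 = event, 0 = censored
km = kaplan_meier(durations, events)
```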

Real Datasets:

Dataset	Size	Description	Data source
flchain	6,524	The Assay of Serum Free Light Chain (FLCHAIN) dataset. See [1] for preprocessing.	source
gbsg	2,232	The Rotterdam & German Breast Cancer Study Group. See [2] for details.	source
kkbox_v1	2,646,746	A survival dataset created from the WSDM - KKBox's Churn Prediction Challenge 2017. See [1] for details. Note: You need Kaggle credentials to access the dataset.	source
metabric	1,904	The Molecular Taxonomy of Breast Cancer International Consortium (METABRIC). See [2] for details.	source
nwtco	4,028	Data from the National Wilm's Tumor Study (NWTCO).	source
support	8,873	Study to Understand Prognoses Preferences Outcomes and Risks of Treatment (SUPPORT). See [2] for details.	source

Simulated Datasets:

Dataset	Size	Description	Data source
rr_nl_nph	25,000	Dataset from simulation study in [1]. This is a continuous-time simulation study with event times drawn from a relative risk non-linear non-proportional hazards model (RRNLNPH).	SimStudyNonLinearNonPH
sac3	100,000	Dataset from simulation study in [12]. This is a discrete-time dataset with 1,000 possible event times.	SimStudySACCensorConst

Installation

Note: This package is still in its early stages of development, so please don't hesitate to report any problems you may experience.

The package only works with Python 3.6+.

Before installing pycox, please install PyTorch (version >= 1.1). You can then run the following command to install the package (consider adding --force-reinstall):

pip install git+https://github.com/havakv/pycox.git

Install from Source

Installation from source depends on PyTorch, so make sure it is installed. Next, clone and install with

git clone https://github.com/havakv/pycox.git
cd pycox
pip install .

References

[1] Håvard Kvamme, Ørnulf Borgan, and Ida Scheel. Time-to-event prediction with neural networks and Cox regression. Journal of Machine Learning Research, 20(129):1–30, 2019. [paper]

[2] Jared L. Katzman, Uri Shaham, Alexander Cloninger, Jonathan Bates, Tingting Jiang, and Yuval Kluger. DeepSurv: personalized treatment recommender system using a Cox proportional hazards deep neural network. BMC Medical Research Methodology, 18(1), 2018. [paper]

[3] Changhee Lee, William R Zame, Jinsung Yoon, and Mihaela van der Schaar. DeepHit: A deep learning approach to survival analysis with competing risks. In Thirty-Second AAAI Conference on Artificial Intelligence, 2018. [paper]

[4] Laura Antolini, Patrizia Boracchi, and Elia Biganzoli. A time-dependent discrimination index for survival data. Statistics in Medicine, 24(24):3927–3944, 2005. [paper]

[5] Erika Graf, Claudia Schmoor, Willi Sauerbrei, and Martin Schumacher. Assessment and comparison of prognostic classification schemes for survival data. Statistics in Medicine, 18(17-18):2529–2545, 1999. [paper]

[6] Thomas A. Gerds and Martin Schumacher. Consistent estimation of the expected Brier score in general survival models with right-censored event times. Biometrical Journal, 48(6):1029–1040, 2006. [paper]

[7] Charles C. Brown. On the use of indicator variables for studying the time-dependence of parameters in a response-time model. Biometrics, 31(4):863–872, 1975. [paper]

[8] Michael F. Gensheimer and Balasubramanian Narasimhan. A scalable discrete-time survival model for neural networks. PeerJ, 7:e6257, 2019. [paper]

[9] Chun-Nam Yu, Russell Greiner, Hsiu-Chin Lin, and Vickie Baracos. Learning patient-specific cancer survival distributions as a sequence of dependent regressors. In Advances in Neural Information Processing Systems 24, pages 1845–1853. Curran Associates, Inc., 2011. [paper]

[10] Stephane Fotso. Deep neural networks for survival analysis based on a multi-task framework. arXiv preprint arXiv:1801.05512, 2018. [paper]

[11] Michael Friedman. Piecewise exponential models for survival data with covariates. The Annals of Statistics, 10(1):101–113, 1982. [paper]

[12] Håvard Kvamme and Ørnulf Borgan. Continuous and discrete-time survival prediction with neural networks. arXiv preprint arXiv:1910.06724, 2019. [paper]

[13] Elia Biganzoli, Patrizia Boracchi, Luigi Mariani, and Ettore Marubini. Feed forward neural networks for the analysis of censored survival data: a partial logistic regression approach. Statistics in Medicine, 17(10):1169–1186, 1998. [paper]

[14] Marco Fornili, Federico Ambrogi, Patrizia Boracchi, and Elia Biganzoli. Piecewise exponential artificial neural networks (PEANN) for modeling hazard function with right censored data. Computational Intelligence Methods for Bioinformatics and Biostatistics, pages 125–136, 2014. [paper]
