CoxKAN: Kolmogorov-Arnold Networks for Survival Analysis
Installation • Usage • Datasets • Reproducibility • Credits
CoxKAN leverages Kolmogorov-Arnold Networks for Interpretable, High Performance Survival Analysis.
- Installation: pip install coxkan
- Documentation: Read-the-Docs.
- Quick-start: tutorials/intro.ipynb
Repo Structure:
├── checkpoints/ # Results / checkpoints from paper
├── configs/ # Model configuration files
├── coxkan/ # CoxKAN package
├── data/ # Data
├── docs/ # Documentation
├── media/ # Figures used in paper
├── reprod/ # Reproducibility instructions/code
├── tutorials/ # Tutorials for CoxKAN
│
# standard stuff:
├── .gitignore
├── LICENSE
├── README.md
└── setup.py
Installation
Pip
CoxKAN can be installed via:
pip install coxkan
Git
Alternatively, you may want the full codebase and environment used to produce all results in the associated paper:
git clone https://github.com/knottwill/CoxKAN.git
cd CoxKAN
pip install -r reprod/requirements.txt
Please refer to the reproducibility instructions in reprod/README.md.
Usage
Find tutorials in tutorials/ or on Read-the-Docs.
Example
from coxkan import CoxKAN
from coxkan.datasets import metabric
df_train, df_test = metabric.load(split=True)
ckan = CoxKAN(width=[len(metabric.covariates), 1])
_ = ckan.train(
    df_train,
    df_test,
    duration_col='duration',
    event_col='event',
    steps=100)
# evaluate model
ckan.cindex(df_test)
# >>> 0.6441975461899737
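For readers unfamiliar with the metric, the concordance index (C-index) reported above measures how often the model ranks a pair of subjects in the right order. The block below is a minimal pure-Python sketch of the textbook Harrell's C-index, not the code behind ckan.cindex; the function name, the tie handling, and the toy data are assumptions made for illustration.

```python
def concordance_index(durations, events, risk_scores):
    """Harrell's C-index: fraction of comparable pairs ranked correctly.

    A pair (i, j) is comparable when subject i had an observed event and a
    strictly shorter duration than subject j. The pair is concordant when
    subject i also has the higher risk score; tied scores count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(durations)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if durations[i] < durations[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): higher risk goes with shorter survival.
durations = [5, 8, 3, 10]
events = [1, 0, 1, 1]
risk = [0.9, 0.2, 1.5, 0.1]
c = concordance_index(durations, events, risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the 0.64 above indicates a model that orders most comparable pairs correctly.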
CoxKAN Package
The coxkan/ package has 4 main components:
coxkan/
├── datasets/ # datasets subpackage
├── CoxKAN.py # CoxKAN model
├── utils.py # utility functions
└── hyperparam_search.py # hyperparameter searching
Datasets
Synthetic Datasets
coxkan.datasets.create_dataset makes it easy to generate synthetic survival data assuming a proportional-hazards, time-independent hazard function: $$h = h_0 e^{\theta(\mathbf{x})} \rightarrow T_s \sim \text{Exp}(h)$$ and a uniform censoring distribution $T_c \sim \text{Uniform}(0, T_{max})$.
In the example below, we use a log-partial hazard of $\theta(\mathbf{x}) = 5 e^{-2(x_1^2 + x_2^2)}$ and a baseline hazard of $h_0 = 0.01$.
import numpy as np
from coxkan.datasets import create_dataset
log_partial_hazard = lambda x1, x2: 5*np.exp(-2*(x1**2 + x2**2))
df = create_dataset(log_partial_hazard, baseline_hazard=0.01, n_samples=10000)
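To make the generative process concrete, here is a minimal NumPy sketch of the sampling scheme described above: exponential survival times with a constant per-subject hazard, plus uniform censoring. It is not the implementation of create_dataset. The covariate range of [-1, 1], the value of t_max, and all variable names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

n_samples = 10_000
baseline_hazard = 0.01
t_max = 100.0  # hypothetical censoring horizon, not a value from the README

# Assumption: covariates drawn uniformly from [-1, 1]; the package may differ.
x1 = rng.uniform(-1.0, 1.0, n_samples)
x2 = rng.uniform(-1.0, 1.0, n_samples)

# Log-partial hazard from the example above.
theta = 5 * np.exp(-2 * (x1**2 + x2**2))

# Constant hazard per subject: h = h_0 * exp(theta(x)).
hazard = baseline_hazard * np.exp(theta)

# Survival time T_s ~ Exp(h); NumPy parameterises by the scale 1/h.
t_survival = rng.exponential(scale=1.0 / hazard)

# Censoring time T_c ~ Uniform(0, T_max).
t_censor = rng.uniform(0.0, t_max, n_samples)

# Observe whichever comes first; event = 1 when the survival time is observed.
duration = np.minimum(t_survival, t_censor)
event = (t_survival <= t_censor).astype(int)
```

Subjects near the origin have the largest hazard (up to $h_0 e^5$), so they tend to experience the event early, while subjects far from the origin are more often censored.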
Clinical Datasets
Five clinical datasets are available in the coxkan.datasets subpackage (inspired by pycox). For example:
from coxkan.datasets import gbsg
df_train, df_test = gbsg.load(split=True)
You can decide where to store them using the COXKAN_DATA_DIR environment variable.
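One way to set the variable from Python is sketched below. The directory is a hypothetical choice, and the sketch assumes the variable is read when coxkan.datasets is imported or when a dataset is loaded, so it is set before either happens.

```python
import os
import tempfile
from pathlib import Path

# Hypothetical location; any writable directory would do.
data_dir = Path(tempfile.gettempdir()) / "coxkan_data"
data_dir.mkdir(parents=True, exist_ok=True)

# Assumption: set this before importing coxkan.datasets or loading a dataset.
os.environ["COXKAN_DATA_DIR"] = str(data_dir)
```

Setting the variable in the shell profile instead keeps the location consistent across sessions.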
Dataset | Description | Source |
---|---|---|
GBSG | The Rotterdam & German Breast Cancer Study Group. | DeepSurv |
METABRIC | The Molecular Taxonomy of Breast Cancer International Consortium. | DeepSurv |
SUPPORT | Study to Understand Prognoses and Preferences for Outcomes and Risks of Treatments. | DeepSurv |
NWTCO | National Wilms' Tumor Study. | Rdatasets |
FLCHAIN | Assay of Serum Free Light Chain. | Rdatasets |
Unfortunately, DeepSurv did not retain the column names. We restored them manually by obtaining the datasets from other sources and comparing the columns, which lets us keep the same train/test split as DeepSurv:
- GBSG: https://www.kaggle.com/datasets/utkarshx27/breast-cancer-dataset-used-royston-and-altman
- SUPPORT: https://hbiostat.org/data/repo/support2csv.zip
- METABRIC: https://www.kaggle.com/datasets/raghadalharbi/breast-cancer-gene-expression-profiles-metabric
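The idea of restoring names by comparing columns can be illustrated with a small sketch. This is not the authors' procedure, which the README does not detail: the function match_columns, the column names, and the toy values are all hypothetical, and the sketch assumes both sources contain the same rows with identical values, possibly in a different order.

```python
import numpy as np


def match_columns(unnamed, named):
    """Map each column index of `unnamed` to the name of a matching column in `named`.

    Two columns match when their sorted values agree, so row order does not matter.
    Columns without a match are left out of the result.
    """
    mapping = {}
    for idx in range(unnamed.shape[1]):
        target = np.sort(unnamed[:, idx])
        for name, values in named.items():
            if len(values) == len(target) and np.allclose(np.sort(values), target):
                mapping[idx] = name
                break
    return mapping


# Toy data: the named source has labels and a different row order.
named = {
    "age": np.array([63.0, 41.0, 55.0]),
    "tumor_size": np.array([22.0, 35.0, 18.0]),
}
unnamed = np.array([[18.0, 55.0], [22.0, 63.0], [35.0, 41.0]])

mapping = match_columns(unnamed, named)
```

With real data the two sources may differ in row count, scaling, or encoding, so a looser comparison (for example on summary statistics) followed by a manual check is likely to be needed.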
Genomics Datasets
We curated 4 genomics datasets from The Cancer Genome Atlas Program (TCGA). The raw or pre-processed data is available by request - please email me at knottenbeltwill@gmail.com.
Two of the datasets (GBM/LGG and KIRC) are the unaltered datasets used in Pathomic Fusion.
Dataset | Description | Source |
---|---|---|
STAD | Stomach Adenocarcinoma. | TCGA |
BRCA | Breast Invasive Carcinoma. | TCGA |
GBM/LGG | Merged dataset from two types of brain cancer: Glioblastoma Multiforme and Lower Grade Glioma. | Chen et al. |
KIRC | Kidney Renal Clear Cell Carcinoma. | Chen et al. |
Reproducibility
All results in the associated paper can be reproduced using the code in reprod/. Please refer to the instructions in reprod/README.md.
Credits
Special thanks to:
- All authors of Kolmogorov-Arnold Networks and the incredible pykan package.
- Håvard Kvamme for pycox and torchtuples.