CoxKAN: Kolmogorov-Arnold Networks for Survival Analysis

CoxKAN uses Kolmogorov-Arnold Networks for interpretable, high-performance survival analysis.

  • Installation: pip install coxkan
  • Documentation: Read-the-Docs.
  • Quick-start: tutorials/intro.ipynb

Repo Structure:

├── checkpoints/        # Results / checkpoints from paper
├── configs/            # Model configuration files
├── coxkan/             # CoxKAN package 
├── data/               # Data 
├── docs/               # Documentation
├── media/              # Figures used in paper
├── reprod/             # Reproducibility instructions/code
├── tutorials/          # Tutorials for CoxKAN
|
# standard stuff:
├── .gitignore         
├── LICENSE          
├── README.md          
└── setup.py            

Installation

Pip

CoxKAN can be installed via:

pip install coxkan

Git

Alternatively, you may want the full codebase and environment used to produce all results in the associated paper:

git clone https://github.com/knottwill/CoxKAN.git 
cd CoxKAN
pip install -r reprod/requirements.txt

Please refer to reproducibility instructions in reprod/README.md.

Usage

Tutorials are available in tutorials/ and on Read-the-Docs.

Example

from coxkan import CoxKAN
from coxkan.datasets import metabric 

df_train, df_test = metabric.load(split=True)

ckan = CoxKAN(width=[len(metabric.covariates), 1])

_ = ckan.train(
    df_train, 
    df_test, 
    duration_col='duration', 
    event_col='event',
    steps=100)

# evaluate model: concordance index (C-index) on the test set
ckan.cindex(df_test)
# >>> 0.6441975461899737
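For readers unfamiliar with the metric reported above, the following is a minimal sketch of Harrell's concordance index in plain Python. It is an illustration of the general definition, not the code behind ckan.cindex; the function name concordance_index, the toy values, and the tie handling (tied times skipped, tied risk scores counted as half) are assumptions made for this example.

```python
from itertools import combinations

def concordance_index(durations, events, risk_scores):
    """Harrell's C-index: fraction of comparable pairs ranked correctly.

    A pair is comparable when the subject with the shorter duration had an
    observed event. It is concordant when that subject also has the higher
    predicted risk; tied risk scores count as half.
    """
    concordant, comparable = 0.0, 0
    for i, j in combinations(range(len(durations)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if durations[i] < durations[j] else (j, i)
        if durations[a] == durations[b] or not events[a]:
            continue  # tied times or censored earlier subject (simplification)
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5
    return concordant / comparable if comparable else float("nan")

# Toy values for illustration only (not taken from METABRIC).
durations = [2.0, 4.0, 6.0, 8.0]
events = [1, 0, 1, 1]          # 1 = event observed, 0 = censored
risk = [0.9, 0.1, 0.2, 0.5]    # higher score = higher predicted risk
c = concordance_index(durations, events, risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the 0.644 above means roughly 64% of comparable patient pairs were ordered correctly.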

CoxKAN Package

The coxkan/ package has 4 main components:

coxkan/
    ├── datasets/             # datasets subpackage
    ├── CoxKAN.py             # CoxKAN model
    ├── utils.py              # utility functions
    └── hyperparam_search.py  # hyperparameter searching

Datasets

Synthetic Datasets

coxkan.datasets.create_dataset generates synthetic survival data under a proportional-hazards model with a time-independent hazard, $$h = h_0 e^{\theta(\mathbf{x})} \rightarrow T_s \sim \text{Exp}(h),$$ and a uniform censoring distribution $T_c \sim \text{Uniform}(0, T_{max})$.

In the example below, we use a log-partial hazard of $\theta(\mathbf{x}) = 5 e^{-2(x_1^2 + x_2^2)}$ and a baseline hazard of $h_0 = 0.01$.

import numpy as np
from coxkan.datasets import create_dataset

# log-partial hazard theta(x) = 5 * exp(-2 * (x1^2 + x2^2))
log_partial_hazard = lambda x1, x2: 5*np.exp(-2*(x1**2 + x2**2))
df = create_dataset(log_partial_hazard, baseline_hazard=0.01, n_samples=10000)
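To make the generative model concrete, here is a rough standard-library sketch of the sampling scheme described above: an exponential event time with rate $h$, a uniform censoring time, and the observed duration taken as the smaller of the two. It is not the implementation of create_dataset. The function name simulate_survival, the covariate range Uniform(-1, 1), the value of t_max, and the output column names are all assumptions for illustration.

```python
import math
import random

def simulate_survival(log_partial_hazard, baseline_hazard, n_samples, t_max, seed=0):
    """Sample (x1, x2, duration, event) rows from an exponential survival model."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_samples):
        # Assumed covariate distribution; the source does not specify one.
        x1, x2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        hazard = baseline_hazard * math.exp(log_partial_hazard(x1, x2))
        t_event = rng.expovariate(hazard)   # T_s ~ Exp(h), h is the rate
        t_censor = rng.uniform(0, t_max)    # T_c ~ Uniform(0, T_max)
        rows.append({
            "x1": x1,
            "x2": x2,
            "duration": min(t_event, t_censor),   # observed time
            "event": int(t_event <= t_censor),    # 1 if the event was observed
        })
    return rows

theta = lambda x1, x2: 5 * math.exp(-2 * (x1**2 + x2**2))
rows = simulate_survival(theta, baseline_hazard=0.01, n_samples=1000, t_max=100.0)
```

Because the hazard does not depend on time, each subject's event time is exponential, and the censoring indicator simply records which of the two sampled times came first.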

Clinical Datasets

Five clinical datasets are available in the coxkan.datasets subpackage (inspired by pycox). For example:

from coxkan.datasets import gbsg
df_train, df_test = gbsg.load(split=True)

You can decide where to store them using the COXKAN_DATA_DIR environment variable.
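As a small sketch, the variable can be set from Python before the datasets are loaded. The directory below is a hypothetical path, and setting the variable before importing coxkan.datasets is a precaution based on the assumption that it may be read at import time; the coxkan calls are left as comments so the snippet runs without the package installed.

```python
import os

# Hypothetical storage directory; substitute any writable path.
data_dir = os.path.join(os.path.expanduser("~"), "coxkan_data")
os.environ["COXKAN_DATA_DIR"] = data_dir

# With coxkan installed, loading then follows the usual pattern:
#   from coxkan.datasets import gbsg
#   df_train, df_test = gbsg.load(split=True)
```

Exporting the variable in the shell before starting Python has the same effect and avoids any dependence on import order.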

| Dataset | Description | Source |
|---|---|---|
| GBSG | The Rotterdam & German Breast Cancer Study Group. | DeepSurv |
| METABRIC | The Molecular Taxonomy of Breast Cancer International Consortium. | DeepSurv |
| SUPPORT | Study to Understand Prognoses and Preferences for Outcomes and Risks of Treatments. | DeepSurv |
| NWTCO | National Wilms' Tumor Study. | Rdatasets |
| FLCHAIN | Assay of Serum Free Light Chain. | Rdatasets |

Unfortunately, DeepSurv did not retain the column names. We restored them manually by obtaining the datasets from other sources and comparing the columns, which lets us keep the same train/test split.
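One way such a comparison could be automated is sketched below. This is not the authors' procedure, which was done manually; the helper match_columns, the column names, and the toy values are hypothetical. The idea is to match each unnamed column to the named column holding exactly the same multiset of values, and to leave ambiguous cases for manual review.

```python
def match_columns(unnamed, named):
    """Map each unnamed column index to the named column with the same values.

    `unnamed` is a list of columns (each a list of numbers);
    `named` is a dict of column name -> list of numbers.
    Columns are compared as sorted value tuples, so row order does not matter.
    """
    signatures = {}
    for name, values in named.items():
        signatures.setdefault(tuple(sorted(values)), []).append(name)
    mapping = {}
    for idx, values in enumerate(unnamed):
        candidates = signatures.get(tuple(sorted(values)), [])
        # Accept only unambiguous matches; leave the rest as None.
        mapping[idx] = candidates[0] if len(candidates) == 1 else None
    return mapping

# Toy data for illustration only.
unnamed = [[63, 41, 58], [1, 0, 1], [2.5, 1.0, 3.2]]
named = {"age": [41, 58, 63], "meno": [1, 1, 0], "size": [3.2, 2.5, 1.0]}
mapping = match_columns(unnamed, named)
```

Exact matching like this only works when both copies store identical values; if one copy has been standardized or rounded, a tolerance-based or rank-based comparison would be needed instead.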

Genomics Datasets

We curated four genomics datasets from The Cancer Genome Atlas Program (TCGA). The raw and pre-processed data are available on request; please email me at knottenbeltwill@gmail.com.

Two of the datasets (GBMLGG, KIRC) are the unaltered datasets used in Pathomic Fusion.

| Dataset | Description | Source |
|---|---|---|
| STAD | Stomach Adenocarcinoma. | TCGA |
| BRCA | Breast Invasive Carcinoma. | TCGA |
| GBM/LGG | Merged dataset from two types of brain cancer: Glioblastoma Multiforme and Lower Grade Glioma. | Chen et al. |
| KIRC | Kidney Renal Clear Cell Carcinoma. | Chen et al. |

Reproducibility

All results in the associated paper can be reproduced using the code in reprod/. Please refer to the instructions in reprod/README.md.

Credits

Special thanks to:

  • All authors of Kolmogorov-Arnold Networks and the incredible pykan package.
  • Håvard Kvamme for pycox and torchtuples.
