Yet Another ICU Benchmark is a holistic framework for automating the development of clinical prediction models on ICU data. Users can create custom datasets, cohorts, prediction tasks, endpoints, and models.

🧪 Yet Another ICU Benchmark

Yet Another ICU Benchmark (YAIB) provides a framework for running clinical machine learning experiments on Intensive Care Unit (ICU) EHR data.

We support the following datasets out of the box:

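| Dataset                 | MIMIC-III / IV | eICU-CRD  | HiRID         | AUMCdb         |
|-------------------------|----------------|-----------|---------------|----------------|
| Admissions              | 40k / 73k      | 200k      | 33k           | 23k            |
| Version                 | v1.4 / v2.2    | v2.0      | v1.1.1        | v1.0.2         |
| Frequency (time-series) | 1 hour         | 5 minutes | 2 / 5 minutes | up to 1 minute |
| Originally published    | 2015 / 2020    | 2017      | 2020          | 2019           |
| Origin                  | USA            | USA       | Switzerland   | Netherlands    |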

New datasets can also be added. We are currently working on a package to make this process as smooth as possible. The benchmark is designed for operating on preprocessed parquet files.

We provide five common tasks for clinical prediction by default:

| No | Task                      | Frequency                 | Type                  |
|----|---------------------------|---------------------------|-----------------------|
| 1  | ICU Mortality             | Once per Stay (after 24H) | Binary Classification |
| 2  | Acute Kidney Injury (AKI) | Hourly (within 6H)        | Binary Classification |
| 3  | Sepsis                    | Hourly (within 6H)        | Binary Classification |
| 4  | Kidney Function (KF)      | Once per Stay             | Regression            |
| 5  | Length of Stay (LoS)      | Hourly (within 7D)        | Regression            |

New tasks can be easily added. For the purposes of getting started right away, we include the eICU and MIMIC-III demo datasets in our repository.

For all YAIB-related repositories, which may also be relevant, please see: https://github.com/stars/rvandewater/lists/yaib.

📄Paper

To reproduce the benchmarks in our paper, please refer to the ML reproducibility document. If you use this code in your research, please cite the following publication:

@article{vandewaterYetAnotherICUBenchmark2023,
	title = {Yet Another ICU Benchmark: A Flexible Multi-Center Framework for Clinical ML},
	shorttitle = {Yet Another ICU Benchmark},
	url = {http://arxiv.org/abs/2306.05109},
	language = {en},
	urldate = {2023-06-09},
	publisher = {arXiv},
	author = {Robin van de Water and Hendrik Schmidt and Paul Elbers and Patrick Thoral and Bert Arnrich and Patrick Rockenschaub},
	month = jun,
	year = {2023},
	note = {arXiv:2306.05109 [cs]},
	keywords = {Computer Science - Machine Learning},
}

This paper can also be found on arXiv: 2306.05109.

💿Installation

YAIB is currently best installed from source; however, we also offer an early PyPI release.

Installation from source

First, we clone this repository using git:

git clone https://github.com/rvandewater/YAIB.git

Please note the branch. The newest features and fixes are available on the development branch:

git checkout development

YAIB can be installed using a conda environment (preferred) or pip. Below are the three CLI commands to install YAIB using conda.

The first command will install an environment based on Python 3.10.

conda env update -f <environment.yml|environment_mps.yml>

Use environment.yml on x86 hardware and environment_mps.yml on Macs with Metal Performance Shaders.

We then activate the environment and install a package called icu-benchmarks, after which YAIB should be operational.

conda activate yaib
pip install -e .

If you want to install the icu-benchmarks package with pip, execute the command below:

pip install torch numpy && pip install -e .

After installation, please check whether your PyTorch version works with CUDA (if available) to ensure the best performance. YAIB will automatically list available processors at initialization in its log files.
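One way to perform this check from the command line is sketched below. It is not part of YAIB: the function name check_torch_device is hypothetical, and the sketch assumes that python3 resolves to the interpreter of the activated environment. It reports the installed PyTorch version and whether the CUDA and MPS backends are usable.

```shell
# Hypothetical helper (not part of YAIB): report which accelerators PyTorch can use.
check_torch_device() {
    python3 - <<'EOF'
try:
    import torch
except ImportError:
    print("torch not installed")
else:
    print("torch", torch.__version__)
    print("cuda available:", torch.cuda.is_available())
    # Older PyTorch builds have no MPS backend, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    print("mps available:", bool(mps and mps.is_available()))
EOF
}

check_torch_device
```

If CUDA is reported as unavailable on a machine with an NVIDIA GPU, the installed PyTorch build probably does not match the local CUDA setup.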

👩‍💻Usage

Please refer to our wiki for detailed information on how to use YAIB.

Quickstart 🚀 (demo data)

In the folder demo_data we provide processed, publicly available demo datasets from eICU and MIMIC with the necessary labels for Mortality at 24h, Sepsis, Acute Kidney Injury, Kidney Function, and Length of Stay.

If you do not yet have access to the ICU datasets, you can run the following commands to train models for the included demo cohorts:

wandb sweep --verbose experiments/demo_benchmark_classification.yml
wandb sweep --verbose experiments/demo_benchmark_regression.yml
wandb agent <sweep_id>

Tip: You can choose to run each of the configurations on a SLURM cluster instance by running wandb agent --count 1 <sweep_id>

Note: You will need to have a wandb account and be logged in to run the above commands.

Getting the datasets

HiRID, eICU, and MIMIC IV can be accessed through PhysioNet. A guide to this process can be found here. AUMCdb can be accessed through a separate access procedure. We are not involved in the access procedure and cannot answer any requests for data access.

Cohort creation

Since the datasets were created independently of each other, they do not share the same data structure or data identifiers. To make them interoperable, use the preprocessing utilities provided by the ricu package. Ricu pre-defines a large number of clinical concepts and how to load them from a given dataset, providing the common interface to the data that is used in this benchmark. Please refer to our cohort definition code for generating the cohorts using our Python interface for ricu. Once you have gained access to the datasets and generated the cohorts, you can run the benchmark.

👟 Running YAIB

Preprocessing and Training

The following command will run training and evaluation on the MIMIC demo dataset for (binary) mortality prediction at 24h with the LGBMClassifier. The minimum number of child samples is reduced due to the small amount of training data. We generate cache files and, if available, load existing ones.

icu-benchmarks train \
    -d demo_data/mortality24/mimic_demo \
    -n mimic_demo \
    -t BinaryClassification \
    -tn Mortality24 \
    -m LGBMClassifier \
    -hp LGBMClassifier.min_child_samples=10 \
    --generate_cache \
    --load_cache \
    --seed 2222 \
    -s 2222 \
    -l ../yaib_logs/ \
    --tune

For a list of available flags, run icu-benchmarks train -h.

Run with PYTORCH_ENABLE_MPS_FALLBACK=1 on Macs with Metal Performance Shaders.

On Windows-based systems, the line continuation character (\) needs to be replaced by (^) in Command Prompt or (`) in PowerShell.

Alternatively, the easiest method to train all the models in the paper is to run these commands from the directory root:

wandb sweep --verbose experiments/benchmark_classification.yml
wandb sweep --verbose experiments/benchmark_regression.yml

This will create two hyperparameter sweeps for WandB for the classification and regression tasks. This configuration will train all the models in the paper. You can then run the following command to train the models:

wandb agent <sweep_id>

Tip: You can choose to run each of the configurations on a SLURM cluster instance by running wandb agent --count 1 <sweep_id>

Note: You will need to have a wandb account and be logged in to run the above commands.
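The tip above can be turned into a SLURM array job, where each array task runs a single sweep configuration. The following is a minimal sketch under stated assumptions, not a script shipped with YAIB: the helper name make_agent_job, the job name, and the resource requests (CPUs, time limit) are placeholders to adapt to your cluster. It also assumes the job inherits the activated yaib environment and wandb login from the submitting shell.

```shell
# Hypothetical helper (not part of YAIB): emit a SLURM array-job script that
# starts one W&B agent per array task.
# Arguments: $1 = sweep id printed by `wandb sweep`, $2 = number of array tasks.
make_agent_job() {
    sweep_id="$1"
    n_tasks="$2"
    cat <<EOF
#!/bin/bash
#SBATCH --job-name=yaib-sweep
#SBATCH --array=1-${n_tasks}
#SBATCH --cpus-per-task=4
#SBATCH --time=04:00:00
# Each array task trains exactly one sweep configuration and then exits.
wandb agent --count 1 ${sweep_id}
EOF
}

# Usage on a SLURM login node (sbatch reads the script from standard input):
# make_agent_job "<sweep_id>" 20 | sbatch
```

Generating the script as text makes it easy to inspect before submission; redirect the output to a file instead of piping it to sbatch if you prefer to keep a copy.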

Evaluate

It is possible to evaluate a model trained on another dataset. In this case, the source dataset is the demo data from MIMIC and the target is the eICU demo:

icu-benchmarks evaluate \
    -d demo_data/mortality24/eicu_demo \
    -n eicu_demo \
    -t BinaryClassification \
    -tn Mortality24 \
    -m LGBMClassifier \
    --generate_cache \
    --load_cache \
    -s 2222 \
    -l ../yaib_logs \
    -sn mimic \
    --source-dir ../yaib_logs/mimic_demo/Mortality24/LGBMClassifier/2022-12-12T15-24-46/fold_0
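The --source-dir argument above points into a timestamped run directory. Assuming the layout suggested by that example path (log directory, then dataset name, task name, model, timestamp, and fold), the most recent run can be looked up instead of typed by hand. The helper latest_run_dir below is hypothetical and not part of YAIB; it relies on the timestamp format sorting lexicographically in chronological order.

```shell
# Hypothetical helper (not part of YAIB): print the newest timestamped
# sub-directory of the given model log directory.
latest_run_dir() {
    latest=""
    # Glob results are sorted, so the last existing directory is the newest run.
    for d in "$1"/*/; do
        [ -d "$d" ] || continue
        latest="${d%/}"
    done
    [ -n "$latest" ] && printf '%s\n' "$latest"
}

# Usage:
# run_dir=$(latest_run_dir ../yaib_logs/mimic_demo/Mortality24/LGBMClassifier)
# then pass: --source-dir "$run_dir/fold_0"
```

The function prints nothing and returns a non-zero status when no run directory exists, so a calling script can detect that case.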

Models

We provide several existing machine learning models that are commonly used for multivariate time-series data. pytorch is used for the deep learning models, lightgbm for the boosted tree approaches, and sklearn for other classical machine learning models. The benchmark provides (among others) the following built-in models:

🛠️ Development

To adapt YAIB to your own use case, you can use the development information page as a reference. We appreciate contributions to the project. Please read the contribution guidelines before submitting a pull request.

Acknowledgements

We do not own any of the datasets used in this benchmark. This project uses heavily adapted components of the HiRID benchmark. We thank the authors for providing this codebase and encourage further development to benefit the scientific community. The demo datasets have been released under an Open Data Commons Open Database License (ODbL).

License

This source code is released under the MIT license, included here. We do not own any of the datasets used or included in this repository.
