🧪Yet Another ICU Benchmark: a holistic framework for the standardization of clinical prediction model experiments. Provide custom datasets, cohorts, prediction tasks, endpoints, preprocessing, and models. Paper: https://arxiv.org/abs/2306.05109

🧪 Yet Another ICU Benchmark

Yet Another ICU Benchmark (YAIB) provides a framework for running clinical machine learning experiments on Intensive Care Unit (ICU) and other Electronic Health Record (EHR) data.

We support the following datasets out of the box:

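| Dataset                 | MIMIC-III / IV | eICU-CRD  | HiRID         | AUMCdb         |
|-------------------------|----------------|-----------|---------------|----------------|
| Admissions              | 40k / 73k      | 200k      | 33k           | 23k            |
| Version                 | v1.4 / v2.2    | v2.0      | v1.1.1        | v1.0.2         |
| Frequency (time-series) | 1 hour         | 5 minutes | 2 / 5 minutes | up to 1 minute |
| Originally published    | 2015 / 2020    | 2017      | 2020          | 2019           |
| Origin                  | USA            | USA       | Switzerland   | Netherlands    |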

New datasets can also be added. We are currently working on a package to make this process as smooth as possible. The benchmark operates on preprocessed Parquet files.

We provide five common tasks for clinical prediction by default:

| No | Task                      | Frequency                 | Type                  |
|----|---------------------------|---------------------------|-----------------------|
| 1  | ICU Mortality             | Once per Stay (after 24H) | Binary Classification |
| 2  | Acute Kidney Injury (AKI) | Hourly (within 6H)        | Binary Classification |
| 3  | Sepsis                    | Hourly (within 6H)        | Binary Classification |
| 4  | Kidney Function (KF)      | Once per stay             | Regression            |
| 5  | Length of Stay (LoS)      | Hourly (within 7D)        | Regression            |

New tasks can be easily added. To get started right away, we include the eICU and MIMIC-III demo datasets in our repository.

The following repositories may be relevant as well:

📄 Paper

To reproduce the benchmarks in our paper, see the ML reproducibility document. If you use this code in your research, please cite the following publication:

@inproceedings{vandewaterYetAnotherICUBenchmark2024,
  title = {Yet Another ICU Benchmark: A Flexible Multi-Center Framework for Clinical ML},
  shorttitle = {Yet Another ICU Benchmark},
  booktitle = {The Twelfth International Conference on Learning Representations},
  author = {van de Water, Robin and Schmidt, Hendrik Nils Aurel and Elbers, Paul and Thoral, Patrick and Arnrich, Bert and Rockenschaub, Patrick},
  year = {2024},
  month = oct,
  urldate = {2024-02-19},
  langid = {english},
}

This paper can also be found on arXiv: 2306.05109.

💿 Installation

YAIB is installed from source using uv and the repository's pyproject.toml.

Installation from source

First, clone this repository:

git clone https://github.com/rvandewater/YAIB.git
cd YAIB

Please note the branch. The newest features and fixes are available on the development branch:

git checkout development

Install uv if needed:

curl -LsSf https://astral.sh/uv/install.sh | sh

Install Python 3.12, then create the virtual environment and install all project dependencies:

uv python install 3.12
uv sync

To install development dependencies as well:

uv sync --dev

Platform notes

  • On macOS, YAIB should use the default PyPI torch wheel.
  • On Linux and Windows, the project is configured to resolve the CUDA-enabled torch wheel from the PyTorch index.
  • On Macs with Metal Performance Shaders, run YAIB with PYTORCH_ENABLE_MPS_FALLBACK=1 if needed.
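When YAIB is launched from a Python script rather than a shell, the MPS note above can be applied by passing a modified environment to the child process. The following is a minimal sketch; yaib_env is a hypothetical helper and not part of YAIB.

```python
import os
import platform


def yaib_env(base_env=None, system=None):
    """Return a copy of the environment with the MPS fallback enabled on macOS.

    Hypothetical helper for illustration; not part of YAIB.
    """
    env = dict(os.environ if base_env is None else base_env)
    system = platform.system() if system is None else system
    if system == "Darwin":  # platform.system() reports "Darwin" on macOS
        # Lets PyTorch fall back to the CPU for operators that MPS does not implement.
        # setdefault keeps any value the user has already chosen.
        env.setdefault("PYTORCH_ENABLE_MPS_FALLBACK", "1")
    return env
```

It could then be used as subprocess.run(["uv", "run", "icu-benchmarks", "train", "-h"], env=yaib_env()). On Linux and Windows the environment is returned unchanged.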

After installation, check that the CLI is available:

uv run icu-benchmarks train -h

YAIB will automatically list available processors at initialization in its log files.

👩‍💻 Usage

Please refer to our wiki for detailed information on how to use YAIB.

Quickstart 🚀 (demo data)

The authors of MIMIC-III and eICU have each made a small demo dataset available to demonstrate their use. Both can be found on PhysioNet: MIMIC-III Clinical Database Demo and eICU Collaborative Research Database Demo. These datasets are published under the Open Data Commons Open Database License v1.0 and can be used without a credentialing procedure. We have created demo cohorts processed solely from these datasets for each of our currently supported task endpoints. To the best of our knowledge, this complies with the license and the respective dataset authors' instructions. Use of the task cohorts and the datasets is permitted only under the above license.

We strongly recommend completing human subject research training to ensure you properly handle human subject research data.

In the folder demo_data we provide processed publicly available demo datasets from eICU and MIMIC with the necessary labels for Mortality at 24h, Sepsis, Acute Kidney Injury, Kidney Function, and Length of Stay.

If you do not yet have access to the ICU datasets, you can run the following command to train models for the included demo cohorts:

uv run icu-benchmarks train \
    -d demo_data/mortality24/mimic_demo \
    -n mimic_demo \
    -t BinaryClassification \
    -tn Mortality24 \
    -m XGBClassifier \
    --seed 2222 \
    -l ../yaib_logs/ \
    --tune

If you want to reproduce the LightGBM example with cache generation:

uv run icu-benchmarks train \
    -d demo_data/mortality24/mimic_demo \
    -n mimic_demo \
    -t BinaryClassification \
    -tn Mortality24 \
    -m LGBMClassifier \
    -hp LGBMClassifier.min_child_samples=10 \
    --generate_cache \
    --load_cache \
    --seed 2222 \
    -l ../yaib_logs/ \
    --tune

For a list of available flags, run uv run icu-benchmarks train -h.
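To repeat the demo run over several seeds, the command can be assembled in Python and handed to subprocess. This is only a sketch: it uses just the flags shown above, train_command and run_seeds are hypothetical helper names, and the seed values other than 2222 are arbitrary. By default it prints the commands without executing them.

```python
import shlex
import subprocess


def train_command(seed, model="XGBClassifier", log_dir="../yaib_logs/"):
    """Build the demo training command for one seed as an argument list."""
    return [
        "uv", "run", "icu-benchmarks", "train",
        "-d", "demo_data/mortality24/mimic_demo",
        "-n", "mimic_demo",
        "-t", "BinaryClassification",
        "-tn", "Mortality24",
        "-m", model,
        "--seed", str(seed),
        "-l", log_dir,
        "--tune",
    ]


def run_seeds(seeds, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for seed in seeds:
        cmd = train_command(seed)
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # raises on the first failing run


if __name__ == "__main__":
    run_seeds([1111, 2222, 3333])  # seeds other than 2222 are arbitrary examples
```

Passing the command as a list avoids shell quoting issues and works the same way on Linux, macOS, and Windows.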

On Macs with Metal Performance Shaders, run with PYTORCH_ENABLE_MPS_FALLBACK=1 if needed.

On Windows, replace the line-continuation character (\) with ^ (Command Prompt) or ` (PowerShell).
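The substitution for Windows shells can also be done mechanically. The following is a minimal sketch of such a conversion; convert_continuations is a hypothetical helper and not part of YAIB.

```python
def convert_continuations(command, shell="cmd"):
    """Rewrite trailing backslash continuations for Windows shells.

    shell="cmd" uses ^ (Command Prompt); shell="powershell" uses a backtick.
    """
    marker = {"cmd": "^", "powershell": "`"}[shell]
    converted = []
    for line in command.splitlines():
        stripped = line.rstrip()
        if stripped.endswith("\\"):
            # Replace only the final backslash; the rest of the line is kept as-is.
            stripped = stripped[:-1] + marker
        converted.append(stripped)
    return "\n".join(converted)
```

Note that this treats every line ending in a backslash as a continuation, so a Windows-style path that ends a line with a backslash would be altered as well. The demo commands use forward slashes and are not affected.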

Alternatively, the easiest method to train all the models in the paper is to run these commands from the repository root:

uv run wandb sweep --verbose experiments/benchmark_classification.yml
uv run wandb sweep --verbose experiments/benchmark_regression.yml

This will create two hyperparameter sweeps for Weights & Biases for the classification and regression tasks. You can then run the following command to train the models:

uv run wandb agent <sweep_id>

Tip: You can choose to run each of the configurations on a SLURM cluster instance with uv run wandb agent --count 1 <sweep_id>.

Note: You will need to have a Weights & Biases account and be logged in to run the above commands.

Getting the datasets

HiRID, eICU, and MIMIC-IV can be accessed through PhysioNet. A guide to this process can be found here. AUMCdb can be accessed through a separate procedure. We do not have involvement in the access procedure and cannot answer requests for data access.

Cohort creation

Since the datasets were created independently of each other, they do not share the same data structure or data identifiers. To make them interoperable, use the preprocessing utilities provided by the ricu package. ricu pre-defines a large number of clinical concepts and how to load them from a given dataset, providing the common interface to the data that this benchmark uses. Please refer to our cohort definition code for generating the cohorts using our Python interface for ricu. Once you have gained access to the datasets and generated the cohorts, you can run the benchmark.

👟 Running YAIB

Preprocessing and Training

The following command will run training and evaluation on the MIMIC demo dataset for binary mortality prediction at 24h with LGBMClassifier:

uv run icu-benchmarks train \
    -d demo_data/mortality24/mimic_demo \
    -n mimic_demo \
    -t BinaryClassification \
    -tn Mortality24 \
    -m LGBMClassifier \
    -hp LGBMClassifier.min_child_samples=10 \
    --generate_cache \
    --load_cache \
    --seed 2222 \
    -l ../yaib_logs/ \
    --tune

For a list of available flags, run uv run icu-benchmarks train -h.

Evaluate or Finetune

It is possible to evaluate a model trained on another dataset without additional training. In this case, the source dataset is the demo data from MIMIC and the target is the eICU demo:

uv run icu-benchmarks \
    --eval \
    -d demo_data/mortality24/eicu_demo \
    -n eicu_demo \
    -t BinaryClassification \
    -tn Mortality24 \
    -m LGBMClassifier \
    --generate_cache \
    --load_cache \
    -s 2222 \
    -l ../yaib_logs \
    -sn mimic \
    --source-dir ../yaib_logs/mimic_demo/Mortality24/LGBMClassifier/2022-12-12T15-24-46/repetition_0/fold_0

A similar syntax is used for finetuning, where a model is loaded and then retrained. To run finetuning, replace --eval with -ft.
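The --source-dir value in the evaluation command appears to follow the pattern log directory / dataset name / task name / model / run timestamp / repetition / fold. The sketch below composes such a path and looks up the most recent run. The layout is an assumption inferred from that single example path, and source_dir and latest_run are hypothetical helpers, not part of YAIB.

```python
from pathlib import Path


def source_dir(log_dir, name, task, model, run, repetition=0, fold=0):
    """Compose a run directory following the layout of the example path (assumed)."""
    return (
        Path(log_dir) / name / task / model / run
        / f"repetition_{repetition}" / f"fold_{fold}"
    )


def latest_run(log_dir, name, task, model):
    """Return the most recent run folder name, or None if there is none.

    Timestamped names such as 2022-12-12T15-24-46 sort chronologically as strings.
    """
    model_dir = Path(log_dir) / name / task / model
    if not model_dir.is_dir():
        return None
    runs = sorted(p.name for p in model_dir.iterdir() if p.is_dir())
    return runs[-1] if runs else None
```

For example, source_dir("../yaib_logs", "mimic_demo", "Mortality24", "LGBMClassifier", "2022-12-12T15-24-46") yields the path used in the command above. Check the actual contents of your log directory before relying on this layout.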

Models

We provide several existing machine learning models that are commonly used for multivariate time-series data. pytorch is used for the deep learning models, lightgbm for the boosted tree approaches, and sklearn for other classical machine learning models. The benchmark provides (among others) the following built-in models:

🛠️ Development

To adapt YAIB to your own use case, you can use the development information page as a reference. We appreciate contributions to the project. Please read the contribution guidelines before submitting a pull request.

Acknowledgements

This project has been developed partially under the funding of “Gemeinsamer Bundesausschuss (G-BA) Innovationsausschuss” in the framework of “CASSANDRA - Clinical ASSist AND aleRt Algorithms” (project number 01VSF20015). We would like to acknowledge the work of Alisher Turubayev, Anna Shopova, Fabian Lange, Mahmut Kamalak, Paul Mattes, and Victoria Ayvasky for adding PyTorch Lightning, Weights and Biases compatibility, and several optional imputation methods to a later version of the benchmark repository.

We do not own any of the datasets used in this benchmark. This project uses heavily adapted components of the HiRID benchmark. We thank the authors for providing this codebase and encourage further development to benefit the scientific community. The demo datasets have been released under an Open Data Commons Open Database License (ODbL).

License

This source code is released under the MIT license, included here. We do not own any of the datasets used or included in this repository.
