
An ETL pipeline to convert HiRID data into the MEDS format.


HIRID MEDS ETL


Warning: This ETL currently needs a lot of resources to run.

This repository contains the ETL (Extract, Transform, Load) code to convert the HiRID dataset into the MEDS format.

HiRID is a freely accessible critical care dataset containing data relating to more than 33 thousand patient admissions to the Department of Intensive Care Medicine of the Bern University Hospital, Switzerland (ICU), an interdisciplinary 60-bed unit admitting >6,500 patients per year. The ICU offers the full range of modern interdisciplinary intensive care medicine for adult patients. The dataset was developed in cooperation between the Swiss Federal Institute of Technology (ETH) Zürich, Switzerland and the ICU.

The dataset contains de-identified demographic information and a total of 712 routinely collected physiological variables, diagnostic test results and treatment parameters from more than 33 thousand admissions during the period from January 2008 to June 2016. Data is stored with a uniquely high time resolution of one entry every two minutes.

source: https://hirid.intensivecare.ai/

pip install HIRID_MEDS # you can do this locally or via PyPI
# Download your data or set download credentials
MEDS_extract-HIRID root_output_dir=$ROOT_OUTPUT_DIR do_download=true raw_input_dir=$RAW_INPUT_DIR
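
If you prefer to script the extraction, the call above can be assembled from Python. This is a minimal sketch, not part of the package: the helper name build_extract_command and the fallback directories are hypothetical, and it assumes the CLI accepts exactly the key=value overrides shown above (do_download=false is assumed to be the counterpart of do_download=true).

```python
import os
import shlex


def build_extract_command(root_output_dir, raw_input_dir, do_download=True):
    """Return the MEDS_extract-HIRID call as an argument list (hypothetical helper)."""
    return [
        "MEDS_extract-HIRID",
        f"root_output_dir={root_output_dir}",
        # Assumption: the flag takes lowercase true/false, as in the command above.
        f"do_download={'true' if do_download else 'false'}",
        f"raw_input_dir={raw_input_dir}",
    ]


# Fall back to placeholder paths when the environment variables are not set.
cmd = build_extract_command(
    root_output_dir=os.environ.get("ROOT_OUTPUT_DIR", "/tmp/hirid_meds"),
    raw_input_dir=os.environ.get("RAW_INPUT_DIR", "/tmp/hirid_raw"),
)
print(shlex.join(cmd))
# To actually run the ETL: subprocess.run(cmd, check=True)
```

The sketch only prints the command, so you can inspect it before starting a run that needs a lot of resources.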

MEDS-transforms settings

If you want to convert a large dataset, you can parallelize the MEDS-transforms stage, which is the step that takes the longest.

For local parallelization, install the hydra-joblib-launcher package:

pip install hydra-joblib-launcher --upgrade

Then set the number of workers as an environment variable:

export N_WORKERS=8

You can also set the number of subjects per shard to balance parallelization overhead against the number of subjects in your dataset:

export N_SUBJECTS_PER_SHARD=100000
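
To see how the two settings interact, the following sketch estimates how many shards a run produces and how many workers can be busy at once. It rests on assumptions not stated in this README: that a shard is the unit of work handed to a worker, and that subjects are split evenly across shards. The helper shard_plan is hypothetical, and 33,000 is only the approximate admission count from the dataset description, treating each admission as one subject for illustration.

```python
import math


def shard_plan(n_subjects, n_subjects_per_shard, n_workers):
    """Return (number of shards, workers that can be busy at once)."""
    n_shards = math.ceil(n_subjects / n_subjects_per_shard)
    # Assumption: each worker processes one shard at a time.
    busy_workers = min(n_shards, n_workers)
    return n_shards, busy_workers


# With the example values above, everything lands in a single shard.
print(shard_plan(33_000, 100_000, 8))  # (1, 1)

# A smaller shard size gives every worker something to do.
print(shard_plan(33_000, 4_000, 8))  # (9, 8)
```

Under these assumptions, a shard size larger than the dataset leaves all but one worker idle, while very small shards add per-shard overhead.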

Citation

If you use this dataset, please cite the original publications below and the ETL (see "Cite this repository"):

Faltys, M., Zimmermann, M., Lyu, X., Hüser, M., Hyland, S., Rätsch, G., & Merz, T. (2021). HiRID, a high time-resolution ICU dataset (version 1.1.1). PhysioNet. https://doi.org/10.13026/nkwc-js72.

Hyland, S.L., Faltys, M., Hüser, M. et al. Early prediction of circulatory failure in the intensive care unit using machine learning. Nat Med 26, 364–373 (2020). https://doi.org/10.1038/s41591-020-0789-4
