An ETL pipeline to extract HIRID data into the MEDS format.
HIRID MEDS ETL
Warning: This ETL currently needs a lot of resources to run.
This repository contains the ETL (Extract, Transform, Load) code to convert the HIRID dataset into the MEDS format.
pip install HIRID_MEDS # you can do this locally or via PyPI
# Download your data or set download credentials
MEDS_extract-HIRID root_output_dir=$ROOT_OUTPUT_DIR do_download=true raw_input_dir=$RAW_INPUT_DIR
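The command above expects `RAW_INPUT_DIR` and `ROOT_OUTPUT_DIR` to be set. A minimal sketch of preparing them is shown here; the directory paths are hypothetical placeholders, not locations the package requires, and the guard around the command is only there so the script fails gently when the package is not installed.

```shell
# Hypothetical locations (assumption): adjust to your own storage layout.
export RAW_INPUT_DIR="${RAW_INPUT_DIR:-$PWD/hirid/raw}"
export ROOT_OUTPUT_DIR="${ROOT_OUTPUT_DIR:-$PWD/hirid/meds}"

# Create both directories up front so the ETL has somewhere to read and write.
mkdir -p "$RAW_INPUT_DIR" "$ROOT_OUTPUT_DIR"

# Run the ETL with the same arguments as above, but only if it is installed.
if command -v MEDS_extract-HIRID >/dev/null 2>&1; then
    MEDS_extract-HIRID root_output_dir="$ROOT_OUTPUT_DIR" do_download=true raw_input_dir="$RAW_INPUT_DIR"
else
    echo "MEDS_extract-HIRID not found; install it with: pip install HIRID_MEDS" >&2
fi
```

Keeping the raw and output directories separate makes it easier to rerun the ETL without touching the downloaded source files.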
MEDS-transforms settings
To convert a large dataset, you can parallelize the MEDS-transforms stage, which is the step of the ETL that takes the longest.
For local parallelization, install the hydra-joblib-launcher package:
pip install hydra-joblib-launcher --upgrade
Then set the number of workers as an environment variable:
export N_WORKERS=8
You can also set the number of subjects per shard to balance the parallelization overhead against the number of subjects in your dataset:
export N_SUBJECTS_PER_SHARD=100000
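As a rough illustration of how the two settings interact, the sketch below derives a shard size that yields about one shard per worker. It assumes that shards are the unit of work distributed across workers, and the subject count is a hypothetical placeholder; substitute the number of subjects in your own extract.

```shell
# Hypothetical subject count (assumption): replace with your own.
N_SUBJECTS=34000
N_WORKERS=8

# Ceiling division: subjects per shard so that there is roughly one shard per worker.
N_SUBJECTS_PER_SHARD=$(( (N_SUBJECTS + N_WORKERS - 1) / N_WORKERS ))

# Number of shards that this shard size produces (also ceiling division).
N_SHARDS=$(( (N_SUBJECTS + N_SUBJECTS_PER_SHARD - 1) / N_SUBJECTS_PER_SHARD ))

export N_WORKERS N_SUBJECTS_PER_SHARD
echo "N_WORKERS=$N_WORKERS N_SUBJECTS_PER_SHARD=$N_SUBJECTS_PER_SHARD N_SHARDS=$N_SHARDS"
# prints: N_WORKERS=8 N_SUBJECTS_PER_SHARD=4250 N_SHARDS=8
```

With these numbers, a shard size of 100000 would put every subject into a single shard, so under the assumption above only one worker would have anything to do. Smaller shards spread the work but add per-shard overhead, which is the trade-off the setting controls.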
Citation
If you use this dataset, please cite the original publications below and the ETL (see "Cite this repository" on the GitHub page):
Faltys, M., Zimmermann, M., Lyu, X., Hüser, M., Hyland, S., Rätsch, G., & Merz, T. (2021). HiRID, a high time-resolution ICU dataset (version 1.1.1). PhysioNet. https://doi.org/10.13026/nkwc-js72.
Hyland, S.L., Faltys, M., Hüser, M. et al. Early prediction of circulatory failure in the intensive care unit using machine learning. Nat Med 26, 364–373 (2020). https://doi.org/10.1038/s41591-020-0789-4