An ETL pipeline to extract the eICU dataset into the MEDS format.

eICU MEDS Extraction ETL

This repository contains the code for downloading the eICU dataset from PhysioNet and transforming it into the Medical Event Data Standard (MEDS) format.

pip install eICU-MEDS # use `pip install -e .` for a local installation in editable mode
export DATASET_DOWNLOAD_USERNAME=$PHYSIONET_USERNAME
export DATASET_DOWNLOAD_PASSWORD=$PHYSIONET_PASSWORD
MEDS_extract-eICU root_output_dir=data/eicu_meds do_download=False

MEDS-transforms settings

If you want to convert a large dataset, you can parallelize the MEDS-transforms stage, which is the step that takes the longest.

Local parallelization uses the hydra-joblib-launcher package, which you need to install first:

pip install hydra-joblib-launcher --upgrade

Then, you can set the number of workers as an environment variable:

export N_WORKERS=8

Moreover, you can set the number of subjects per shard to balance the parallelization overhead based on how many subjects you have in your dataset:

export N_SUBJECTS_PER_SHARD=100000
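
To see how the two settings interact, the sketch below estimates how many shards a given shard size produces. It assumes that subjects are split into shards of at most N_SUBJECTS_PER_SHARD and that each worker processes one shard at a time; the helper `n_shards` and the subject count are hypothetical and only for illustration, not part of eICU-MEDS.

```shell
# Hypothetical helper: ceiling division giving the number of shards
# needed for $1 subjects at $2 subjects per shard.
n_shards() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

N_SUBJECTS=200000   # assumed subject count; substitute your own
export N_WORKERS=8

# A large shard size yields fewer shards than workers.
export N_SUBJECTS_PER_SHARD=100000
echo "shards: $(n_shards "$N_SUBJECTS" "$N_SUBJECTS_PER_SHARD"), workers: $N_WORKERS"

# A smaller shard size yields enough shards to keep every worker busy.
export N_SUBJECTS_PER_SHARD=10000
echo "shards: $(n_shards "$N_SUBJECTS" "$N_SUBJECTS_PER_SHARD"), workers: $N_WORKERS"
```

Under these assumptions, the first setting gives 2 shards for 8 workers, so most workers would sit idle, while the second gives 20 shards. Very small shards add per-shard overhead, so a value that produces somewhat more shards than workers is a reasonable starting point.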
