A MEDS PyTorch Dataset, leveraging an on-the-fly retrieval strategy for flexible, efficient data loading.

MEDS-torch: Advanced Machine Learning for Electronic Health Records

🚀 Quick Start

Installation

pip install meds-torch

Set up environment variables

# Define data paths
PATHS_KWARGS="paths.data_dir=/CACHED/NESTED/RAGGED/TENSORS/DIR paths.meds_cohort_dir=/PATH/TO/MEDS/DATA/ paths.output_dir=/OUTPUT/RESULTS/DIRECTORY"

# Define task parameters (for supervised learning)
TASK_KWARGS="data.task_name=NAME_OF_TASK data.task_root_dir=/PATH/TO/TASK/LABELS/"
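
Since these are ordinary Hydra overrides, the fully expanded form of the first supervised training command below is simply:

meds-torch-train trainer=gpu \
  paths.data_dir=/CACHED/NESTED/RAGGED/TENSORS/DIR \
  paths.meds_cohort_dir=/PATH/TO/MEDS/DATA/ \
  paths.output_dir=/OUTPUT/RESULTS/DIRECTORY \
  data.task_name=NAME_OF_TASK \
  data.task_root_dir=/PATH/TO/TASK/LABELS/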

Basic Usage

  1. Train a supervised model (GPU)
meds-torch-train trainer=gpu $PATHS_KWARGS $TASK_KWARGS
  2. Pretrain an autoregressive forecasting model (GPU)
meds-torch-train trainer=gpu $PATHS_KWARGS model=eic_forecasting
  3. Train with a specific experiment configuration
meds-torch-train experiment=experiment.yaml $PATHS_KWARGS $TASK_KWARGS hydra.searchpath=[pkg://meds_torch.configs,/PATH/TO/CUSTOM/CONFIGS]
  4. Override parameters
meds-torch-train trainer.max_epochs=20 data.batch_size=64 $PATHS_KWARGS $TASK_KWARGS
  5. Hyperparameter search
meds-torch-tune trainer=ray callbacks=tune_default hparams_search=ray_tune experiment=triplet_mtr $PATHS_KWARGS $TASK_KWARGS hydra.searchpath=[pkg://meds_torch.configs,/PATH/TO/CUSTOM/CONFIGS/WITH/experiment/triplet_mtr]

Advanced Examples

For detailed examples and tutorials:

  • Check MIMICIV_INDUCTIVE_EXPERIMENTS/README.md for a comprehensive guide to using MEDS-torch with MIMIC-IV data, including data preparation, task extraction, and running experiments with different tokenization and transfer learning methods.
  • See ZERO_SHOT_TUTORIAL/README.md for a work-in-progress walkthrough of zero-shot prediction (feedback on improving it is welcome! 🙂)

Example Experiment Configuration

Here's a sample experiment.yaml:

# @package _global_

defaults:
  - override /data: pytorch_dataset
  - override /logger: wandb
  - override /model/backbone: triplet_transformer_encoder
  - override /model/input_encoder: triplet_encoder
  - override /model: supervised
  - override /trainer: gpu

tags: [mimiciv, triplet, transformer_encoder]

seed: 0

trainer:
  min_epochs: 1
  max_epochs: 10
  gradient_clip_val: 1.0

data:
  dataloader:
    batch_size: 64
    num_workers: 6
  max_seq_len: 128
  collate_type: triplet
  subsequence_sampling_strategy: to_end

model:
  token_dim: 128
  optimizer:
    lr: 0.001
  backbone:
    n_layers: 2
    nheads: 4
    dropout: 0

logger:
  wandb:
    tags: ${tags}
    group: mimiciv_tokenization

This configuration sets up a supervised learning experiment using a triplet transformer encoder on MIMIC-IV data. Modify this file to suit your specific needs.
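
To launch this experiment, save it under an experiment/ directory on your Hydra search path and reference it by name, as in the Basic Usage examples above (the file location in the comment below is illustrative):

# assumes the file was saved as /PATH/TO/CUSTOM/CONFIGS/experiment/experiment.yaml
meds-torch-train experiment=experiment.yaml $PATHS_KWARGS $TASK_KWARGS hydra.searchpath=[pkg://meds_torch.configs,/PATH/TO/CUSTOM/CONFIGS]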

🌟 Key Features

  • Flexible ML Pipeline: Utilizes Hydra for dynamic configuration and PyTorch Lightning for scalable training.
  • Advanced Tokenization: Supports multiple strategies for embedding EHR data (Triplet, Text Code, Everything In Code); a command-line sketch of selecting a strategy follows this list.
  • Supervised Learning: Train models on arbitrary tasks defined in MEDS format data.
  • Transfer Learning: Pretrain models using contrastive learning, forecasting, and other methods, then finetune for specific tasks.
  • Multiple Pretraining Methods: Supports EBCL, OCP, STraTS Value Forecasting, and Autoregressive Observation Forecasting.
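
As a rough sketch of switching tokenization strategies, the commands below recombine overrides shown elsewhere in this README; treating collate_type as a CLI-settable data option is an assumption based on the sample experiment config above:

# Triplet tokenization, as in the sample experiment config
meds-torch-train trainer=gpu data.collate_type=triplet $PATHS_KWARGS $TASK_KWARGS
# Everything In Code, via the autoregressive forecasting model
meds-torch-train trainer=gpu $PATHS_KWARGS model=eic_forecasting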

🛠 Installation

PyPI

pip install meds-torch

From Source

git clone git@github.com:Oufattole/meds-torch.git
cd meds-torch
pip install -e .
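
A quick sanity check for the editable install (assuming the import name meds_torch, which matches the wheel name, and the standard Hydra --help flag):

python -c "import meds_torch"
meds-torch-train --help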

📚 Documentation

For detailed usage instructions, API reference, and examples, visit our documentation.

For a comprehensive demo of our pipeline and results from a suite of inductive experiments comparing different tokenization methods and learning approaches, please refer to MIMICIV_INDUCTIVE_EXPERIMENTS/README.md, which provides detailed scripts and performance metrics.

🧪 Running Experiments

Supervised Learning

bash MIMICIV_INDUCTIVE_EXPERIMENTS/launch_supervised.sh $MIMICIV_ROOT_DIR meds-torch

Transfer Learning

# Pretraining
bash MIMICIV_INDUCTIVE_EXPERIMENTS/launch_multi_window_pretrain.sh $MIMICIV_ROOT_DIR meds-torch [METHOD]
bash MIMICIV_INDUCTIVE_EXPERIMENTS/launch_ar_pretrain.sh $MIMICIV_ROOT_DIR meds-torch [AR_METHOD]

# Finetuning
bash MIMICIV_INDUCTIVE_EXPERIMENTS/launch_finetune.sh $MIMICIV_ROOT_DIR meds-torch [METHOD]
bash MIMICIV_INDUCTIVE_EXPERIMENTS/launch_ar_finetune.sh $MIMICIV_ROOT_DIR meds-torch [AR_METHOD]

Replace [METHOD] with one of the following:

  • ocp (Observation Contrastive Pretraining)
  • ebcl (Event-Based Contrastive Learning)
  • value_forecasting (STraTS Value Forecasting)

Replace [AR_METHOD] with one of the following:

  • eic_forecasting (Everything In Code Forecasting)
  • triplet_forecasting (Triplet Forecasting)

These scripts allow you to run various experiments, including supervised learning, different pretraining methods, and finetuning for both standard and autoregressive models.
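
For example, a complete OCP transfer-learning run substitutes ocp for [METHOD] in the pretraining and finetuning scripts:

bash MIMICIV_INDUCTIVE_EXPERIMENTS/launch_multi_window_pretrain.sh $MIMICIV_ROOT_DIR meds-torch ocp
bash MIMICIV_INDUCTIVE_EXPERIMENTS/launch_finetune.sh $MIMICIV_ROOT_DIR meds-torch ocp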

📞 Support

For questions, issues, or feature requests, please open an issue on our GitHub repository.


MEDS-torch: Advancing healthcare machine learning through flexible, robust, and scalable sequence modeling tools.
