
A MEDS PyTorch Dataset, leveraging an on-the-fly retrieval strategy for flexible, efficient data loading.


Clinical Zero-Shot Labeler

A tool for adapting ACES (Automated Cohort and Event Selection) task schemas to zero-shot labeling of clinical sequences.

Overview

The Clinical Zero-Shot Labeler extends ACES task schemas, originally designed for cohort extraction and binary classification tasks, to work with generative models. This allows you to:

  1. Use existing ACES task definitions for generative tasks
  2. Control sequence generation using ACES predicates and windows
  3. Extract labels from generated sequences using ACES criteria

By leveraging the ACES schema, you can define complex clinical tasks such as:

  • ICU mortality prediction
  • Lab value forecasting
  • Readmission risk assessment

You can do all of this without modifying code or retraining models, while staying compatible with existing ACES configurations.

Installation

pip install clinical-zeroshot-labeler

Quick Start

  1. Define your task in YAML:
predicates:
  hospital_discharge:
    code: {regex: HOSPITAL_DISCHARGE//.*}
  lab:
    code: {regex: LAB//.*}
  abnormal_lab:
    code: {regex: LAB//.*}
    value_min: 2.0
    value_min_inclusive: true

trigger: hospital_discharge

windows:
  input:
    start: null
    end: trigger
    start_inclusive: true
    end_inclusive: true
    index_timestamp: end
  target:
    start: input.end
    end: start + 4d
    start_inclusive: false
    end_inclusive: true
    has:
      lab: (1, None)
    label: abnormal_lab
  2. Set up your metadata mapping:
import polars as pl

# Build a metadata table mapping each medical code to the vocabulary index your generative model emits
metadata_df = pl.DataFrame(
    {
        "code": [
            "PAD",
            "HOSPITAL_DISCHARGE//MEDICAL",
            "LAB//NORMAL",
            "LAB//HIGH",
        ]
    }
).with_row_index("code/vocab_index")
  3. Process sequences and get labels:
from clinical_zeroshot_labeler import SequenceLabeler

# Initialize labeler; task_config_yaml holds the YAML from step 1 as a string
labeler = SequenceLabeler.from_yaml_str(task_config_yaml, metadata_df, batch_size=2)

# Process tokens one at a time. `model` and `prompts` are placeholders for
# your own generative model and its starting input.
while not labeler.is_finished():
    # Get next tokens from your model
    tokens, times, values = model.generate_next_token(prompts)

    # Update labeler state
    status = labeler.process_step(tokens, times, values)
    # Status codes: 0=Undetermined, 1=Active, 2=Satisfied, 3=Impossible
    print(f"Status: {status}")

    # Update your model's prompts as needed
    prompts = tokens

# Get final labels
labels = labeler.get_labels()

See notebooks/tutorial.ipynb for an example that runs the SequenceLabeler on a mocked Generator.
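
To make the Quick Start task concrete, here is a minimal standalone sketch of one way to read the target window: decode generated vocabulary indices back to codes, keep events in the 4 days after discharge (start exclusive, end inclusive), require at least one lab, and label the sequence by whether any lab has a value of at least 2.0. It does not use the package or ACES. The names `Event`, `VOCAB`, and `label_target_window` are hypothetical, and full-string regex matching and times expressed as days since the trigger are assumptions made for illustration.

```python
import re
from typing import NamedTuple, Optional

# Same order as the Quick Start metadata table: the list position plays
# the role of "code/vocab_index" (assumption: indices start at 0).
VOCAB = ["PAD", "HOSPITAL_DISCHARGE//MEDICAL", "LAB//NORMAL", "LAB//HIGH"]


class Event(NamedTuple):
    token: int              # vocabulary index emitted by the model
    days: float             # time since the trigger (discharge), in days
    value: Optional[float]  # numeric value, if the code carries one


LAB = re.compile(r"LAB//.*")  # regex shared by the lab and abnormal_lab predicates
WINDOW_DAYS = 4.0             # target.end: start + 4d
ABNORMAL_MIN = 2.0            # abnormal_lab.value_min (inclusive)


def label_target_window(events):
    """Return True/False for abnormal_lab, or None if the window has no lab."""
    # start_inclusive: false, end_inclusive: true
    in_window = [e for e in events if 0.0 < e.days <= WINDOW_DAYS]
    labs = [e for e in in_window if LAB.fullmatch(VOCAB[e.token])]
    if not labs:  # fails `has: lab: (1, None)`, so no label is produced
        return None
    return any(e.value is not None and e.value >= ABNORMAL_MIN for e in labs)


generated = [
    Event(token=2, days=1.0, value=1.1),  # LAB//NORMAL inside the window
    Event(token=3, days=2.5, value=3.4),  # LAB//HIGH inside the window
    Event(token=3, days=6.0, value=9.0),  # after the window, ignored
]
print(label_target_window(generated))  # True
```

The library applies these criteria incrementally as tokens arrive; this sketch evaluates them once over a finished sequence only to show what the YAML fields express.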

API Reference

SequenceLabeler

Main class for processing sequences and extracting labels:

# Create from YAML string
labeler = SequenceLabeler.from_yaml_str(yaml_str, metadata_df, batch_size)

# Create from YAML file
labeler = SequenceLabeler.from_yaml_file(yaml_path, metadata_df, batch_size)

# Process tokens (returns status tensor)
status = labeler.process_step(tokens, times, values)

# Check if finished
is_done = labeler.is_finished()

# Get labels
labels = labeler.get_labels()

The labeler tracks window states for each sequence:

  • 0: Undetermined - Initial state
  • 1: Active - Currently processing
  • 2: Satisfied - Success/completion
  • 3: Impossible - Failed/invalid
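
For readable logs, the integer codes above can be wrapped in a small enum. The following is a hypothetical helper, not part of the package: `WindowStatus`, `describe`, and `summarize` are illustrative names, and the input is assumed to be a sequence of plain integers (for example, a status tensor converted to a list).

```python
from collections import Counter
from enum import IntEnum


class WindowStatus(IntEnum):
    """Status codes as listed in this README."""
    UNDETERMINED = 0
    ACTIVE = 1
    SATISFIED = 2
    IMPOSSIBLE = 3


def describe(status_codes):
    """Map integer status codes to their names, one per sequence."""
    return [WindowStatus(int(code)).name.title() for code in status_codes]


def summarize(status_codes):
    """Count how many sequences are in each state."""
    return dict(Counter(describe(status_codes)))


print(describe([0, 1, 2, 3]))  # ['Undetermined', 'Active', 'Satisfied', 'Impossible']
print(summarize([2, 2, 3]))    # {'Satisfied': 2, 'Impossible': 1}
```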
