A MEDS PyTorch Dataset, leveraging an on-the-fly retrieval strategy for flexible, efficient data loading.

Clinical Zero-Shot Labeler

A tool for adapting ACES (Automated Cohort and Event Selection) task schemas to zero-shot labeling of clinical sequences.

Overview

The Clinical Zero-Shot Labeler extends ACES task schemas, originally designed for cohort extraction and binary classification tasks, to work with generative models. This allows you to:

  1. Use existing ACES task definitions for generative tasks
  2. Control sequence generation using ACES predicates and windows
  3. Extract labels from generated sequences using ACES criteria

By leveraging the ACES schema, you can define complex clinical tasks such as:

  • ICU mortality prediction
  • Lab value forecasting
  • Readmission risk assessment

All of this works without modifying code or retraining models, and it remains compatible with existing ACES configurations.

Installation

pip install clinical-zeroshot-labeler

Quick Start

  1. Define your task in YAML:
predicates:
  hospital_discharge:
    code: {regex: HOSPITAL_DISCHARGE//.*}
  lab:
    code: {regex: LAB//.*}
  abnormal_lab:
    code: {regex: LAB//.*}
    value_min: 2.0
    value_min_inclusive: true

trigger: hospital_discharge

windows:
  input:
    start: null  # null start: the window opens at the beginning of the record
    end: trigger
    start_inclusive: true
    end_inclusive: true
    index_timestamp: end
  target:
    start: input.end
    end: start + 4d
    start_inclusive: false
    end_inclusive: true
    has:
      lab: (1, None)
    label: abnormal_lab
  2. Set up your metadata mapping:
import polars as pl

# Map each medical code to the vocabulary index that your generative model emits
metadata_df = pl.DataFrame(
    {
        "code": [
            "PAD",
            "HOSPITAL_DISCHARGE//MEDICAL",
            "LAB//NORMAL",
            "LAB//HIGH",
        ]
    }
).with_row_index("code/vocab_index")
  3. Process sequences and get labels:
from clinical_zeroshot_labeler import SequenceLabeler

# Initialize the labeler; task_config_yaml holds the YAML task definition from step 1 as a string
labeler = SequenceLabeler.from_yaml_str(task_config_yaml, metadata_df, batch_size=2)

# Process tokens one at a time
while not labeler.is_finished():
    # Get next tokens from your model (model and the initial prompts come from your own generation setup)
    tokens, times, values = model.generate_next_token(prompts)

    # Update labeler state
    status = labeler.process_step(tokens, times, values)
    # Status codes: 0=Undetermined, 1=Active, 2=Satisfied, 3=Impossible
    print(f"Status: {status}")

    # Update your model's prompts as needed
    prompts = tokens

# Get final labels
labels = labeler.get_labels()
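
To make the task definition in step 1 concrete, here is a minimal pure-Python sketch of how its predicates and target window could be evaluated on one already-generated sequence. It is not the library's implementation. The `Event` tuple, the `matches` and `label_target_window` helpers, the choice of `re.fullmatch`, and the example events are all assumptions made for illustration. The sketch also assumes that the label is true when at least one `abnormal_lab` event falls in the window, and it only considers the first trigger event.

```python
import re
from datetime import datetime, timedelta
from typing import NamedTuple, Optional


class Event(NamedTuple):
    """Hypothetical container for one event: time, medical code, optional value."""

    time: datetime
    code: str
    value: Optional[float] = None


# Predicates copied from the YAML task definition in step 1.
PREDICATES = {
    "hospital_discharge": {"regex": r"HOSPITAL_DISCHARGE//.*"},
    "lab": {"regex": r"LAB//.*"},
    "abnormal_lab": {"regex": r"LAB//.*", "value_min": 2.0},
}


def matches(event, name):
    """Return True if the event satisfies the named predicate."""
    spec = PREDICATES[name]
    # Assumption: the regex must match the whole code string.
    if re.fullmatch(spec["regex"], event.code) is None:
        return False
    value_min = spec.get("value_min")
    if value_min is None:
        return True
    # value_min_inclusive: true, so a value equal to the bound also matches.
    return event.value is not None and event.value >= value_min


def label_target_window(events, horizon=timedelta(days=4)):
    """Return the boolean label, or None if the window constraints are not met."""
    trigger = next((e for e in events if matches(e, "hospital_discharge")), None)
    if trigger is None:
        return None
    start, end = trigger.time, trigger.time + horizon
    # start_inclusive: false and end_inclusive: true give the interval (start, end].
    window = [e for e in events if start < e.time <= end]
    # has: lab: (1, None) requires at least one lab event in the window.
    if sum(matches(e, "lab") for e in window) < 1:
        return None
    return any(matches(e, "abnormal_lab") for e in window)


t0 = datetime(2024, 1, 1, 12, 0)
events = [
    Event(t0, "HOSPITAL_DISCHARGE//MEDICAL"),
    Event(t0 + timedelta(days=1), "LAB//NORMAL", 1.0),
    Event(t0 + timedelta(days=3), "LAB//HIGH", 2.5),
    Event(t0 + timedelta(days=6), "LAB//HIGH", 9.0),  # after the 4-day window
]
print(label_target_window(events))      # True: the day-3 lab is >= 2.0
print(label_target_window(events[:2]))  # False: only a normal lab in the window
print(label_target_window(events[:1]))  # None: no lab in the window
```

The labeler in step 3 applies this kind of logic incrementally while tokens are still being generated, whereas the sketch evaluates a finished sequence in one pass.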

See notebooks/tutorial.ipynb for an example that runs the SequenceLabeler on a mocked generator.

API Reference

SequenceLabeler

Main class for processing sequences and extracting labels:

# Create from YAML string
labeler = SequenceLabeler.from_yaml_str(yaml_str, metadata_df, batch_size)

# Create from YAML file
labeler = SequenceLabeler.from_yaml_file(yaml_path, metadata_df, batch_size)

# Process tokens (returns status tensor)
status = labeler.process_step(tokens, times, values)

# Check if finished
is_done = labeler.is_finished()

# Get labels
labels = labeler.get_labels()

The labeler tracks window states for each sequence:

  • 0: Undetermined - Initial state
  • 1: Active - Currently processing
  • 2: Satisfied - Success/completion
  • 3: Impossible - Failed/invalid
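
When logging or debugging, it can help to give these integer codes names. The sketch below is one way to do that and is not part of the package: `WindowStatus`, `summarize`, and `all_resolved` are hypothetical helpers, and a plain list of integers stands in for the status tensor returned by `process_step`. Treating Satisfied and Impossible as the two resolved states is an assumption based on the descriptions above.

```python
from collections import Counter
from enum import IntEnum


class WindowStatus(IntEnum):
    """Hypothetical names for the integer status codes listed above."""

    UNDETERMINED = 0
    ACTIVE = 1
    SATISFIED = 2
    IMPOSSIBLE = 3


def summarize(status_codes):
    """Count how many sequences in a batch are in each state."""
    counts = Counter(WindowStatus(int(code)) for code in status_codes)
    return {status.name: counts.get(status, 0) for status in WindowStatus}


def all_resolved(status_codes):
    """Return True if every sequence is Satisfied or Impossible (assumed terminal)."""
    resolved = {WindowStatus.SATISFIED, WindowStatus.IMPOSSIBLE}
    return all(WindowStatus(int(code)) in resolved for code in status_codes)


batch_status = [2, 1, 3, 2]  # made-up status codes for a batch of four sequences
print(summarize(batch_status))
# {'UNDETERMINED': 0, 'ACTIVE': 1, 'SATISFIED': 2, 'IMPOSSIBLE': 1}
print(all_resolved(batch_status))  # False: one sequence is still Active
```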
