A MEDS PyTorch Dataset, leveraging an on-the-fly retrieval strategy for flexible, efficient data loading.
Clinical Zero-Shot Labeler
A tool for adapting ACES (Automated Cohort and Event Selection) task schemas to zero-shot labeling of clinical sequences.
Overview
The Clinical Zero-Shot Labeler extends ACES task schemas, originally designed for cohort extraction and binary classification tasks, to work with generative models. This allows you to:
- Use existing ACES task definitions for generative tasks
- Control sequence generation using ACES predicates and windows
- Extract labels from generated sequences using ACES criteria
By leveraging the ACES schema, you can define complex clinical tasks like:
- ICU mortality prediction
- Lab value forecasting
- Readmission risk assessment
- etc.
You can do all of this without modifying code or retraining models, while remaining compatible with existing ACES configurations.
Installation
pip install clinical-zeroshot-labeler
Quick Start
- Define your task in YAML:
predicates:
  hospital_discharge:
    code: {regex: HOSPITAL_DISCHARGE//.*}
  lab:
    code: {regex: LAB//.*}
  abnormal_lab:
    code: {regex: LAB//.*}
    value_min: 2.0
    value_min_inclusive: true

trigger: hospital_discharge

windows:
  input:
    start:
    end: trigger
    start_inclusive: true
    end_inclusive: true
    index_timestamp: end
  target:
    start: input.end
    end: start + 4d
    start_inclusive: false
    end_inclusive: true
    has:
      lab: (1, None)
    label: abnormal_lab
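To make the semantics of this task concrete, here is a minimal offline sketch that labels one already-complete toy sequence by hand. It does not use the library or ACES. It assumes that the regexes match from the start of the code string, that the `target` window covers the interval (trigger, trigger + 4 days], that `has: lab: (1, None)` means "at least one lab event in the window", and that the label is positive when at least one `abnormal_lab` event falls in the window. The helper names (`is_lab`, `is_abnormal_lab`, `in_target_window`, `label_sequence`) are hypothetical.

```python
import re
from datetime import datetime, timedelta

# Assumption: predicates match the code string from its start.
LAB_PATTERN = re.compile(r"LAB//.*")


def is_lab(code):
    """Predicate `lab`: any code matching LAB//.*"""
    return LAB_PATTERN.match(code) is not None


def is_abnormal_lab(code, value):
    """Predicate `abnormal_lab`: a lab whose value is >= 2.0 (inclusive minimum)."""
    return is_lab(code) and value is not None and value >= 2.0


def in_target_window(event_time, trigger_time, horizon=timedelta(days=4)):
    """Window `target`: start exclusive, end inclusive, 4 days after the trigger."""
    return trigger_time < event_time <= trigger_time + horizon


def label_sequence(events, trigger_time):
    """Label a list of (time, code, value) events.

    Returns None when the window has no lab event (assumed to make the
    window invalid), otherwise True/False for "any abnormal lab".
    """
    window = [(c, v) for t, c, v in events if in_target_window(t, trigger_time)]
    if sum(is_lab(c) for c, _ in window) < 1:
        return None
    return any(is_abnormal_lab(c, v) for c, v in window)


# Toy data, invented for illustration only.
discharge = datetime(2024, 1, 1, 12, 0)
events = [
    (discharge, "HOSPITAL_DISCHARGE//MEDICAL", None),
    (discharge + timedelta(days=1), "LAB//NORMAL", 1.0),
    (discharge + timedelta(days=3), "LAB//HIGH", 2.5),
    (discharge + timedelta(days=6), "LAB//HIGH", 9.0),  # outside the 4-day window
]
label = label_sequence(events, discharge)
print(label)  # True
```

The labeler applies the same kind of criteria incrementally, as tokens are generated, instead of on a finished sequence.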
- Set up your metadata mapping:
import polars as pl

# Load a metadata mapping of medical codes to vocabulary indices your generative model generates
metadata_df = pl.DataFrame(
    {
        "code": [
            "PAD",
            "HOSPITAL_DISCHARGE//MEDICAL",
            "LAB//NORMAL",
            "LAB//HIGH",
        ]
    }
).with_row_index("code/vocab_index")
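As a rough illustration of what this mapping encodes, the sketch below rebuilds it as plain dictionaries, with no polars dependency. Since `with_row_index` numbers rows from 0 by default, each code's position in the list is its vocabulary index. The names `code_to_index`, `index_to_code`, and the `generated` token list are hypothetical and only serve the example.

```python
# Same codes, in the same order, as the metadata DataFrame.
codes = ["PAD", "HOSPITAL_DISCHARGE//MEDICAL", "LAB//NORMAL", "LAB//HIGH"]

# Row position doubles as the vocabulary index (0-based).
code_to_index = {code: i for i, code in enumerate(codes)}
index_to_code = {i: code for code, i in code_to_index.items()}

# Hypothetical token ids emitted by a generative model.
generated = [2, 3, 1]
decoded = [index_to_code[i] for i in generated]
print(decoded)
# ['LAB//NORMAL', 'LAB//HIGH', 'HOSPITAL_DISCHARGE//MEDICAL']
```

The order of the `code` column matters: it has to match the vocabulary your model was trained with, otherwise generated indices decode to the wrong codes.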
- Process sequences and get labels:
from clinical_zeroshot_labeler import SequenceLabeler

# Initialize labeler
labeler = SequenceLabeler.from_yaml_str(task_config_yaml, metadata_df, batch_size=2)

# Process tokens one at a time
while not labeler.is_finished():
    # Get next tokens from your model
    tokens, times, values = model.generate_next_token(prompts)

    # Update labeler state
    status = labeler.process_step(tokens, times, values)
    print(f"Status: {status}")  # Shows 0=Undetermined, 1=Active, 2=Satisfied, 3=Impossible

    # Update your model's prompts as needed
    prompts = tokens

# Get final labels
labels = labeler.get_labels()
See notebooks/tutorial.ipynb for an example of running the SequenceLabeler on a mocked Generator.
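The loop above assumes a `model` object with a `generate_next_token` method. For experimenting without a trained model, one option is a scripted stand-in that replays pre-written steps. The sketch below is not the mock from the tutorial notebook: `ScriptedGenerator` and `has_next` are hypothetical, it uses plain lists where the real labeler most likely expects tensors, and the time units are an assumption.

```python
class ScriptedGenerator:
    """Replays pre-written (tokens, times, values) steps instead of sampling."""

    def __init__(self, steps):
        self._steps = list(steps)
        self._cursor = 0

    def has_next(self):
        """True while unplayed steps remain in the script."""
        return self._cursor < len(self._steps)

    def generate_next_token(self, prompts=None):
        """Return the next scripted step; `prompts` is accepted but ignored."""
        if not self.has_next():
            raise IndexError("script exhausted")
        step = self._steps[self._cursor]
        self._cursor += 1
        return step


# One entry per generation step, for a batch of two sequences.
# Tokens are vocabulary indices from the metadata mapping.
steps = [
    ([1, 1], [0.0, 0.0], [0.0, 0.0]),  # HOSPITAL_DISCHARGE//MEDICAL for both
    ([2, 3], [1.0, 1.0], [1.0, 2.5]),  # LAB//NORMAL vs. LAB//HIGH
]

model = ScriptedGenerator(steps)
history = []
while model.has_next():
    tokens, times, values = model.generate_next_token()
    history.append(tokens)
print(history)  # [[1, 1], [2, 3]]
```

A scripted source like this makes the labeler's behavior reproducible, which helps when checking a task configuration before wiring in a real model.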
API Reference
SequenceLabeler
Main class for processing sequences and extracting labels:
# Create from YAML string
labeler = SequenceLabeler.from_yaml_str(yaml_str, metadata_df, batch_size)
# Create from YAML file
labeler = SequenceLabeler.from_yaml_file(yaml_path, metadata_df, batch_size)
# Process tokens (returns status tensor)
status = labeler.process_step(tokens, times, values)
# Check if finished
is_done = labeler.is_finished()
# Get labels
labels = labeler.get_labels()
The labeler tracks window states for each sequence:
- 0: Undetermined - Initial state
- 1: Active - Currently processing
- 2: Satisfied - Success/completion
- 3: Impossible - Failed/invalid
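When logging or debugging, it can help to give these four integer codes names. The sketch below uses a hypothetical `WindowStatus` enum; the library may define its own equivalent, so treat the names here as illustrative. The `all_resolved` helper assumes that a sequence is resolved once its status is Satisfied or Impossible, which is a guess and not necessarily the rule `is_finished()` applies.

```python
from enum import IntEnum


class WindowStatus(IntEnum):
    """Hypothetical names for the status codes returned by process_step."""

    UNDETERMINED = 0
    ACTIVE = 1
    SATISFIED = 2
    IMPOSSIBLE = 3


def describe(status_codes):
    """Translate integer status codes into readable names."""
    return [WindowStatus(int(code)).name for code in status_codes]


def all_resolved(status_codes):
    """Assumed rule: resolved means Satisfied or Impossible for every sequence."""
    terminal = (WindowStatus.SATISFIED, WindowStatus.IMPOSSIBLE)
    return all(WindowStatus(int(code)) in terminal for code in status_codes)


names = describe([1, 2])
print(names)  # ['ACTIVE', 'SATISFIED']
print(all_resolved([2, 3]))  # True
```

If the status is a tensor, converting it with `.tolist()` before passing it to these helpers should be enough.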