
Merlin: A Vision Language Foundation Model for 3D Computed Tomography


Merlin is a 3D VLM for computed tomography that leverages both structured electronic health records (EHR) and unstructured radiology reports for pretraining.

⚡️ Installation

To install Merlin, you can simply run:

pip install merlin-vlm

For an editable installation, use the following commands to clone and install this repository.

git clone https://github.com/StanfordMIMI/Merlin.git
cd Merlin
pip install -e .
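
After either install route, it can be handy to confirm which version ended up in the active environment. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of Merlin, and it assumes the distribution is registered under the PyPI name `merlin-vlm` shown above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "merlin-vlm" is the distribution name used in the pip command above.
    found = installed_version("merlin-vlm")
    if found is None:
        print("merlin-vlm is not installed in this environment")
    else:
        print(f"merlin-vlm {found} is installed")
```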

🚀 Inference on a demo CT scan

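import os

import torch

import merlin  # needed for merlin.__file__ below
from merlin import Merlin
from merlin.data import DataLoader, download_sample_data

# Use the GPU when one is available; otherwise fall back to CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = Merlin()
model.eval()
model.to(device)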

data_dir = os.path.join(os.path.dirname(merlin.__file__), "abct_data")
cache_dir = data_dir.replace("abct_data", "abct_data_cache")

datalist = [
    {
        "image": download_sample_data(data_dir), # function returns local path to nifti file
        "text": "Lower thorax: A small low-attenuating fluid structure is noted in the right cardiophrenic angle in keeping with a tiny pericardial cyst. "
        "Liver and biliary tree: Normal. Gallbladder: Normal. Spleen: Normal. Pancreas: Normal. Adrenal glands: Normal. "
        "Kidneys and ureters: Symmetric enhancement and excretion of the bilateral kidneys, with no striated nephrogram to suggest pyelonephritis. "
        "Urothelial enhancement bilaterally, consistent with urinary tract infection. No renal/ureteral calculi. No hydronephrosis. "
        "Gastrointestinal tract: Normal. Normal gas-filled appendix. Peritoneal cavity: No free fluid. "
        "Bladder: Marked urothelial enhancement consistent with cystitis. Uterus and ovaries: Normal. "
        "Vasculature: Patent. Lymph nodes: Normal. Abdominal wall: Normal. "
        "Musculoskeletal: Degenerative change of the spine.",
    },
]

dataloader = DataLoader(
    datalist=datalist,
    cache_dir=cache_dir,
    batchsize=8,
    shuffle=True,
    num_workers=0,
)

for batch in dataloader:
    outputs = model(batch["image"].to(device), batch["text"])
    print("\n================== Output Shapes ==================")
    print(f"Contrastive image embeddings shape: {outputs[0].shape}")
    print(f"Phenotype predictions shape: {outputs[1].shape}")
    print(f"Contrastive text embeddings shape: {outputs[2].shape}")
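
One common use of paired contrastive embeddings is to score how well a scan matches a report. The sketch below illustrates cosine similarity in plain Python on toy vectors; it is not taken from the Merlin codebase, the function name `cosine_similarity` is introduced here for illustration, and it assumes the image and text embeddings have the same dimensionality.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length sequences of floats."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for one image embedding and one text embedding.
image_embedding = [1.0, 0.0, 1.0]
text_embedding = [1.0, 0.0, 0.0]
score = cosine_similarity(image_embedding, text_embedding)
print(f"{score:.4f}")  # prints 0.7071
```

With the batch above, and assuming `outputs[0]` and `outputs[2]` are 2D tensors of shape (batch, dim), something like `cosine_similarity(outputs[0][0].tolist(), outputs[2][0].tolist())` would score the first scan against the first report. For batched tensors, `torch.nn.functional.cosine_similarity` computes the same quantity directly.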

📎 Citation

If you find this repository useful for your work, please cite the original paper:

@article{blankemeier2024merlin,
  title={Merlin: A vision language foundation model for 3d computed tomography},
  author={Blankemeier, Louis and Cohen, Joseph Paul and Kumar, Ashwin and Van Veen, Dave and Gardezi, Syed Jamal Safdar and Paschali, Magdalini and Chen, Zhihong and Delbrouck, Jean-Benoit and Reis, Eduardo and Truyts, Cesar and others},
  journal={Research Square},
  pages={rs--3},
  year={2024}
}
