MOOZY: A Patient-First Foundation Model for Computational Pathology.

Maintainers

yousefkotp

MOOZY is a foundation model for computational pathology that treats the patient case, not the individual slide, as the fundamental unit of representation. It encodes one or more whole-slide images (WSIs) into a single 768-dimensional case-level embedding that captures dependencies across all slides from the same patient. Trained entirely on public data with 85.8M parameters (14x smaller than GigaPath), MOOZY outperforms larger models on classification and survival prediction tasks across diverse organs and cancer types.

Quick Start
- Environment Setup
- Using the Output
Method Overview
Training
- Scripts
- SLURM Jobs
Results
Acknowledgment
Citation
License

Quick Start

pip install moozy

Model weights download automatically on first use. No access gates, no manual downloads, no HuggingFace approval.

# Encode a patient case from pre-extracted H5 feature files
moozy encode slide_1.h5 slide_2.h5 --output case_embedding.h5

# Encode directly from raw whole-slide images
moozy encode slide_1.svs slide_2.svs --output case_embedding.h5

Or use the Python API:

from moozy.encoding import run_encoding

run_encoding(
    slide_paths=["slide_1.h5", "slide_2.h5"],
    output_path="case_embedding.h5",
)

The output H5 file contains a 768-d case-level embedding ready for downstream tasks: classification, survival prediction, or retrieval.

All encoding arguments (data, runtime, raw WSI options, mixed precision) are documented in docs/encode.md.
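When a cohort contains many patients, the Python API call shown above can be wrapped in a small loop that encodes one case at a time. The sketch below is illustrative only: `group_slides_by_case`, `encode_cases`, and the `P001_a.h5` file-naming scheme are hypothetical, and it assumes `run_encoding` needs nothing beyond the `slide_paths` and `output_path` arguments used in the Quick Start.

```python
from collections import defaultdict
from pathlib import Path


def group_slides_by_case(slide_paths, case_id_of=None):
    """Group slide files into cases.

    By default the case ID is the file-name prefix before the first
    underscore (a hypothetical naming scheme, e.g. "P001_a.h5" -> "P001").
    """
    if case_id_of is None:
        case_id_of = lambda path: Path(path).name.split("_")[0]
    cases = defaultdict(list)
    for path in sorted(str(p) for p in slide_paths):
        cases[case_id_of(path)].append(path)
    return dict(cases)


def encode_cases(cases, output_dir, encode_fn=None):
    """Write one embedding file per case and return {case_id: output_path}."""
    if encode_fn is None:
        # Imported lazily so the grouping logic works without MOOZY installed.
        from moozy.encoding import run_encoding as encode_fn
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    outputs = {}
    for case_id, paths in cases.items():
        output_path = output_dir / f"{case_id}.h5"
        # All slides of one patient go into a single call, giving one embedding.
        encode_fn(slide_paths=paths, output_path=str(output_path))
        outputs[case_id] = str(output_path)
    return outputs
```

Passing every slide of a patient in one call keeps the case, rather than the slide, as the unit being encoded. The `encode_fn` parameter exists only so the loop can be dry-run with a stub before launching real encoding jobs.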

Environment Setup

conda

conda create -n moozy python=3.12 -y
conda activate moozy
pip install moozy

venv

python -m venv moozy-env
source moozy-env/bin/activate
pip install moozy

uv

uv venv moozy-env
source moozy-env/bin/activate
uv pip install moozy

Using the Output

The output is a standard H5 file. Load it with h5py:

import h5py

with h5py.File("case_embedding.h5", "r") as f:
    embedding = f["features"][:]  # (768,) float32 case-level embedding

# Use the embedding for downstream tasks
# e.g., as input to a linear probe, k-NN, MLP probe, or clustering
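As one possible downstream use, retrieval can be sketched as ranking stored case embeddings by cosine similarity to a query embedding. This is a minimal, dependency-free illustration rather than part of the MOOZY package: `cosine_similarity`, `rank_cases`, and the `gallery` dictionary are hypothetical names, and the vectors can be NumPy arrays loaded with h5py or plain lists of floats.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(float(a) * float(b) for a, b in zip(u, v))
    norm_u = math.sqrt(sum(float(a) ** 2 for a in u))
    norm_v = math.sqrt(sum(float(b) ** 2 for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # a zero vector has no direction; treat it as dissimilar
    return dot / (norm_u * norm_v)


def rank_cases(query, gallery, top_k=5):
    """Return up to top_k (case_id, similarity) pairs, most similar first.

    gallery maps a case ID to its embedding (hypothetical structure).
    """
    scored = [(case_id, cosine_similarity(query, emb)) for case_id, emb in gallery.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]
```

For large galleries a vectorized NumPy or approximate nearest-neighbor implementation would be the more practical choice; the loop above only shows the logic.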

Method Overview

MOOZY is a two-stage pipeline that first learns slide-level representations through self-supervised learning, then aligns them with clinical meaning through multi-task supervision.

Stage 1: Self-supervised slide encoder. A vision transformer learns context-aware spatial representations from 77,134 unlabeled public histopathology slides (~1.67 billion patches across 23 anatomical sites) using masked self-distillation. No labels are used. The slide encoder captures tissue morphology, spatial context, and inter-region relationships across the whole slide.

Stage 2: Patient-aware multi-task alignment. The pretrained slide encoder is fine-tuned end-to-end with a case transformer that models dependencies across all slides from the same patient. A learnable [CASE] token aggregates per-slide embeddings into a single case-level representation. Multi-task supervision across 333 tasks (205 classification, 128 survival) from 56 public datasets provides broad clinical grounding. All task heads are discarded after training, leaving a general-purpose patient encoder.
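To make the aggregation idea concrete, here is a toy sketch of a single query token pooling several per-slide embeddings into one vector. It is not MOOZY's implementation: the actual case transformer is a learned model, whereas this sketch uses one dot-product attention step with no learned projections, and `attention_pool` and its arguments are hypothetical names.

```python
import math


def attention_pool(case_token, slide_embeddings):
    """Pool per-slide embeddings with one dot-product attention step.

    case_token is the single query; each slide embedding serves as both
    key and value (no learned projections in this toy version).
    Returns (pooled_vector, attention_weights).
    """
    if not slide_embeddings:
        raise ValueError("a case needs at least one slide embedding")
    dim = len(case_token)
    # Scaled dot-product scores between the case token and every slide.
    scores = [
        sum(q * k for q, k in zip(case_token, emb)) / math.sqrt(dim)
        for emb in slide_embeddings
    ]
    # Numerically stable softmax over slides.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of slide embeddings gives one case-level vector.
    pooled = [
        sum(w * emb[i] for w, emb in zip(weights, slide_embeddings))
        for i in range(dim)
    ]
    return pooled, weights
```

With a single slide the attention weight is 1 and the pooled vector equals that slide's embedding. With several slides the output has the same dimensionality however many slides the case contains, which is what lets one embedding represent a whole patient.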

For detailed model specifications, see the model card.

Training

Both training stages are fully open-source and reproducible using only public data. All training arguments (data, model, optimization, checkpointing, logging, runtime) are documented in the Stage 1 and Stage 2 training docs.

Scripts

For local multi-GPU training, use the launch scripts in scripts/:

# Stage 1: Self-supervised pretraining
GPU_IDS=0,1,2,3,4,5,6,7 bash scripts/train_stage1.sh

# Stage 2: Multi-task alignment
GPU_IDS=0,1,2,3,4,5,6,7 bash scripts/train_stage2.sh

SLURM Jobs

SLURM job templates are provided in slurm/ for cluster environments:

Script	Description
`slurm/single_gpu.sh`	Single-GPU training
`slurm/multi_gpu.sh`	Multi-GPU training on one node
`slurm/multi_node.sh`	Multi-node distributed training
`slurm/inference.sh`	Patient encoding

Results

Frozen-feature MLP probe comparison against slide encoder baselines on eight held-out tasks.

Task	Metric	CHIEF	GigaPath	PRISM	Madeleine	TITAN	MOOZY
Residual Cancer Burden	F1	0.46	0.45	0.46	0.51	0.43	0.56
	AUC	0.60	0.55	0.58	0.63	0.58	0.74
	Bal Acc	0.44	0.40	0.43	0.48	0.38	0.51
TP53 Mutation	F1	0.82	0.76	0.85	0.84	0.87	0.87
	AUC	0.81	0.76	0.85	0.85	0.91	0.86
	Bal Acc	0.83	0.76	0.84	0.84	0.88	0.86
BAP1 Mutation	F1	0.86	0.84	0.80	0.85	0.84	0.89
	AUC	0.75	0.63	0.71	0.78	0.82	0.79
	Bal Acc	0.75	0.66	0.66	0.75	0.75	0.78
ACVR2A Mutation	F1	0.89	0.80	0.85	0.89	0.87	0.91
	AUC	0.80	0.74	0.83	0.76	0.79	0.91
	Bal Acc	0.80	0.65	0.81	0.81	0.76	0.90
Histologic Grade	F1	0.71	0.77	0.73	0.75	0.73	0.78
	AUC	0.71	0.77	0.67	0.74	0.71	0.75
	Bal Acc	0.73	0.77	0.73	0.74	0.73	0.77
KRAS Mutation	F1	0.77	0.77	0.72	0.81	0.80	0.85
	AUC	0.76	0.72	0.61	0.70	0.80	0.80
	Bal Acc	0.74	0.76	0.63	0.77	0.81	0.79
IDH Status	F1	0.92	0.94	0.91	0.92	0.94	0.97
	AUC	0.96	0.97	0.95	0.96	0.97	0.99
	Bal Acc	0.92	0.94	0.91	0.91	0.94	0.97
Treatment Response	F1	0.53	0.51	0.57	0.49	0.49	0.58
	AUC	0.70	0.68	0.69	0.59	0.60	0.68
	Bal Acc	0.48	0.40	0.51	0.35	0.37	0.48

Mean values from five-fold frozen-feature evaluation. Full results with confidence intervals are in the paper.

Averaged across all eight tasks, MOOZY improves on TITAN by +7.4% weighted F1, +5.5% AUC, and +7.8% balanced accuracy, and on PRISM by +8.8% F1, +10.7% AUC, and +9.8% balanced accuracy. These gains are relative improvements in the macro-averaged metrics, and MOOZY achieves them with 14x fewer parameters than GigaPath.
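The summary percentages can be checked against the table by macro-averaging each metric over the eight tasks and taking the relative difference. The sketch below does this for TITAN and PRISM using the two-decimal values printed above, so it is a sanity check on rounded numbers rather than a reproduction of the paper's evaluation; `SCORES` and `relative_gain` are names introduced here for illustration.

```python
# Per-task mean scores copied from the results table, in table order:
# Residual Cancer Burden, TP53, BAP1, ACVR2A, Histologic Grade, KRAS,
# IDH Status, Treatment Response.
SCORES = {
    "F1": {
        "PRISM": [0.46, 0.85, 0.80, 0.85, 0.73, 0.72, 0.91, 0.57],
        "TITAN": [0.43, 0.87, 0.84, 0.87, 0.73, 0.80, 0.94, 0.49],
        "MOOZY": [0.56, 0.87, 0.89, 0.91, 0.78, 0.85, 0.97, 0.58],
    },
    "AUC": {
        "PRISM": [0.58, 0.85, 0.71, 0.83, 0.67, 0.61, 0.95, 0.69],
        "TITAN": [0.58, 0.91, 0.82, 0.79, 0.71, 0.80, 0.97, 0.60],
        "MOOZY": [0.74, 0.86, 0.79, 0.91, 0.75, 0.80, 0.99, 0.68],
    },
    "Bal Acc": {
        "PRISM": [0.43, 0.84, 0.66, 0.81, 0.73, 0.63, 0.91, 0.51],
        "TITAN": [0.38, 0.88, 0.75, 0.76, 0.73, 0.81, 0.94, 0.37],
        "MOOZY": [0.51, 0.86, 0.78, 0.90, 0.77, 0.79, 0.97, 0.48],
    },
}


def macro_average(values):
    """Unweighted mean over tasks."""
    return sum(values) / len(values)


def relative_gain(metric, baseline, model="MOOZY"):
    """Relative improvement (in percent) of model over baseline for one metric."""
    base = macro_average(SCORES[metric][baseline])
    ours = macro_average(SCORES[metric][model])
    return 100.0 * (ours - base) / base


for baseline in ("TITAN", "PRISM"):
    gains = {metric: round(relative_gain(metric, baseline), 1) for metric in SCORES}
    print(baseline, gains)
```

Rounded to one decimal place, these relative gains agree with the percentages quoted in the text, which indicates the figures are relative rather than absolute differences.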

Acknowledgment

This work was supported by NSERC-DG RGPIN-2022-05378 [M.S.H], an Amazon Research Award [M.S.H], Gina Cody RIF [M.S.H], and an FRQNT scholarship [Y.K]. Computational resources were provided in part by Calcul Québec and the Digital Research Alliance of Canada.

Citation

If you find MOOZY useful, please cite:

@misc{kotp2026moozypatientfirstfoundationmodel,
      title={MOOZY: A Patient-First Foundation Model for Computational Pathology},
      author={Yousef Kotp and Vincent Quoc-Huy Trinh and Christopher Pal and Mahdi S. Hosseini},
      year={2026},
      eprint={2603.27048},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2603.27048},
}

License

This project is licensed under CC BY-NC-SA 4.0.

Project details

Release history

This version

0.1.0

Mar 31, 2026

Download files

Source Distribution

moozy-0.1.0.tar.gz (103.1 kB)

Uploaded Mar 31, 2026 Source

Built Distribution

moozy-0.1.0-py3-none-any.whl (125.6 kB)

Uploaded Mar 31, 2026 Python 3

File details

Details for the file moozy-0.1.0.tar.gz.

File metadata

Download URL: moozy-0.1.0.tar.gz
Upload date: Mar 31, 2026
Size: 103.1 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for moozy-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`51b30d9624af57e3857ec16c78411a562fd5a51687a9db714105f99cf6d30e41`
MD5	`ac0c8d05050ec166ea4089fe0382ff70`
BLAKE2b-256	`9f2918606f81a7a2eea3df3e0f40364d57cc492a51ad925b8a01fc28af0e6d9a`
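A downloaded archive can be checked against the SHA256 digest listed above using only the Python standard library. This is a generic sketch, not a tool shipped with MOOZY; the function names are introduced here, and the local file path is whatever location the archive was saved to.

```python
import hashlib

# SHA256 digest published above for moozy-0.1.0.tar.gz.
EXPECTED_SDIST_SHA256 = "51b30d9624af57e3857ec16c78411a562fd5a51687a9db714105f99cf6d30e41"


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Return True if the file's SHA256 equals the published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Example (assumes the sdist was saved in the current directory):
# matches_published_digest("moozy-0.1.0.tar.gz", EXPECTED_SDIST_SHA256)
```

A mismatch means the file differs from the one uploaded to PyPI, for example because of a truncated download.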

Provenance

The following attestation bundles were made for moozy-0.1.0.tar.gz:

Publisher: publish.yml on AtlasAnalyticsLab/MOOZY

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: moozy-0.1.0.tar.gz
- Subject digest: 51b30d9624af57e3857ec16c78411a562fd5a51687a9db714105f99cf6d30e41
- Sigstore transparency entry: 1202396633
- Sigstore integration time: Mar 31, 2026
Source repository:
- Permalink: AtlasAnalyticsLab/MOOZY@4ff06267b60a3c994136f9252c903ad2b4e630fd
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/AtlasAnalyticsLab
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@4ff06267b60a3c994136f9252c903ad2b4e630fd
- Trigger Event: release

File details

Details for the file moozy-0.1.0-py3-none-any.whl.

File metadata

Download URL: moozy-0.1.0-py3-none-any.whl
Upload date: Mar 31, 2026
Size: 125.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for moozy-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`013aeebe99bfb48ab0f60e8e9be071c475ad2f34a5e252221f8723b5d5dd57a5`
MD5	`a4b668d88c5c382856245c660d011239`
BLAKE2b-256	`ed8711f69ca60d94e6b369802be0339509e9bbe31002947379644e31d5f7f3fd`

Provenance

The following attestation bundles were made for moozy-0.1.0-py3-none-any.whl:

Publisher: publish.yml on AtlasAnalyticsLab/MOOZY

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: moozy-0.1.0-py3-none-any.whl
- Subject digest: 013aeebe99bfb48ab0f60e8e9be071c475ad2f34a5e252221f8723b5d5dd57a5
- Sigstore transparency entry: 1202396634
- Sigstore integration time: Mar 31, 2026
Source repository:
- Permalink: AtlasAnalyticsLab/MOOZY@4ff06267b60a3c994136f9252c903ad2b4e630fd
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/AtlasAnalyticsLab
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@4ff06267b60a3c994136f9252c903ad2b4e630fd
- Trigger Event: release
