COMPASS: Generalizable AI predicts immunotherapy outcomes across cancers and treatments.
COMPASS Reproducibility (Pretraining from Scratch & Downstream Fine-Tuning)
This branch provides a fully reproducible pipeline for the COMPASS model, including pretraining from scratch and fine-tuning on downstream immunotherapy response datasets.
🧩 Step 1: Download Pretraining Datasets
TCGA Dataset
This dataset contains preprocessed TCGA transcriptomic profiles used for COMPASS pretraining. To facilitate reproducibility and efficient execution, we provide an immune-focused subset of 2,475 genes, which is sufficient to run all pretraining scripts in this repository.
After downloading the dataset from Figshare, please organize the files under the data/ directory with the following structure:
data/
└── TCGA/
    ├── GENE.TABLE
    ├── TCGA.PATIENT.PROCESSED.TABLE
    ├── TCGA.PATIENT.TABLE
    └── TCGA.TPM.TABLE
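Before launching pretraining, it can help to confirm that the layout above is in place. The following is a minimal sketch using only the standard library; the helper name `missing_tcga_files` and the default `data` root are illustrative assumptions, not part of the COMPASS package.

```python
from pathlib import Path

# File names taken from the directory layout shown above.
EXPECTED_TCGA_FILES = (
    "GENE.TABLE",
    "TCGA.PATIENT.PROCESSED.TABLE",
    "TCGA.PATIENT.TABLE",
    "TCGA.TPM.TABLE",
)


def missing_tcga_files(data_root="data"):
    """Return the expected TCGA files that are absent under <data_root>/TCGA.

    Hypothetical helper for illustration; it only checks that the files
    exist, not that their contents are valid.
    """
    tcga_dir = Path(data_root) / "TCGA"
    return [name for name in EXPECTED_TCGA_FILES if not (tcga_dir / name).is_file()]
```

Calling `missing_tcga_files("data")` returns an empty list when all four files are present, and otherwise lists the names that still need to be downloaded or moved.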
ITRP Dataset (Alternative / Downstream Fine-Tuning)
The ITRP.zip archive contains two serialized pandas tables:
- ITRP.TPM.TABLE: gene-level RNA-seq TPM matrix
- ITRP.PATIENT.TABLE: patient metadata (cancer type, therapy, response labels)
This dataset integrates 1,133 patients from 16 immunotherapy cohorts, all standardized using the COMPASS preprocessing pipeline.
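As a rough sketch of how the two ITRP tables might be loaded: the code below assumes that "serialized pandas tables" means pickled pandas objects readable with `pandas.read_pickle`, that the archive has been extracted into a directory of your choosing, and that both tables are indexed by sample ID. None of these details are confirmed here, and the helper names are hypothetical. Pickle files should only be loaded from sources you trust.

```python
from pathlib import Path

import pandas as pd


def load_itrp(itrp_dir):
    """Load the two ITRP tables from an extracted ITRP.zip directory.

    Assumption: each file is a pickled pandas object.
    """
    itrp_dir = Path(itrp_dir)
    tpm = pd.read_pickle(itrp_dir / "ITRP.TPM.TABLE")
    patients = pd.read_pickle(itrp_dir / "ITRP.PATIENT.TABLE")
    return tpm, patients


def shared_samples(tpm, patients):
    """Return the index labels present in both tables.

    Assumption: both tables use sample IDs as their row index.
    """
    return tpm.index.intersection(patients.index)
```

Comparing `len(shared_samples(tpm, patients))` with the number of rows in each table is a quick way to spot a mismatch between the expression matrix and the metadata before fine-tuning.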
Reproducing Datasets from Raw Data (Optional)
If you prefer to regenerate the datasets from raw sources, please refer to:
- TCGA preprocessing pipeline: https://github.com/mims-harvard/COMPASS-web/tree/main/TCGA_dataset_processing
- ITRP mRNA pipeline: https://github.com/mims-harvard/COMPASS-web/tree/main/mRNA_pipeline
🧠 Step 2: Install COMPASS
# IMPORTANT:
# If you are pretraining COMPASS from scratch,
# you MUST use this specific version
pip install immuno-compass==2.0.4
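Because pretraining from scratch depends on this exact version, a quick check of the installed distribution can catch a mismatched environment early. The sketch below uses the standard-library `importlib.metadata` module (Python 3.8+); the helper names are illustrative and not part of the package.

```python
from importlib.metadata import PackageNotFoundError, version

REQUIRED_VERSION = "2.0.4"  # version pinned in the install command above


def installed_version(dist_name="immuno-compass"):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def is_pinned_version(dist_name="immuno-compass", required=REQUIRED_VERSION):
    """True only when the installed version equals the required one."""
    return installed_version(dist_name) == required
```

If `is_pinned_version()` returns `False`, `installed_version()` shows which version (if any) is currently present in the environment.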
⚙️ Step 3: Run Pretraining from Scratch
Go to the run_scripts folder, then open and execute the following notebook:
01_pretraining.ipynb
Note: The example notebook uses the TCGA-2475 gene subset for faster execution and reduced GPU memory usage.
🔬 Step 4: Run Downstream Fine-Tuning
You can either run the notebooks interactively or execute them sequentially via scripts.
Below is an example using nbconvert (tested on a V100 GPU):
jupyter nbconvert --to notebook --execute 01_loco_nft.ipynb --output 01_loco_nft.ipynb
jupyter nbconvert --to notebook --execute 02_loco_lft.ipynb --output 02_loco_lft.ipynb
jupyter nbconvert --to notebook --execute 03_loco_pft.ipynb --output 03_loco_pft.ipynb
jupyter nbconvert --to notebook --execute 04_loco_fft.ipynb --output 04_loco_fft.ipynb
jupyter nbconvert --to notebook --execute 05_loco_lgr.ipynb --output 05_loco_lgr.ipynb
jupyter nbconvert --to notebook --execute 06_analysis_loco.ipynb --output 06_analysis_loco.ipynb
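The same sequence can also be driven from Python, which makes it easy to stop at the first failing notebook. This is a minimal sketch that wraps the nbconvert commands above with `subprocess.run`; the function names and the injectable `runner` parameter are illustrative assumptions, and the script is expected to be launched from the folder that contains the notebooks.

```python
import subprocess

# Notebook order copied from the nbconvert commands above.
NOTEBOOKS = [
    "01_loco_nft.ipynb",
    "02_loco_lft.ipynb",
    "03_loco_pft.ipynb",
    "04_loco_fft.ipynb",
    "05_loco_lgr.ipynb",
    "06_analysis_loco.ipynb",
]


def nbconvert_command(notebook):
    """Build the nbconvert call that executes one notebook and overwrites it."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute", notebook,
        "--output", notebook,
    ]


def run_all(notebooks=NOTEBOOKS, runner=subprocess.run):
    """Execute the notebooks in order.

    check=True makes subprocess.run raise CalledProcessError on a non-zero
    exit code, so later notebooks do not run after a failure.
    """
    for notebook in notebooks:
        runner(nbconvert_command(notebook), check=True)
```

Calling `run_all()` runs all six notebooks in sequence. Passing a shorter list, for example `run_all(NOTEBOOKS[:2])`, limits the run to a subset.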
📌 Notes
- This repository is intended for methodological reproducibility, not for matching a single reported checkpoint.
- For close reproducibility, fix random seeds, use the same weight initialization, and document the GPU and PyTorch versions.
- Minor numerical differences may still occur due to hardware setup, GPU configuration, CUDA version, random seeds, or weight initialization.
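The advice in these notes on fixing seeds and recording the environment could be sketched as follows. This is an illustrative helper, not the mechanism used inside the COMPASS notebooks: the function names and the default seed of 42 are assumptions, and NumPy and PyTorch are seeded only if they are installed.

```python
import platform
import random


def seed_everything(seed=42):
    """Seed Python's RNG, plus NumPy and PyTorch when they are importable."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # silently ignored without CUDA
        # Prefer deterministic cuDNN kernels over autotuned ones.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass


def environment_report():
    """Collect version details worth recording next to any results."""
    report = {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "torch": None,
    }
    try:
        import torch
        report["torch"] = torch.__version__
        report["cuda"] = torch.version.cuda
        report["gpu"] = (
            torch.cuda.get_device_name(0) if torch.cuda.is_available() else None
        )
    except ImportError:
        pass
    return report
```

Calling `seed_everything()` at the top of a notebook and saving the output of `environment_report()` alongside the results covers both recommendations. Even with fixed seeds, some GPU operations remain non-deterministic, so small differences between runs are still possible.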