Spatial heterogeneity profiling of immune checkpoints in spatial transcriptomics
SpatialCheckpoint
Spatial heterogeneity profiling of immune checkpoints in spatial transcriptomics data.
SpatialCheckpoint is a bioinformatics pipeline that integrates spatial gene expression profiling, consensus clustering, ensemble ML classification, SHAP interpretability, and clinical survival analysis to characterize immune checkpoint heterogeneity across the tumor microenvironment.
Installation
pip install spatialcheckpoint
Requirements: Python ≥ 3.10
CLI
# Run the built-in demo (no data files needed)
spatialcheckpoint demo
# Download a registered dataset
spatialcheckpoint download BRCA_visium_10x
# Download all BRCA datasets
spatialcheckpoint download all --cancer-type BRCA
# Preprocess raw Visium output or H5AD
spatialcheckpoint preprocess path/to/spaceranger/ data/processed/
spatialcheckpoint preprocess sample.h5ad data/processed/
# Run full spatial analysis on a preprocessed sample
spatialcheckpoint analyze sample01
# Discover archetypes from a feature matrix CSV
spatialcheckpoint discover results/sample01/features.csv --k-min 2 --k-max 8
# Train the archetype classifier
spatialcheckpoint classify features.csv archetype_labels.csv --model-dir models/
# Generate publication figures
spatialcheckpoint figures --results-dir results/ --output-dir paper/figures/
Usage
Gene Panel
import spatialcheckpoint as scp
# 44 checkpoint genes across 6 functional categories
genes = scp.get_all_checkpoint_genes()
# Genes by category
inhibitory_receptors = scp.get_category_genes("co_inhibitory_receptors")  # PDCD1, CTLA4, LAG3, ...
novel = scp.get_category_genes("novel_checkpoints")
cell_markers = scp.get_immune_cell_markers() # {cell_type: [genes]}
lr_pairs = scp.get_ligand_receptor_pairs() # [{ligand, receptor, alias}]
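The `[{ligand, receptor, alias}]` shape noted above lends itself to a simple per-spot co-occurrence score. The sketch below is illustrative only: the pair list, the expression matrix, the threshold, and the `co_occurrence` helper are hypothetical and are not taken from the package, whose actual co-localization metric is not documented here.

```python
import numpy as np

# Hypothetical pair list mirroring the documented {ligand, receptor, alias} shape.
lr_pairs = [
    {"ligand": "CD274", "receptor": "PDCD1", "alias": "PD-L1/PD-1"},
    {"ligand": "CD47", "receptor": "SIRPA", "alias": "CD47/SIRPa"},
]

def co_occurrence(expr, gene_names, ligand, receptor, threshold=0.0):
    """Fraction of spots where ligand and receptor both exceed `threshold`."""
    idx = {g: i for i, g in enumerate(gene_names)}
    lig_on = expr[:, idx[ligand]] > threshold
    rec_on = expr[:, idx[receptor]] > threshold
    return float(np.mean(lig_on & rec_on))

# Toy expression matrix: rows are spots, columns follow gene_names.
gene_names = ["CD274", "PDCD1", "CD47", "SIRPA"]
expr = np.array([
    [1.0, 2.0, 0.0, 0.0],
    [3.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 2.0, 0.0],
    [2.0, 1.0, 4.0, 3.0],
])

scores = {
    p["alias"]: co_occurrence(expr, gene_names, p["ligand"], p["receptor"])
    for p in lr_pairs
}
```

A spatially aware variant would also count a ligand in one spot and its receptor in a neighbouring spot; this version only checks same-spot co-expression.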
Data Preprocessing
# From Space Ranger output directory
preprocessor = scp.SpatialDataPreprocessor(spaceranger_out_path="path/to/spaceranger/output")
adata = preprocessor.load_visium()
adata = preprocessor.quality_control(adata, min_genes=200, max_mt_pct=25.0)
adata = preprocessor.normalize(adata)
adata.write_h5ad("data/processed/sample01_preprocessed.h5ad")
# Or from an existing H5AD
preprocessor = scp.SpatialDataPreprocessor(h5_path="existing_data.h5ad")
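To make the `min_genes` and `max_mt_pct` thresholds concrete, here is a minimal NumPy sketch of the kind of filtering and normalization these parameters usually imply. It assumes that spots are dropped when they have too few detected genes or too high a mitochondrial fraction, that mitochondrial genes carry the human `MT-` prefix, and that normalization is total-count scaling followed by `log1p`. The helper names and the `target_sum` value are hypothetical; the package's internals may differ.

```python
import numpy as np

def qc_mask(counts, gene_names, min_genes=200, max_mt_pct=25.0):
    """Boolean mask of spots that pass both QC thresholds."""
    counts = np.asarray(counts, dtype=float)
    is_mt = np.array([g.upper().startswith("MT-") for g in gene_names])
    n_genes = (counts > 0).sum(axis=1)              # detected genes per spot
    total = counts.sum(axis=1)                      # total counts per spot
    mt_pct = 100.0 * counts[:, is_mt].sum(axis=1) / np.maximum(total, 1.0)
    return (n_genes >= min_genes) & (mt_pct <= max_mt_pct)

def normalize_log1p(counts, target_sum=1e4):
    """Scale each spot to `target_sum` total counts, then apply log1p."""
    counts = np.asarray(counts, dtype=float)
    total = counts.sum(axis=1, keepdims=True)
    return np.log1p(counts / np.maximum(total, 1.0) * target_sum)

# Toy data with thresholds lowered to suit a four-gene panel.
gene_names = ["MT-CO1", "PDCD1", "CD274", "CTLA4"]
counts = np.array([
    [1, 5, 3, 1],   # 10% mitochondrial, 4 genes detected: kept
    [6, 2, 2, 0],   # 60% mitochondrial: dropped
    [0, 4, 0, 0],   # only 1 gene detected: dropped
])
mask = qc_mask(counts, gene_names, min_genes=2, max_mt_pct=25.0)
normed = normalize_log1p(counts[mask])
```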
Spatial Profiling & Feature Extraction
genes = scp.get_all_checkpoint_genes()
# Region-based expression (tumor_core, invasive_margin, stroma, …)
profiler = scp.SpatialCheckpointProfiler(adata, genes)
region_expr = profiler.expression_by_region()
hotspots = profiler.checkpoint_hotspot_detection() # Moran's I per gene
# 80+ spatial features per slide
engineer = scp.SpatialFeatureEngineer(adata, genes)
features = engineer.extract_all_features(sample_id="sample01")
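For readers unfamiliar with Moran's I, the statistic reported per gene by the hotspot step, the sketch below computes it from scratch. It assumes binary distance-band weights (spots within `radius` of each other are neighbours); the weighting scheme used by `checkpoint_hotspot_detection()` is not stated here, so treat `morans_i` and its `radius` parameter as hypothetical.

```python
import numpy as np

def morans_i(values, coords, radius=1.5):
    """Global Moran's I with binary weights for spots within `radius`."""
    values = np.asarray(values, dtype=float)
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    w = ((dist > 0) & (dist <= radius)).astype(float)  # no self-neighbours
    z = values - values.mean()
    denom = (z ** 2).sum()
    if w.sum() == 0 or denom == 0:
        return float("nan")  # undefined without neighbours or variance
    return float(len(values) / w.sum() * (z @ w @ z) / denom)

# 4 x 4 grid; radius 1.1 keeps only horizontal and vertical neighbours.
gx, gy = np.meshgrid(np.arange(4), np.arange(4))
coords = np.column_stack([gx.ravel(), gy.ravel()]).astype(float)

clustered = (coords[:, 0] < 2).astype(float)        # left half on, right half off
checker = ((coords[:, 0] + coords[:, 1]) % 2)       # alternating pattern

i_clustered = morans_i(clustered, coords, radius=1.1)  # positive, about 0.67
i_checker = morans_i(checker, coords, radius=1.1)      # -1.0, perfect alternation
```

Values near +1 indicate spatially clustered expression, values near 0 indicate no spatial structure, and negative values indicate a dispersed pattern.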
Archetype Discovery
# feature_matrix: DataFrame (n_samples × n_features)
# sample_metadata: DataFrame with 'cancer_type' column, same index
discovery = scp.SpatialArchetypeDiscovery(feature_matrix, sample_metadata)
cc = discovery.consensus_clustering(k_range=(2, 8), n_iterations=100)
labels = cc["labels"]
char = discovery.characterize_archetypes(labels)
nmf = discovery.run_nmf(k=cc["optimal_k"])
# nmf["W"] → (n_samples, k) soft membership weights
# nmf["H"] → (k, n_features) archetype profiles
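The idea behind consensus clustering can be sketched in a few lines: cluster many random subsamples, then record how often each pair of samples lands in the same cluster. The code below is a rough illustration under stated assumptions, not the package's implementation. `kmeans_labels` is a minimal stand-in for a library k-means, the 80% subsampling rate is an assumption, and the delta-area rule for choosing k is not shown.

```python
import numpy as np

def kmeans_labels(X, k, rng, n_iter=25):
    """Minimal Lloyd's k-means; a stand-in for a library implementation."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):          # leave empty clusters where they are
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def consensus_matrix(X, k, n_iterations=50, subsample=0.8, seed=0):
    """Fraction of co-sampled runs in which each pair shares a cluster."""
    rng = np.random.default_rng(seed)
    n = len(X)
    together = np.zeros((n, n))
    sampled = np.zeros((n, n))
    m = max(k, int(round(subsample * n)))
    for _ in range(n_iterations):
        idx = rng.choice(n, size=m, replace=False)
        labels = kmeans_labels(X[idx], k, rng)
        same = (labels[:, None] == labels[None, :]).astype(float)
        together[np.ix_(idx, idx)] += same
        sampled[np.ix_(idx, idx)] += 1.0
    return np.divide(together, sampled, out=np.zeros_like(together), where=sampled > 0)

# Two well-separated synthetic groups of 10 samples each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, (10, 3)), rng.normal(5.0, 0.1, (10, 3))])
C = consensus_matrix(X, k=2)
```

A consensus matrix close to a block structure of zeros and ones, as in this toy case, suggests a stable choice of k; intermediate values indicate unstable assignments.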
Classifier Training & SHAP
trainer = scp.ArchetypeModelTrainer(
    feature_matrix=feature_matrix,
    archetype_labels=labels,
    output_dir="models/",
)
results = trainer.run(n_optuna_trials=30)
explainer = scp.ArchetypeExplainer(results["model"], feature_matrix)
shap_df = explainer.global_feature_importance()
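SHAP attributions are built on Shapley values, and a global importance score is commonly taken as the mean absolute attribution per feature. The sketch below computes exact Shapley values by enumerating feature subsets against a single baseline point, which is only feasible for a handful of features; SHAP libraries rely on faster approximations. The toy `predict` function and the `shapley_values` helper are hypothetical and do not reproduce `ArchetypeExplainer`.

```python
from itertools import combinations
from math import factorial

import numpy as np

def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline)."""
    n = len(x)
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                idx = np.array(subset, dtype=int)
                without_i = baseline.copy()
                without_i[idx] = x[idx]       # features in the coalition take x's values
                with_i = without_i.copy()
                with_i[i] = x[i]              # add feature i to the coalition
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi

def predict(v):
    """Toy model: two linear terms plus one interaction."""
    return 2.0 * v[0] - 1.0 * v[1] + 0.5 * v[0] * v[2]

baseline = np.zeros(3)
X = np.array([[1.0, 2.0, 4.0], [2.0, 0.0, 1.0]])

per_sample = np.array([shapley_values(predict, row, baseline) for row in X])
global_importance = np.abs(per_sample).mean(axis=0)   # mean |Shapley value| per feature
```

For the first row the interaction term contributes 2.0, which is split evenly between features 0 and 2, and the attributions sum to the difference between the prediction and the baseline prediction.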
Demo
Runs entirely on synthetic data — no Visium files required.
import numpy as np
import pandas as pd
import scanpy as sc
import spatialcheckpoint as scp
print(f"SpatialCheckpoint v{scp.__version__}")
# ── Gene panel ───────────────────────────────────────────────────────────────
genes = scp.get_all_checkpoint_genes()
print(f"Checkpoint panel : {len(genes)} genes")
print(f"Co-inhibitory receptors : {scp.get_category_genes('co_inhibitory_receptors')}")
lr = scp.get_ligand_receptor_pairs()
print(f"LR pairs ({len(lr)}) e.g. {lr[0]}")
# ── Synthetic Visium slide ───────────────────────────────────────────────────
rng = np.random.default_rng(42)
cp8 = genes[:8]
dummy_genes = [f"GENE{i:04d}" for i in range(92)] + cp8
X = rng.negative_binomial(2, 0.5, size=(200, 100)).astype(float)
adata = sc.AnnData(X=X)
adata.var_names = pd.Index(dummy_genes)
gx, gy = np.meshgrid(np.arange(20), np.arange(10))
coords = np.column_stack([gx.ravel(), gy.ravel()]).astype(float)
coords += rng.uniform(-0.1, 0.1, size=coords.shape)
adata.obsm["spatial"] = coords
region_map = []
for x, y in coords:
    if x < 5 and y < 5: region_map.append("tumor_core")
    elif x < 10 and y < 8: region_map.append("invasive_margin")
    elif x >= 15: region_map.append("immune_enriched")
    elif y >= 8: region_map.append("necrotic")
    else: region_map.append("stroma")
adata.obs["region_type"] = pd.Categorical(
    region_map,
    categories=["tumor_core", "invasive_margin", "stroma", "immune_enriched", "necrotic"],
)
# ── Spatial feature extraction ───────────────────────────────────────────────
engineer = scp.SpatialFeatureEngineer(adata, cp8)
features = engineer.extract_all_features(sample_id="demo")
print(f"\nFeature matrix : {features.shape[1]} features extracted")
# ── Multi-sample archetype discovery ────────────────────────────────────────
feat_mat = pd.DataFrame(
    rng.standard_normal((30, features.shape[1])),
    index=[f"sample_{i:03d}" for i in range(30)],
    columns=features.columns,
)
meta = pd.DataFrame(
    {"cancer_type": rng.choice(["BRCA", "CRC", "NSCLC"], 30)},
    index=feat_mat.index,
)
discovery = scp.SpatialArchetypeDiscovery(feat_mat, meta)
cc = discovery.consensus_clustering(k_range=(2, 4), n_iterations=20)
print(f"\nOptimal k : {cc['optimal_k']}")
char = discovery.characterize_archetypes(cc["labels"])
print("\nArchetype summary:")
print(char[["archetype_name","n_samples"]].to_string())
# ── NMF soft membership ──────────────────────────────────────────────────────
nmf = discovery.run_nmf(k=cc["optimal_k"])
print(f"\nNMF explained variance : {nmf['explained_variance']:.3f}")
print("Membership weights (first 3 samples):")
print(nmf["W"].head(3).round(3).to_string())
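The `W` and `H` matrices printed above follow the usual NMF factorization V ≈ W·H. As a rough illustration of how such a factorization can be obtained, the sketch below uses the classic multiplicative update rules on a synthetic non-negative matrix. The `nmf` function here is a hypothetical stand-in, not the solver behind `run_nmf`, and how the package makes its features non-negative beforehand is not covered.

```python
import numpy as np

def nmf(V, k, n_iter=200, seed=0, eps=1e-9):
    """Factor non-negative V (n x m) into W (n x k) and H (k x m)."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep every entry non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def rel_error(V, W, H):
    """Relative Frobenius reconstruction error."""
    return float(np.linalg.norm(V - W @ H) / np.linalg.norm(V))

# Synthetic rank-2 matrix standing in for a (samples x features) table.
rng = np.random.default_rng(1)
V = rng.random((30, 2)) @ rng.random((2, 6))

W0, H0 = nmf(V, k=2, n_iter=0)    # random initialization only
W, H = nmf(V, k=2)                # after 200 update steps

# Row-normalizing W gives soft memberships that sum to 1 per sample.
membership = W / W.sum(axis=1, keepdims=True)
```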
Gene Panel
| Category | Genes (examples) |
|---|---|
| Co-inhibitory receptors | PDCD1 (PD-1), CTLA4, LAG3, HAVCR2 (TIM-3), TIGIT |
| Co-inhibitory ligands | CD274 (PD-L1), PDCD1LG2 (PD-L2), LGALS9 |
| Novel checkpoints | VSIR (VISTA), CD276 (B7-H3), VTCN1 (B7-H4) |
| Innate checkpoints | CD47, SIRPA, LILRB1, LILRB2 |
| Immune enzymes | IDO1, ENTPD1 (CD39), NT5E (CD73), ARG1 |
| Co-stimulatory reference | CD28, ICOS, TNFRSF4 (OX40), TNFRSF9 (4-1BB) |
Archetypes
| Archetype | Spatial signature |
|---|---|
| Checkpoint-Hot | High checkpoint + high immune + co-localized |
| Checkpoint-Cold | Low checkpoint + low immune infiltration |
| Checkpoint-Excluded | Checkpoint at margin, immune at periphery |
| Checkpoint-Mismatch | Checkpoint and immune spatially separated |
| Innate-Dominant | CD47/SIRPα axis dominant |
| Novel-Enriched | VISTA / B7-H3 / B7-H4 enriched |
Pipeline Architecture
Raw Visium data (Space Ranger dir or H5AD)
→ SpatialDataPreprocessor: QC, normalize → 'counts' / 'log1p' layers
→ SpatialCheckpointProfiler: region expression (tumor_core, invasive_margin, stroma, immune_enriched, necrotic)
→ SpatialFeatureEngineer: 80+ features (co-localization, gradients, Moran's I, region expression ratios)
→ SpatialArchetypeDiscovery: consensus KMeans + delta-area k-selection + NMF
→ ArchetypeModelTrainer: LightGBM + XGBoost + MLP + RF ensemble, SMOTE, RFECV, Optuna HPO
→ ArchetypeExplainer: SHAP global / per-class feature importance
→ ClinicalAssociationAnalyzer: KM curves, Cox PH, logistic regression (OS/PFS)
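The final stage mentions Kaplan-Meier (KM) curves. As a reminder of what that estimator computes, here is a small standard-library sketch of the product-limit formula applied per archetype. The follow-up times, event flags, and group names are made-up toy values, and `kaplan_meier` is a hypothetical helper; it is not the `ClinicalAssociationAnalyzer` API. In practice a dedicated survival library would normally be used.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time.

    times  : follow-up time per patient
    events : 1 if the event was observed, 0 if censored
    """
    event_times = sorted({t for t, e in zip(times, events) if e})
    surv, curve = 1.0, []
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        surv *= 1.0 - deaths / at_risk       # product-limit step
        curve.append((t, surv))
    return curve

# Toy cohorts keyed by (hypothetical) archetype label.
cohorts = {
    "Checkpoint-Hot": ([2, 3, 3, 5, 8], [1, 1, 0, 1, 0]),
    "Checkpoint-Cold": ([1, 1, 2, 4], [1, 1, 1, 0]),
}
curves = {name: kaplan_meier(t, e) for name, (t, e) in cohorts.items()}
```

Censored patients reduce the number at risk at later times without producing a step in the curve, which is why the first cohort has steps only at times 2, 3, and 5.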
Output Files
| Path | Contents |
|---|---|
| `results/{sample_id}/features.csv` | 80+ spatial features |
| `results/{sample_id}/region_expression.csv` | Region × gene expression |
| `results/{sample_id}/hotspots.csv` | Moran's I per gene |
| `results/{sample_id}/colocalization.csv` | Ligand-receptor co-occurrence |
| `results/archetypes/archetype_labels.csv` | Sample → archetype assignment |
| `results/archetypes/nmf_W.csv`, `nmf_H.csv` | NMF basis / coefficient matrices |
| `models/archetype_classifier.joblib` | Serialized ensemble model |
| `paper/figures/` | Publication-ready PDF/PNG plots |
Citation
@article{spatialcheckpoint2025,
  title   = {SpatialCheckpoint: Spatial heterogeneity profiling of immune checkpoints in spatial transcriptomics},
  author  = {},
  journal = {},
  year    = {2025},
}
License
MIT