SpatialCheckpoint

Spatial heterogeneity profiling of immune checkpoints in spatial transcriptomics data.

SpatialCheckpoint is a bioinformatics pipeline that integrates spatial gene expression profiling, consensus clustering, ensemble ML classification, SHAP interpretability, and clinical survival analysis to characterize immune checkpoint heterogeneity across the tumor microenvironment.


Installation

pip install spatialcheckpoint

Requirements: Python ≥ 3.10


CLI

# Run the built-in demo (no data files needed)
spatialcheckpoint demo

# Download a registered dataset
spatialcheckpoint download BRCA_visium_10x

# Download all BRCA datasets
spatialcheckpoint download all --cancer-type BRCA

# Preprocess raw Visium output or H5AD
spatialcheckpoint preprocess path/to/spaceranger/  data/processed/
spatialcheckpoint preprocess sample.h5ad           data/processed/

# Run full spatial analysis on a preprocessed sample
spatialcheckpoint analyze sample01

# Discover archetypes from a feature matrix CSV
spatialcheckpoint discover results/sample01/features.csv --k-min 2 --k-max 8

# Train the archetype classifier
spatialcheckpoint classify features.csv archetype_labels.csv --model-dir models/

# Generate publication figures
spatialcheckpoint figures --results-dir results/ --output-dir paper/figures/

Usage

Gene Panel

import spatialcheckpoint as scp

# 44 checkpoint genes across 6 functional categories
genes = scp.get_all_checkpoint_genes()

# Genes by category
co_inhibitory = scp.get_category_genes("co_inhibitory_receptors")
novel         = scp.get_category_genes("novel_checkpoints")
cell_markers  = scp.get_immune_cell_markers()    # {cell_type: [genes]}
lr_pairs      = scp.get_ligand_receptor_pairs()  # [{ligand, receptor, alias}]

Data Preprocessing

# From Space Ranger output directory
preprocessor = scp.SpatialDataPreprocessor(spaceranger_out_path="path/to/spaceranger/output")
adata = preprocessor.load_visium()
adata = preprocessor.quality_control(adata, min_genes=200, max_mt_pct=25.0)
adata = preprocessor.normalize(adata)
adata.write_h5ad("data/processed/sample01_preprocessed.h5ad")

# Or from an existing H5AD
preprocessor = scp.SpatialDataPreprocessor(h5_path="existing_data.h5ad")
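
The min_genes and max_mt_pct thresholds above can be read as two per-spot filters. The following is a minimal NumPy sketch of that idea, not the package's implementation: the helper name qc_mask is hypothetical, and identifying mitochondrial genes by an "MT-" prefix is an assumption.

```python
import numpy as np

def qc_mask(counts, gene_names, min_genes=200, max_mt_pct=25.0):
    """Return a boolean mask of spots that pass both QC thresholds."""
    counts = np.asarray(counts, dtype=float)
    # Number of genes with at least one count in each spot.
    genes_detected = (counts > 0).sum(axis=1)
    # Assumption: mitochondrial genes are those whose symbol starts with "MT-".
    is_mt = np.array([g.upper().startswith("MT-") for g in gene_names])
    total = counts.sum(axis=1)
    # Guard against division by zero for empty spots.
    mt_pct = 100.0 * counts[:, is_mt].sum(axis=1) / np.maximum(total, 1.0)
    return (genes_detected >= min_genes) & (mt_pct <= max_mt_pct)

# Toy example: 3 spots x 4 genes, with thresholds scaled down to fit.
gene_names = ["MT-CO1", "CD274", "PDCD1", "CTLA4"]
counts = np.array([
    [1, 5, 3, 2],   # 4 genes detected, 1 of 11 counts mitochondrial: kept
    [8, 1, 1, 0],   # 3 genes detected, 8 of 10 counts mitochondrial: dropped
    [0, 4, 0, 0],   # only 1 gene detected: dropped
])
mask = qc_mask(counts, gene_names, min_genes=3, max_mt_pct=25.0)
print(mask)
```

Only the first spot satisfies both conditions, so it is the only one retained.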

Spatial Profiling & Feature Extraction

genes = scp.get_all_checkpoint_genes()

# Region-based expression (tumor_core, invasive_margin, stroma, …)
profiler    = scp.SpatialCheckpointProfiler(adata, genes)
region_expr = profiler.expression_by_region()
hotspots    = profiler.checkpoint_hotspot_detection()   # Moran's I per gene

# 80+ spatial features per slide
engineer = scp.SpatialFeatureEngineer(adata, genes)
features = engineer.extract_all_features(sample_id="sample01")
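
For readers unfamiliar with Moran's I, the statistic reported per gene by the hotspot step, here is a small NumPy sketch of the global form. It assumes binary k-nearest-neighbour spatial weights; the README does not say which weighting scheme the package uses, and the function name morans_i is illustrative only.

```python
import numpy as np

def morans_i(values, coords, k=4):
    """Global Moran's I with binary k-nearest-neighbour weights."""
    x = np.asarray(values, dtype=float)
    coords = np.asarray(coords, dtype=float)
    n = x.size
    # Pairwise Euclidean distances; a spot is never its own neighbour.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    # w[i, j] = 1 if spot j is among the k nearest spots to spot i.
    w = np.zeros((n, n))
    nearest = np.argsort(d, axis=1)[:, :k]
    w[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    z = x - x.mean()
    denom = (z ** 2).sum()
    if denom == 0:
        return 0.0  # constant expression carries no spatial signal
    return (n / w.sum()) * (z @ w @ z) / denom

# 10 x 10 grid: expression confined to the left half versus the same
# values shuffled across spots.
gx, gy = np.meshgrid(np.arange(10), np.arange(10))
coords = np.column_stack([gx.ravel(), gy.ravel()])
clustered = (coords[:, 0] < 5).astype(float)
shuffled = np.random.default_rng(0).permutation(clustered)

i_clustered = morans_i(clustered, coords)
i_shuffled = morans_i(shuffled, coords)
print(f"clustered: {i_clustered:.2f}, shuffled: {i_shuffled:.2f}")
```

A spatially clustered gene yields a value close to 1, while the shuffled version stays near 0, which is what makes the statistic useful for flagging hotspots.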

Archetype Discovery

# feature_matrix: DataFrame (n_samples × n_features)
# sample_metadata: DataFrame with 'cancer_type' column, same index
discovery = scp.SpatialArchetypeDiscovery(feature_matrix, sample_metadata)

cc     = discovery.consensus_clustering(k_range=(2, 8), n_iterations=100)
labels = cc["labels"]
char   = discovery.characterize_archetypes(labels)

nmf = discovery.run_nmf(k=cc["optimal_k"])
# nmf["W"]  →  (n_samples, k) soft membership weights
# nmf["H"]  →  (k, n_features) archetype profiles
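
The idea behind consensus clustering can be sketched without the package: cluster many random subsamples and record how often each pair of samples lands in the same cluster. The sketch below uses a plain NumPy k-means and an 80% subsampling rate, both of which are assumptions for illustration. It covers a single k only and omits the delta-area selection of k.

```python
import numpy as np

def kmeans_labels(X, k, rng, n_iter=20):
    """Plain Lloyd's k-means; returns one cluster label per row of X."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=-1)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # keep the old center if a cluster empties
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def consensus_matrix(X, k, n_iterations=50, subsample=0.8, seed=0):
    """Fraction of resampling runs in which each pair of samples co-clusters."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n = len(X)
    together = np.zeros((n, n))  # times i and j fell in the same cluster
    sampled = np.zeros((n, n))   # times i and j were drawn in the same run
    m = max(k, int(subsample * n))
    for _ in range(n_iterations):
        idx = rng.choice(n, size=m, replace=False)
        labels = kmeans_labels(X[idx], k, rng)
        same = labels[:, None] == labels[None, :]
        together[np.ix_(idx, idx)] += same
        sampled[np.ix_(idx, idx)] += 1
    return np.divide(together, sampled,
                     out=np.zeros_like(together), where=sampled > 0)

# Two well-separated synthetic groups of 10 samples each.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 0.5, size=(10, 2)),
               rng.normal(10.0, 0.5, size=(10, 2))])
C = consensus_matrix(X, k=2)
within = C[:10, :10].mean()
between = C[:10, 10:].mean()
print(f"within-group consensus: {within:.2f}, between-group: {between:.2f}")
```

A stable clustering gives consensus values near 1 within groups and near 0 between them; comparing how this matrix changes across values of k is the basis of k-selection heuristics.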

Classifier Training & SHAP

trainer = scp.ArchetypeModelTrainer(
    feature_matrix=feature_matrix,
    archetype_labels=labels,
    output_dir="models/",
)
results = trainer.run(n_optuna_trials=30)

explainer = scp.ArchetypeExplainer(results["model"], feature_matrix)
shap_df   = explainer.global_feature_importance()

Demo

The following script runs entirely on synthetic data, so no Visium files are required.

import numpy as np
import pandas as pd
import scanpy as sc
import spatialcheckpoint as scp

print(f"SpatialCheckpoint v{scp.__version__}")

# ── Gene panel ───────────────────────────────────────────────────────────────
genes = scp.get_all_checkpoint_genes()
print(f"Checkpoint panel : {len(genes)} genes")
print(f"Co-inhibitory    : {scp.get_category_genes('co_inhibitory_receptors')}")

lr = scp.get_ligand_receptor_pairs()
print(f"LR pairs ({len(lr)}) e.g. {lr[0]}")

# ── Synthetic Visium slide ───────────────────────────────────────────────────
rng = np.random.default_rng(42)
cp8 = genes[:8]
dummy_genes = [f"GENE{i:04d}" for i in range(92)] + cp8

X = rng.negative_binomial(2, 0.5, size=(200, 100)).astype(float)
adata = sc.AnnData(X=X)
adata.var_names = pd.Index(dummy_genes)

gx, gy = np.meshgrid(np.arange(20), np.arange(10))
coords = np.column_stack([gx.ravel(), gy.ravel()]).astype(float)
coords += rng.uniform(-0.1, 0.1, size=coords.shape)
adata.obsm["spatial"] = coords

region_map = []
for x, y in coords:
    if   x < 5  and y < 5:  region_map.append("tumor_core")
    elif x < 10 and y < 8:  region_map.append("invasive_margin")
    elif x >= 15:            region_map.append("immune_enriched")
    elif y >= 8:             region_map.append("necrotic")
    else:                    region_map.append("stroma")

adata.obs["region_type"] = pd.Categorical(
    region_map,
    categories=["tumor_core","invasive_margin","stroma","immune_enriched","necrotic"]
)

# ── Spatial feature extraction ───────────────────────────────────────────────
engineer = scp.SpatialFeatureEngineer(adata, cp8)
features = engineer.extract_all_features(sample_id="demo")
print(f"\nFeature matrix   : {features.shape[1]} features extracted")

# ── Multi-sample archetype discovery ────────────────────────────────────────
feat_mat = pd.DataFrame(
    rng.standard_normal((30, features.shape[1])),
    index=[f"sample_{i:03d}" for i in range(30)],
    columns=features.columns,
)
meta = pd.DataFrame(
    {"cancer_type": rng.choice(["BRCA","CRC","NSCLC"], 30)},
    index=feat_mat.index,
)

discovery = scp.SpatialArchetypeDiscovery(feat_mat, meta)
cc = discovery.consensus_clustering(k_range=(2, 4), n_iterations=20)
print(f"\nOptimal k        : {cc['optimal_k']}")

char = discovery.characterize_archetypes(cc["labels"])
print("\nArchetype summary:")
print(char[["archetype_name","n_samples"]].to_string())

# ── NMF soft membership ──────────────────────────────────────────────────────
nmf = discovery.run_nmf(k=cc["optimal_k"])
print(f"\nNMF explained variance : {nmf['explained_variance']:.3f}")
print("Membership weights (first 3 samples):")
print(nmf["W"].head(3).round(3).to_string())
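
To make the W and H matrices concrete, here is a minimal NMF using Lee-Seung multiplicative updates in NumPy. It is a sketch of the general technique, not the solver the package uses. NMF requires non-negative input, so the example builds a non-negative matrix with known rank 2; how the package prepares standardized features for NMF is not described here.

```python
import numpy as np

def nmf(V, k, n_iter=500, seed=0, eps=1e-9):
    """Factorize non-negative V (n_samples x n_features) as W @ H."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        # Multiplicative updates for the Frobenius-norm objective;
        # they keep W and H non-negative at every step.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

rng = np.random.default_rng(1)
# Synthetic data with a known rank-2 non-negative structure.
V = rng.random((30, 2)) @ rng.random((2, 12))

W, H = nmf(V, k=2)
rel_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
# Row-normalize W so each sample's weights sum to 1 (soft membership).
membership = W / W.sum(axis=1, keepdims=True)

print(f"W: {W.shape}, H: {H.shape}, relative error: {rel_error:.4f}")
```

Each row of the normalized W can be read as a sample's fractional membership across archetypes, and each row of H as one archetype's feature profile.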

Gene Panel

Category                   Genes (examples)
Co-inhibitory receptors    PDCD1 (PD-1), CTLA4, LAG3, HAVCR2 (TIM-3), TIGIT
Co-inhibitory ligands      CD274 (PD-L1), PDCD1LG2 (PD-L2), LGALS9
Novel checkpoints          VSIR (VISTA), CD276 (B7-H3), VTCN1 (B7-H4)
Innate checkpoints         CD47, SIRPA, LILRB1, LILRB2
Immune enzymes             IDO1, ENTPD1 (CD39), NT5E (CD73), ARG1
Co-stimulatory reference   CD28, ICOS, TNFRSF4 (OX40), TNFRSF9 (4-1BB)
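
The table can be mirrored as a plain dictionary for quick lookups, for example to check which panel genes a dataset actually contains. This sketch holds only the example genes listed above, not the full panel. The keys co_inhibitory_receptors and novel_checkpoints appear in the usage section; the other four keys are hypothetical spellings, and panel_coverage is an illustrative helper rather than a package function.

```python
# Example genes from the table only; the real panel is larger.
PANEL_EXAMPLES = {
    "co_inhibitory_receptors": ["PDCD1", "CTLA4", "LAG3", "HAVCR2", "TIGIT"],
    "co_inhibitory_ligands": ["CD274", "PDCD1LG2", "LGALS9"],          # hypothetical key
    "novel_checkpoints": ["VSIR", "CD276", "VTCN1"],
    "innate_checkpoints": ["CD47", "SIRPA", "LILRB1", "LILRB2"],       # hypothetical key
    "immune_enzymes": ["IDO1", "ENTPD1", "NT5E", "ARG1"],              # hypothetical key
    "co_stimulatory_reference": ["CD28", "ICOS", "TNFRSF4", "TNFRSF9"],  # hypothetical key
}

# Reverse lookup: gene symbol -> category.
gene_to_category = {
    gene: category
    for category, genes in PANEL_EXAMPLES.items()
    for gene in genes
}

def panel_coverage(var_names):
    """Split each category's example genes into present and missing."""
    available = set(var_names)
    return {
        category: {
            "present": [g for g in genes if g in available],
            "missing": [g for g in genes if g not in available],
        }
        for category, genes in PANEL_EXAMPLES.items()
    }

coverage = panel_coverage(["PDCD1", "CD274", "CD47", "ACTB"])
print(gene_to_category["CD274"])
print(coverage["co_inhibitory_ligands"])
```

With a real dataset, the list passed to panel_coverage would be the gene names of the AnnData object, such as list(adata.var_names).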

Archetypes

Archetype             Spatial signature
Checkpoint-Hot        High checkpoint + high immune + co-localized
Checkpoint-Cold       Low checkpoint + low immune infiltration
Checkpoint-Excluded   Checkpoint at margin, immune at periphery
Checkpoint-Mismatch   Checkpoint and immune spatially separated
Innate-Dominant       CD47/SIRPα axis dominant
Novel-Enriched        VISTA / B7-H3 / B7-H4 enriched

Pipeline Architecture

Raw Visium data (Space Ranger dir or H5AD)
  → SpatialDataPreprocessor      QC, normalize → 'counts' / 'log1p' layers
  → SpatialCheckpointProfiler    region expression (tumor_core, invasive_margin,
                                  stroma, immune_enriched, necrotic)
  → SpatialFeatureEngineer       80+ features: co-localization, gradients,
                                  Moran's I, region expression ratios
  → SpatialArchetypeDiscovery    consensus KMeans + delta-area k-selection + NMF
  → ArchetypeModelTrainer        LightGBM + XGBoost + MLP + RF ensemble,
                                  SMOTE, RFECV, Optuna HPO
  → ArchetypeExplainer           SHAP global / per-class feature importance
  → ClinicalAssociationAnalyzer  KM curves, Cox PH, logistic regression (OS/PFS)
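
The final stage lists Kaplan-Meier curves, but no call is shown for it in this README. As background, the estimator itself is short enough to sketch in plain Python. The function name and toy follow-up data below are illustrative and are not part of the package API.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  : follow-up time per subject
    events : 1 if the event was observed, 0 if the subject was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Group all subjects that share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Censored and deceased subjects both leave the risk set.
        at_risk -= removed
    return curve

# Toy cohort of six subjects (times in months; 0 marks censoring).
times = [5, 8, 8, 12, 15, 20]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t={t:>2}  S(t)={s:.3f}")
```

In an archetype analysis, one such curve would be computed per archetype group and the groups compared, for example with a log-rank test.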

Output Files

Path                                        Contents
results/{sample_id}/features.csv            80+ spatial features
results/{sample_id}/region_expression.csv   Region × gene expression
results/{sample_id}/hotspots.csv            Moran's I per gene
results/{sample_id}/colocalization.csv      Ligand-receptor co-occurrence
results/archetypes/archetype_labels.csv     Sample → archetype assignment
results/archetypes/nmf_W.csv, nmf_H.csv     NMF membership weights (W) / archetype profiles (H)
models/archetype_classifier.joblib          Serialized ensemble model
paper/figures/                              Publication-ready PDF/PNG plots
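
One simple way to quantify ligand-receptor co-occurrence is the fraction of spots in which both genes are detected, compared with what independence would predict. The sketch below illustrates that definition only. The package's actual co-localization metric, and whether it also considers neighbouring spots, are not specified here.

```python
import numpy as np

def cooccurrence(ligand, receptor, threshold=0.0):
    """Fraction of spots expressing both genes, and its ratio to chance."""
    lig = np.asarray(ligand) > threshold
    rec = np.asarray(receptor) > threshold
    both = np.mean(lig & rec)
    # Expected joint fraction if the two genes were independent.
    expected = lig.mean() * rec.mean()
    ratio = both / expected if expected > 0 else float("nan")
    return both, ratio

# Toy expression across 8 spots, e.g. CD274 (ligand) and PDCD1 (receptor).
ligand = [1, 2, 0, 0, 3, 0, 0, 0]
receptor = [2, 1, 0, 0, 0, 0, 0, 4]
both, ratio = cooccurrence(ligand, receptor)
print(f"co-expressed fraction: {both:.3f}, observed/expected: {ratio:.2f}")
```

A ratio above 1 indicates that the pair is found together more often than chance would suggest; a ratio below 1 suggests the two genes tend to occupy different spots.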

Citation

@article{spatialcheckpoint2025,
  title   = {SpatialCheckpoint: Spatial heterogeneity profiling of immune checkpoints
             in spatial transcriptomics},
  author  = {},
  journal = {},
  year    = {2025},
}

License

MIT
