STAT: Spatial Transcriptomics Analytical agenT - AI-powered platform for spatial omics analysis
STAT — Spatial Transcriptomics Analytical agenT
Ask in natural language, get a planned, verified, and executed analysis of spatial omics data.
Installation
Stable release from PyPI:
pip install stat-agent
With the full set of analysis skill dependencies (squidpy, scvi-tools, torch, liana, cell2location, …):
pip install "stat-agent[skills]"
Some skills require packages that aren't on PyPI; install them separately as needed:
# STAGATE (requires PyG ecosystem wheels matching your torch + CUDA version)
pip install torch_geometric
pip install torch_sparse torch_scatter -f https://data.pyg.org/whl/torch-${TORCH_VER}+${CUDA_VER}.html
pip install git+https://github.com/QIFEIDKN/STAGATE_pyG.git
(Replace ${TORCH_VER} and ${CUDA_VER} with your installed torch and CUDA versions, e.g. TORCH_VER=2.4.1 and CUDA_VER=cu121 for torch 2.4.1+cu121.)
GPU note: the torch and CUDA versions should be adjusted to match your hardware. See pytorch.org.
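The wheel index URL used for torch_sparse and torch_scatter can be derived from the version string that torch reports. The sketch below is a hypothetical helper, not part of STAT: it assumes the version string has the form 2.4.1+cu121 (as printed by python -c "import torch; print(torch.__version__)") and treats a string without a suffix as a CPU-only build. It does not check that an index page actually exists for that exact patch version, so confirm the URL resolves before passing it to pip.

```python
def pyg_wheel_index(torch_version: str) -> str:
    """Build a PyG wheel index URL from a torch version string like '2.4.1+cu121'."""
    # Split "2.4.1+cu121" into the release part and the local build tag.
    base, sep, local = torch_version.partition("+")
    # Assumption: a version string without a "+tag" suffix is a CPU-only build.
    cuda_tag = local if sep else "cpu"
    return f"https://data.pyg.org/whl/torch-{base}+{cuda_tag}.html"


if __name__ == "__main__":
    # The printed URL is what you would pass to `pip install ... -f <url>`.
    print(pyg_wheel_index("2.4.1+cu121"))
```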
Quick start
Web interface
stat-web # serves on http://localhost:8889
# or
./start_web.sh # also starts a Jupyter Lab alongside
In the UI:
- Enter the path to your dataset directory.
- Configure your LLM provider and paste an API key.
- Click Load Dataset.
- Ask questions in the chat panel:
  - "Annotate cell types using the breast-cancer reference."
  - "Find spatially variable genes."
  - "Show CD8A expression in slice 1."
  - "Run RCTD deconvolution and overlay the dominant cell type."
Data format
STAT auto-detects your data layout. Place files in a single directory.
Single-slice
dataset/
├── tissue.h5ad # Required: AnnData with x, y in obs
└── he.tif # Optional: H&E image (pixel coords = cell coords)
Multi-slice
dataset/
├── tissue_slice_0.h5ad
├── he_slice_0.tif
├── tissue_slice_1.h5ad
└── he_slice_1.tif
Multi-omics (gene + protein)
dataset/
├── tissue.h5ad # Gene expression
├── tissue_protein.h5ad # Protein expression
├── he.tif
└── protein_CD3.tif
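To check which of the three layouts above a directory matches before loading it, a small filename-based classifier can help. This is an illustrative sketch only: the function names and the matching rules are assumptions based on the file names shown above, and STAT's own auto-detection may use different rules.

```python
import re
from pathlib import Path

# Matches the multi-slice naming shown above, e.g. "tissue_slice_0.h5ad".
SLICE_RE = re.compile(r"^tissue_slice_(\d+)\.h5ad$")


def classify_layout(filenames):
    """Return (layout, slice_ids) for an iterable of file names (hypothetical helper)."""
    names = set(filenames)
    slice_ids = sorted(int(m.group(1)) for n in names if (m := SLICE_RE.match(n)))
    if slice_ids:
        return "multi-slice", slice_ids
    if "tissue.h5ad" in names and "tissue_protein.h5ad" in names:
        return "multi-omics", []
    if "tissue.h5ad" in names:
        return "single-slice", []
    raise ValueError("no recognised tissue*.h5ad file found")


def classify_directory(path):
    """Classify a dataset directory by the names of the files directly inside it."""
    return classify_layout(p.name for p in Path(path).iterdir() if p.is_file())
```

Calling classify_directory("dataset") on the multi-slice example would return the layout name together with the slice indices found in the file names.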
Coordinate convention. Cell coordinates (x, y) in adata.obs map directly to image pixel (x, y); no coordinate transformation is applied. Note the array indexing swap: image arrays are indexed row first, so img[y, x] corresponds to cell (x, y).
Required AnnData fields: adata.obs['x'], adata.obs['y'], and the expression matrix adata.X. adata.obs['celltype'] is optional — annotation skills will populate it.
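The indexing swap is easy to get wrong, so here is a dependency-free sketch of it. The helper pixel_at is hypothetical and uses a nested list in place of a real image; with a NumPy array the equivalent lookup would be img[row, col]. Rounding to the nearest pixel is an assumption made for the example.

```python
def pixel_at(img, x, y):
    """Return the image value under a cell at (x, y): rows are indexed by y, columns by x."""
    row, col = int(round(y)), int(round(x))
    return img[row][col]


# A toy image with 3 rows (height) and 4 columns (width).
# Each value encodes its own position as 10 * row + column.
img = [[10 * r + c for c in range(4)] for r in range(3)]

# A cell at x=3, y=1 sits in row 1, column 3.
value = pixel_at(img, x=3, y=1)
print(value)  # 13
```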
Built-in skills
Skills are auto-discovered from stat_agent/skills/{slug}/SKILL.md. Each skill carries metadata (modalities, data level, prerequisites) and a templated code body. The current catalog:
Cell type annotation
| Skill | Summary |
|---|---|
| Cell Type Annotation with scANVI | Annotate cell types in spatial transcriptomics data using scANVI transfer learning from a reference scRNA-seq dataset. |
| Fast Cell Type Annotation (Clustering + LLM) | Annotate cell types using unsupervised clustering, marker genes, and LLM-based annotation. |
| Cell Type Annotation via Spatial Mapping (Tangram) | Map single-cell reference annotations onto spatial transcriptomics data using Tangram deep learning alignment. |
Spot deconvolution
| Skill | Summary |
|---|---|
| Cell Type Deconvolution (RCTD) | Perform cell type deconvolution (or annotation on spot) on spatial transcriptomics data (Visium spots) using RCTD with a single-cell refere… |
| Bayesian Cell Type Deconvolution (Cell2location) | Reference-based Bayesian deconvolution of spot-level spatial transcriptomics using Cell2location. |
| Fast Spot Deconvolution (FlashDeconv) | Ultra-fast reference-based cell type deconvolution for spot-level spatial data using FlashDeconv. |
Spatial domains
| Skill | Summary |
|---|---|
| Spatial Domain Detection (SpaGCN) | Identify spatial domains in spot-level spatial transcriptomics data using SpaGCN, integrating gene expression, spatial location, and H&E hi… |
| Spatial Domain Detection (STAGATE) | Identify spatial domains using STAGATE (Spatial-Transcriptomics Graph Attention Auto-Encoder). |
| Spatial Domain Detection (GraphST) | Identify spatial domains in spot-level data using GraphST (Graph Self-supervised Transformer). |
Spatial statistics & niches
| Skill | Summary |
|---|---|
| Spatial Statistics Analysis | Compute spatial statistics including Moran's I (spatial autocorrelation of genes), Ripley's K (spatial point pattern of cell types), co-occ… |
| Neighborhood Enrichment Analysis | Compute neighborhood enrichment z-scores to identify which cell types are spatially co-localized or depleted from each other's neighborhood… |
| Spatial Niche Detection | Identify spatial cellular niches using Harmonics hierarchical model. |
| Spatially Variable Genes (SpatialDE) | Identify spatially variable genes using SpatialDE Gaussian process regression. |
Differential expression & pathway
| Skill | Summary |
|---|---|
| Differential Gene Expression Analysis | Find differentially expressed marker genes between groups using scanpy rank_genes_groups with Wilcoxon test. |
| GO Enrichment Analysis | Find enriched Gene Ontology (GO) terms for a user-provided gene list. |
| Over-Representation & Pathway Enrichment Analysis (ORA) | Test whether a gene list is enriched for specific pathways or gene sets using Over-Representation Analysis (Fisher's exact test). |
| Per-Cell Pathway Activity Scoring (ssGSEA) | Compute per-cell pathway activity scores using single-sample Gene Set Enrichment Analysis (ssGSEA). |
| Two-Group Pathway Enrichment Comparison | Compare pathway / gene-set enrichment between two user-provided gene lists (typically markers of two cell populations, clusters, or conditi… |
Cell-cell communication
| Skill | Summary |
|---|---|
| Cell-Cell Communication Analysis (LIANA+) | Analyze cell-cell communication using LIANA+ to identify significant ligand-receptor interactions between cell types. |
| Cell-Cell Communication Analysis (CellPhoneDB) | Analyze cell-cell communication using CellPhoneDB statistical method to identify significant ligand-receptor interactions between cell type… |
Multi-slice integration
| Skill | Summary |
|---|---|
| Batch Integration (Harmony) | Integrate multiple spatial transcriptomics slices using Harmony batch correction. |
| Batch Integration (BBKNN) | Correct batch effects across multiple slices using BBKNN (Batch Balanced K-Nearest Neighbors). |
| Batch Integration (Scanorama) | Correct batch effects across multiple slices using Scanorama panoramic stitching. |
Slice alignment & registration
| Skill | Summary |
|---|---|
| Spatial Alignment (STalign) | Align two cell-level spatial transcriptomics slices using STalign. |
| Slice Registration (PASTE) | Align multiple spatial transcriptomics slices using PASTE (Probabilistic Alignment of ST Experiments). |
Trajectory inference
| Skill | Summary |
|---|---|
| Pseudotime Trajectory Analysis (Palantir / DPT) | Infer cell developmental trajectories and pseudotime ordering using expression-based methods. |
Adding a new skill. Create stat_agent/skills/<your-slug>/SKILL.md with YAML frontmatter (name, title, description, filter_requirements, prerequisites, optional default_skill), then write the analysis instructions and code template in the body. The registry will pick it up at startup.
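As a rough illustration of the file shape described above, the sketch below splits a SKILL.md-style document into its frontmatter and body and checks that the listed keys are present. Everything in it is an assumption made for the example: the helper names, the placeholder values, and the flat key: value parsing. STAT's registry presumably uses a full YAML parser, and the real value formats for filter_requirements and prerequisites are not shown here.

```python
# Keys named in the text above; default_skill is optional and therefore not required.
REQUIRED_KEYS = ("name", "title", "description", "filter_requirements", "prerequisites")

# Placeholder content for illustration only; the values are not real STAT settings.
EXAMPLE = """\
---
name: my-skill
title: My Example Skill
description: Placeholder description for illustration only.
filter_requirements: placeholder
prerequisites: placeholder
---
Analysis instructions and the code template go here.
"""


def split_frontmatter(text):
    """Split a document into (frontmatter dict, body). Handles flat 'key: value' lines only."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing opening '---' for frontmatter")
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:
        raise ValueError("missing closing '---' for frontmatter")
    meta = {}
    for line in lines[1:end]:
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


def missing_keys(meta):
    """List the required frontmatter keys that are absent from meta."""
    return [k for k in REQUIRED_KEYS if k not in meta]


meta, body = split_frontmatter(EXAMPLE)
print(missing_keys(meta))  # [] when every required key is present
```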
LLM providers
STAT supports five providers via a unified LLMBackend. In the web UI's Configure LLM panel, pick a Provider from the dropdown, then type the bare Model ID as it appears at that provider's API — no prefix needed. (Older saved configs that include a prefix like anthropic/… still work for backward compatibility.)
For programmatic use, export the corresponding environment variable before launching stat-web. Every model ID below has been verified end-to-end against the live provider API.
| Provider | Where to get a key | Env var | Default model | Other verified IDs |
|---|---|---|---|---|
| OpenAI | https://platform.openai.com/api-keys | OPENAI_API_KEY | gpt-5.4 | gpt-5.5, gpt-4o |
| Anthropic | https://console.anthropic.com/settings/keys | ANTHROPIC_API_KEY | claude-opus-4-7 | claude-opus-4-6, claude-sonnet-4-6 |
| Google Gemini | https://aistudio.google.com/app/apikey | GOOGLE_API_KEY | gemini-3.1-pro-preview | gemini-2.5-pro |
| DeepSeek | https://platform.deepseek.com/api_keys | DEEPSEEK_API_KEY | deepseek-v4-pro | deepseek-v4-flash |
| Poe (multi-model gateway) | https://poe.com/api_key | POE_API_KEY | claude-sonnet-4.5 | claude-opus-4.7, gpt-5.5, gemini-3.1-pro, deepseek-v4-pro-el |
Poe caveat. claude-opus-4.6 and claude-sonnet-4.6 on Poe force extended thinking on the bot side and are not yet supported through STAT. Use claude-opus-4.7 instead, or switch to the direct Anthropic provider.
Tip. For long-context analysis (multi-slice integration, large reference profiles), prefer models with 200k+ context: claude-opus-4-7, claude-opus-4-6, gpt-5.5, gemini-3.1-pro-preview.
Verify before a long run. Use the Test Connection button in the Configure LLM panel: it sends a one-token round-trip through the same LLMBackend code path as the agent and reports the exact error if anything is off.
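For programmatic use, it can be handy to confirm which provider keys are visible in the environment before launching stat-web. The sketch below is a standalone check, not part of STAT's API: the environment variable names come from the provider table, while the function name and the rule that a blank value counts as unset are assumptions.

```python
import os

# Environment variable names as listed in the provider table.
PROVIDER_ENV_VARS = {
    "OpenAI": "OPENAI_API_KEY",
    "Anthropic": "ANTHROPIC_API_KEY",
    "Google Gemini": "GOOGLE_API_KEY",
    "DeepSeek": "DEEPSEEK_API_KEY",
    "Poe": "POE_API_KEY",
}


def configured_providers(env=None):
    """Return the providers whose API key variable is set to a non-blank value."""
    env = os.environ if env is None else env
    return [name for name, var in PROVIDER_ENV_VARS.items() if env.get(var, "").strip()]


if __name__ == "__main__":
    found = configured_providers()
    print("Providers with a key set:", ", ".join(found) if found else "none")
```

This only checks that a variable is present; it does not validate the key, which is what the Test Connection button is for.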
Reproducing the paper
The analyses, figures, and benchmarks from the STAT paper live in a separate repository: https://github.com/chenyhvvvv/STAT-PaperRepro
License
BSD-3-Clause © STAT contributors.