Python package for spatial transcriptomics data analysis

These details have not been verified by PyPI

Project links

Homepage

Project description

📖 Documents | 🚀 Tutorial | 💬 Contact me

Core Concept: Discrete to Continuous

Please ⭐ star STMiner on GitHub if you find it useful. Thank you! 😊

👩‍🏫 Introduction

Why STMiner?

ST data presents challenges such as uneven cell density distribution, low sampling rates, and complex spatial structures. Traditional spot-based analysis strategies struggle to effectively address these issues. STMiner explores ST data by leveraging the spatial distribution of genes, thus avoiding the biases that these conditions can introduce into the results.

Most importantly, STMiner integrates seamlessly with AnnData / Scanpy and can be installed from PyPI (pip install STMiner).

Method detail

Here we propose “STMiner”. The figure depicts the three key steps of analyzing ST data with STMiner.

(Left top) STMiner first uses Gaussian Mixture Models (GMMs) to represent the spatial distribution of each gene and the overall spatial distribution. (Left bottom) STMiner then identifies spatially variable genes (SVGs) by calculating the cost of transporting the overall spatial distribution to each gene's spatial distribution. Genes with high costs exhibit significant spatial variation, meaning their expression patterns differ considerably across regions of the tissue. A distance array between SVGs is built in the same way: genes with similar spatial structures have a low cost of transport to each other, and vice versa. (Right) The distance array is embedded into a low-dimensional space by Multidimensional Scaling (MDS), so that genes with similar spatial expression patterns can be clustered into distinct functional gene sets and their spatial structures obtained.
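
To make the "transport cost" idea concrete, here is a minimal sketch in one dimension. It is not STMiner's implementation: STMiner works on GMMs fitted to 2D tissue coordinates, while this toy example compares plain histograms along a single axis using the 1D Wasserstein distance. The gene names and counts are made up for illustration.

```python
from itertools import accumulate


def normalize(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def wasserstein_1d(p, q, bin_width=1.0):
    """1D Wasserstein-1 distance between two histograms on the same bins.

    In one dimension the optimal transport cost equals the area between
    the two cumulative distribution functions.
    """
    cdf_p = list(accumulate(normalize(p)))
    cdf_q = list(accumulate(normalize(q)))
    return bin_width * sum(abs(a - b) for a, b in zip(cdf_p, cdf_q))


# Hypothetical data: total expression per position (the "background")
# and three toy genes measured at the same six positions.
background = [10, 10, 10, 10, 10, 10]
genes = {
    "uniform_gene": [5, 5, 5, 5, 5, 5],   # follows the background
    "left_gene": [20, 8, 2, 0, 0, 0],     # concentrated on the left
    "right_gene": [0, 0, 0, 0, 10, 20],   # concentrated on the right
}

# Cost of moving each gene's distribution onto the background.
costs = {name: wasserstein_1d(counts, background) for name, counts in genes.items()}

# Genes with the highest cost have the most pronounced spatial pattern.
ranking = sorted(costs, key=costs.get, reverse=True)
print(ranking)
```

The gene that mirrors the background gets a cost of zero, while the two localized genes rank at the top, which is the intuition behind ranking genes by their distance to the overall distribution.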

🚀 Quick start by example

Please visit STMiner Documents for installation and detailed usage.

💡💡💡 We also provide a step-by-step Jupyter Notebook file to reproduce the results. You can access it here.

Additionally, you can run STMiner on your own:

Import package

from STMiner import SPFinder

Load data

You can download the test data from STMiner-test-data, or download the raw dataset from GEO. STMiner can read spatial transcriptome data in various formats, such as gem, bmk, and h5ad (see STMiner Documents).
We recommend the h5ad format, as it is currently the most widely used and is supported by most algorithms and software in the spatial transcriptomics field.

sp = SPFinder()
file_path = 'Path/to/your/h5ad/file'
sp.read_h5ad(file=file_path, bin_size=1)

For high-resolution ST data, you can increase the bin_size parameter when loading the data to reduce running time. For example:

sp.read_h5ad(file=file_path, bin_size=50, merge_bin=True)

This will not reduce the accuracy of STMiner, as resolution generally does not affect the spatial distribution of genes.
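
As a rough illustration of what binning does, the sketch below merges spots into square bins and sums their counts. This is an assumption about the general idea, not STMiner's code; `merge_bins` and the sample spots are hypothetical.

```python
from collections import defaultdict


def merge_bins(spots, bin_size):
    """Sum counts of all spots that fall into the same bin_size x bin_size square.

    spots: iterable of (x, y, count) with integer coordinates.
    Returns a dict mapping (bin_x, bin_y) to the summed count.
    """
    merged = defaultdict(float)
    for x, y, count in spots:
        merged[(x // bin_size, y // bin_size)] += count
    return dict(merged)


# Hypothetical high-resolution spots: (x, y, count)
spots = [(0, 0, 1), (10, 20, 2), (49, 49, 3), (50, 0, 4), (120, 75, 5)]
binned = merge_bins(spots, bin_size=50)
print(binned)
```

Five spots collapse into three bins while the total count is preserved, so later steps handle fewer points that still cover the same regions of the tissue.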

Find spatially highly variable genes

sp.get_genes_csr_array(min_cells=100, log1p=False, normalize=False, vmax=100)
sp.spatial_high_variable_genes(thread=6)

The min_cells parameter filters out genes that are too sparse to generate a reliable spatial distribution.
The log1p parameter applies a log1p transform so that extreme values do not dominate the results. For most open-source h5ad files, log1p has already been applied, so the default value here is False.
You can also run STMiner on gene sets of interest: pass the gene list through the gene_list parameter, and STMiner will only process the given genes.
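
The sketch below shows the kind of filtering and transformation these parameters describe, under the assumption that min_cells counts spots with non-zero expression. It is a plain-Python illustration, not STMiner's internals; `filter_and_transform` and the toy counts are hypothetical.

```python
import math


def filter_and_transform(counts_by_gene, min_cells, log1p=False):
    """Keep genes expressed in at least min_cells spots; optionally apply log1p."""
    kept = {}
    for gene, counts in counts_by_gene.items():
        n_expressing = sum(1 for c in counts if c > 0)
        if n_expressing < min_cells:
            continue  # too sparse for a reliable spatial distribution
        kept[gene] = [math.log1p(c) for c in counts] if log1p else list(counts)
    return kept


# Hypothetical raw counts per spot for two genes
counts_by_gene = {
    "dense_gene": [3, 0, 2, 1],
    "sparse_gene": [0, 0, 0, 7],
}

filtered = filter_and_transform(counts_by_gene, min_cells=2)
filtered_log = filter_and_transform(counts_by_gene, min_cells=2, log1p=True)
print(filtered)
print(filtered_log)
```

Applying log1p twice compresses the values more than intended, which is why it is left off when the input file has already been log-transformed.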

You can check the distance of each gene by:

sp.global_distance

Gene	Distance	z-score
myha	1.35E+08	2.771493
vmhcl	1.01E+08	2.470881
zgc:101560	9.95E+07	2.458787
pvalb1	9.82E+07	2.445257
myhz2	9.75E+07	2.437787
...	...	...
rps17	2.61E+05	-3.63207
rpl13	2.48E+05	-3.68506
rpl32	2.43E+05	-3.70327
rsl24d1	2.27E+05	-3.7757
rpl22	1.83E+05	-3.99332

The 'Gene' column is the gene name, and the 'Distance' column is the difference between the spatial distribution of the gene and the background.
A larger difference indicates a more pronounced spatial pattern of the gene.

Preprocess and Fit GMM

sp.fit_pattern(n_comp=10, gene_list=list(sp.global_distance[:2000]['Gene']))

n_comp=10 means each GMM has 10 components.
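
For readers unfamiliar with GMMs, here is a minimal expectation-maximization sketch that fits a two-component mixture in one dimension. It only illustrates what "components" are: STMiner fits its GMMs to 2D spatial coordinates with its own code, and `fit_gmm_1d`, the data, and the starting means below are all hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1D normal distribution."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_gmm_1d(data, init_means, n_iter=50, min_var=1e-6):
    """Fit a 1D Gaussian mixture with EM, one component per initial mean."""
    k = len(init_means)
    means = list(init_means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point
        resp = []
        for x in data:
            dens = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean, and variance of each component
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, min_var)  # guard against collapsing components
    return weights, means, variances


# Hypothetical positions of spots expressing a gene, forming two groups
positions = [0.0, 0.5, 1.0, 1.5, 2.0, 9.0, 9.5, 10.0, 10.5, 11.0]
weights, means, variances = fit_gmm_1d(positions, init_means=[1.5, 9.0])
print(weights, means, variances)
```

Each component ends up describing one group of spots. More components allow a mixture to follow more complex spatial shapes, at the cost of longer fitting time.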

Build distance matrix & clustering

# This step calculates the distance between genes' spatial distributions.
sp.build_distance_array()
# Dimensionality reduction and clustering.
sp.cluster_gene(n_clusters=6, mds_components=20)
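
To show how a gene-by-gene distance matrix can be turned into cluster labels, the sketch below runs a simple average-linkage agglomerative clustering directly on a precomputed matrix. This is a swapped-in technique for illustration: STMiner first embeds the distances with MDS and then clusters, and the embedding step is omitted here. The function name and the toy matrix are hypothetical.

```python
def average_linkage_clusters(dist, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    dist: symmetric matrix (list of lists) of pairwise distances.
    Returns one integer label per item.
    """
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average distance over all pairs across the two clusters
                pair_sum = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                d = pair_sum / (len(clusters[a]) * len(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    labels = [0] * len(dist)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Hypothetical distances between four genes: g0/g1 are alike, g2/g3 are alike
toy_distances = [
    [0.0, 0.1, 1.0, 1.0],
    [0.1, 0.0, 1.0, 1.0],
    [1.0, 1.0, 0.0, 0.2],
    [1.0, 1.0, 0.2, 0.0],
]
toy_labels = average_linkage_clusters(toy_distances, n_clusters=2)
print(toy_labels)
```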

Result & Visualization

The result is stored in genes_labels:

sp.genes_labels  # check the gene label
# If you want to save this table, run:
sp.genes_labels.to_csv('genes_labels.csv')

The output looks like the following:

	gene_id	labels
0	Cldn5	2
1	Fyco1	2
2	Pmepa1	2
3	Arhgap5	0
4	Apc	5
..	...	...
95	Cyp2a5	0
96	X5730403I07Rik	0
97	Ltbp2	2
98	Rbp4	4
99	Hist1h1e	4
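
Once the table is saved, the genes can be grouped by label with the standard library alone. The sketch below assumes the CSV layout that pandas writes by default (an unnamed index column followed by gene_id and labels) and uses the first rows of the table above as inline sample data instead of reading genes_labels.csv from disk.

```python
import csv
import io
from collections import defaultdict

# Sample content in the layout of genes_labels.csv (first rows of the table above)
csv_text = """,gene_id,labels
0,Cldn5,2
1,Fyco1,2
2,Pmepa1,2
3,Arhgap5,0
4,Apc,5
"""

# Collect gene names under their cluster label
genes_by_label = defaultdict(list)
for row in csv.DictReader(io.StringIO(csv_text)):
    genes_by_label[int(row["labels"])].append(row["gene_id"])

for label in sorted(genes_by_label):
    print(label, genes_by_label[label])
```

To read the real file, replace io.StringIO(csv_text) with an open file handle for genes_labels.csv.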

Visualize the distance array:

import seaborn as sns

sns.clustermap(sp.genes_distance_array)

Finding gene sets with a structure of interest

Get the patterns of a gene or gene set of interest:

interested_genes = ["mbpa", "BX957331.1", "madd"]
sp.get_pattern_of_given_genes(gene_list=interested_genes, n_comp=10)

Compare the distance between all genes and the given gene set:

from STMiner.Algorithm.distance import compare_gmm_distance

df = compare_gmm_distance(sp.custom_pattern, sp.patterns)
df.to_csv('compare_distance.csv')
df

Gene	distance
mbpa	0.8914643122002152
map1ab	0.9479574709875033
snap25a	0.9801858512442632
nsfa	0.9948239449738531
stxbp1a	0.99916307128497
...	...
lrrfip1b	1.9981586323013931
si:ch211-145h19.2	1.9995115533927301
BX248122.1	1.9996375745511945
si:dkey-7i4.24	1.9997052371268462

A lower distance indicates that the spatial expression pattern of the gene is more similar to that of the gene set of interest.
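
One simple way to use this table is to keep only the genes below a distance cutoff. The sketch below uses rows copied from the output above; the cutoff of 1.0 is an arbitrary value chosen for illustration, not a threshold recommended by STMiner.

```python
# Rows copied from the example output: (gene, distance to the gene set of interest)
rows = [
    ("mbpa", 0.8914643122002152),
    ("map1ab", 0.9479574709875033),
    ("snap25a", 0.9801858512442632),
    ("nsfa", 0.9948239449738531),
    ("stxbp1a", 0.99916307128497),
    ("lrrfip1b", 1.9981586323013931),
    ("si:ch211-145h19.2", 1.9995115533927301),
]

cutoff = 1.0  # hypothetical threshold; tune it for your own data

# Genes whose spatial pattern is closest to the gene set of interest
similar_genes = [gene for gene, distance in rows if distance < cutoff]
print(similar_genes)
```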

To visualize the patterns:

Note: To show a background image, pass its path as image_path. For this example, you can download the processed image here. image_path is optional, and omitting the background image has no impact on the calculation results.

sp.get_pattern_array(vote_rate=0.2)
img_path = 'path/to/downloaded/demo_img.png'
sp.plot.plot_pattern(heatmap=False,
                     s=10,
                     rotate=False,
                     reverse_y=True,
                     reverse_x=True,
                     vmax=95,
                     image_path=img_path,
                     cmap="Spectral_r",
                     aspect=.55)

Visualize the intersection between patterns 0 and 1:

sp.plot.plot_intersection(pattern_list=[0, 1],
                          image_path=img_path,
                          reverse_y=True,
                          reverse_x=True,
                          aspect=0.55,
                          s=20)

To visualize the gene expression by labels:

sp.plot.plot_genes(label=0, vmax=99)

Attributes of STMiner.SPFinder Object

Attribute	Type	Description
adata	AnnData	AnnData object for the loaded spatial data
patterns	dict	Spatial distribution patterns of genes
genes_patterns	dict	GMM for each gene
global_distance	pd.DataFrame	Distances between genes and background
mds_features	array	Embedding features of genes
genes_distance_array	pd.DataFrame	Distance between each pair of GMMs
genes_labels	pd.DataFrame	Gene names and their pattern labels
plot	Object	Plotting interface for visualization

📜 Release history

Version	Date	Description
1.1.4	2026/04/24	package metadata cleanup for PyPI release
1.1.3	2025/11/28	update load_marked_image(), support n_components
1.1.2	2025/08/28	fix bug, add test
1.1.1	2025/07/05	fix bug, add test
1.1.0	2025/04/01	optimize multiple threads in spatial_high_variable_genes()
1.0.9	2025/03/16	add merge_bin when read_h5ad() for large ST data
0.0.9	2025/03/13	support multiple threads in spatial_high_variable_genes()
0.0.8	2025/02/23	change default value of Normalize
0.0.7	2025/02/21	improved performance of get_pattern_array()

PyPI history: https://pypi.org/project/STMiner/#history

🔖 References

[1] Sun, P., Bush, S. J., Wang, S., Jia, P., Li, M., Xu, T., ... & Ye, K. (2025). STMiner: Gene-centric spatial transcriptomics for deciphering tumor tissues. Cell Genomics, 5(2).
[2] Sun, P., Li, M., & Ye, K. (2025). Protocol to decipher complex spatial transcriptomics data using STMiner. STAR Protocols, 6(2), 103838.

✉️ Contact

If you encounter any issues during use, please try updating STMiner to the latest version. If the issue persists, feel free to submit your problem on the issue page or contact us through the following methods:

Peisen Sun: 📧(sunpeisen@stu.xjtu.edu.cn) / 𝕏(https://x.com/Sun_python)
Kai Ye: 📧(kaiye@xjtu.edu.cn)

Release history

Version	Date
1.1.4 (this version)	Apr 24, 2026
1.1.3	Nov 27, 2025
1.1.2	Aug 28, 2025
1.1.1	Jul 5, 2025
1.1.0	Apr 1, 2025
1.0.9	Mar 16, 2025
0.0.9	Mar 13, 2025
0.0.8	Feb 22, 2025
0.0.7	Feb 21, 2025
0.0.6	Jan 29, 2025
0.0.5	Jan 27, 2025
0.0.4	Dec 11, 2024
0.0.3	Sep 23, 2024
0.0.2	Jun 18, 2024
0.0.1	Jun 13, 2024

Download files

Source Distribution

stminer-1.1.4.tar.gz (55.4 kB)

Uploaded Apr 24, 2026 Source

Built Distribution

stminer-1.1.4-py3-none-any.whl (56.3 kB)

Uploaded Apr 24, 2026 Python 3

File details

Details for the file stminer-1.1.4.tar.gz.

File metadata

Download URL: stminer-1.1.4.tar.gz
Upload date: Apr 24, 2026
Size: 55.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.4

File hashes

Hashes for stminer-1.1.4.tar.gz
Algorithm	Hash digest
SHA256	`10499cc0490bb11d5363d9b7001f76a3e2b983130ab54d4095927ce1b0eac1f5`
MD5	`2bdc41c5099dcedf8543096a6e876020`
BLAKE2b-256	`2763be45904d18365e3dc8fe303c2f5b4e9873ccf2c15d43f9580dd20b23f1a1`

File details

Details for the file stminer-1.1.4-py3-none-any.whl.

File metadata

Download URL: stminer-1.1.4-py3-none-any.whl
Upload date: Apr 24, 2026
Size: 56.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.4

File hashes

Hashes for stminer-1.1.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`d6ded565555c62552b959a02f9889199fca101ab67ee8cce8a2500fdab0cf00d`
MD5	`899d15fea14991d89e5e5cbba637c0d9`
BLAKE2b-256	`b76ae95a82007582b09896e99a9be58eb3181280003ac61b86ce5573db60d66a`
