Python package for spatial transcriptomics data analysis
Introduction
Spatial transcriptomics extends transcriptomics by adding positional information. However, an emerging problem is how to find the gene expression patterns that reveal distinct regions in a tissue, and how to identify the genes that are expressed only in those regions.
Here we propose “STMiner”, which is based on the Gaussian mixture model (GMM), to solve this problem. STMiner is a bottom-up algorithm. It starts by fitting a parametric model to the spatial distribution of each gene, then builds a distance array between the fitted models using the Hellinger distance. Finally, the genes are clustered on these distances, which reveals the spatial co-expression patterns shared by each group of genes.
Please visit STMiner Documents for details.
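To give a feel for the distance used above, here is a minimal sketch of the Hellinger distance between two genes, each treated as a discrete distribution over the same spots. This is an illustration only: STMiner compares fitted GMMs, and its internal computation may differ. The function name `hellinger_distance` and the toy expression vectors are hypothetical.

```python
import math


def hellinger_distance(p, q):
    """Hellinger distance between two discrete distributions on the same spots."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total_p, total_q = sum(p), sum(q)
    if total_p <= 0 or total_q <= 0:
        raise ValueError("each distribution needs positive total mass")
    # Normalize raw expression values so that each gene sums to 1 over the spots.
    squared = sum(
        (math.sqrt(a / total_p) - math.sqrt(b / total_q)) ** 2
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / 2.0)


# Hypothetical expression of three genes over the same four spots.
gene_a = [8, 2, 0, 0]
gene_b = [4, 1, 0, 0]  # same spatial shape as gene_a, half the counts
gene_c = [0, 0, 3, 7]  # expressed in a disjoint region

d_ab = hellinger_distance(gene_a, gene_b)  # identical shape, so the distance is 0
d_ac = hellinger_distance(gene_a, gene_c)  # disjoint support, so the distance is 1
```

The distance is bounded between 0 (same spatial distribution) and 1 (no overlap), so it depends on where a gene is expressed rather than on how strongly.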
Quick start by example
Import package
from STMiner import SPFinder
Load data
You can download test data here.
sp = SPFinder()
file_path = 'D://10X_Visium_hunter2021spatially_sample_C_data.h5ad'
sp.read_h5ad(file=file_path)
Find spatially highly variable genes
sp.get_genes_csr_array(min_cells=500, log1p=False)
sp.spatial_high_variable_genes()
You can check the distance of each gene with:
sp.global_distance
| Gene | Distance |
|---|---|
| geneA | 9998 |
| geneB | 9994 |
| ... | ... |
| geneC | 8724 |
Preprocess and Fit GMM
sp.fit_pattern(n_comp=20, gene_list=list(sp.global_distance[:1000]['Gene']))
Each gene's GMM has 20 components, as set by n_comp=20.
Build distance matrix & clustering
sp.build_distance_array()
sp.cluster_gene(n_clusters=6, mds_components=20)
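The attribute names mds_features and kmeans_fit_result suggest that clustering embeds the distance array with MDS and then runs k-means on the embedding. The sketch below illustrates that general idea with NumPy only, using classical (Torgerson) MDS and a plain Lloyd's k-means. It is not STMiner's implementation, which may use a different MDS variant and initialization. The helper names `classical_mds` and `kmeans` and the toy distance matrix are hypothetical.

```python
import numpy as np


def classical_mds(dist, n_components):
    """Embed points from a symmetric distance matrix (classical/Torgerson MDS)."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-center the squared distances to obtain a Gram matrix.
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


def kmeans(points, n_clusters, n_iter=50):
    """Lloyd's k-means with deterministic farthest-point initialization."""
    centers = [points[0]]
    for _ in range(1, n_clusters):
        nearest = np.min(
            [np.linalg.norm(points - c, axis=1) for c in centers], axis=0
        )
        centers.append(points[np.argmax(nearest)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        for k in range(n_clusters):
            if np.any(labels == k):
                centers[k] = points[labels == k].mean(axis=0)
    return labels


# Toy distance matrix for four genes: genes 0 and 1 are close, as are genes 2 and 3.
dist = np.array([
    [0.0, 0.1, 0.9, 0.9],
    [0.1, 0.0, 0.9, 0.9],
    [0.9, 0.9, 0.0, 0.1],
    [0.9, 0.9, 0.1, 0.0],
])

features = classical_mds(dist, n_components=2)  # one row of MDS features per gene
labels = kmeans(features, n_clusters=2)         # one cluster label per gene
```

In this toy case the two close pairs end up in separate clusters. In the call above, n_clusters=6 and mds_components=20 play the roles of `n_clusters` and `n_components`.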
Result & Visualization
The result is stored in genes_labels:
sp.genes_labels
The output looks like the following:
| | gene_id | labels |
|---|---|---|
| 0 | Cldn5 | 2 |
| 1 | Fyco1 | 2 |
| 2 | Pmepa1 | 2 |
| 3 | Arhgap5 | 0 |
| 4 | Apc | 5 |
| .. | ... | ... |
| 95 | Cyp2a5 | 0 |
| 96 | X5730403I07Rik | 0 |
| 97 | Ltbp2 | 2 |
| 98 | Rbp4 | 4 |
| 99 | Hist1h1e | 4 |
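A common next step is to list the genes that belong to each pattern. The sketch below does this in plain Python on the ten rows shown in the table above; it assumes nothing about STMiner beyond the gene_id and labels columns, and the variable names are hypothetical.

```python
from collections import defaultdict

# Rows copied from the example output above: (gene_id, label).
rows = [
    ("Cldn5", 2), ("Fyco1", 2), ("Pmepa1", 2), ("Arhgap5", 0), ("Apc", 5),
    ("Cyp2a5", 0), ("X5730403I07Rik", 0), ("Ltbp2", 2), ("Rbp4", 4),
    ("Hist1h1e", 4),
]

# Collect the gene names under their pattern label, keeping table order.
genes_by_label = defaultdict(list)
for gene_id, label in rows:
    genes_by_label[label].append(gene_id)

for label in sorted(genes_by_label):
    print(label, genes_by_label[label])
```

Since genes_labels is a pandas DataFrame, a call such as `sp.genes_labels.groupby('labels')['gene_id'].apply(list)` should give the same grouping, assuming the column names match those shown in the table.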
To visualize the patterns:
sp.get_pattern_array(vote_rate=0.3)
sp.plot.plot_pattern(vmax=99,
                     heatmap=False,
                     s=5,
                     reverse_y=True,
                     reverse_x=True,
                     image_path='E://cut_img.png',
                     rotate_img=True,
                     k=4,
                     aspect=0.55)
Visualize the intersections between patterns 0 & 1:
sp.plot.plot_intersection(pattern_list=[0, 1],
                          image_path='E://OneDrive - stu.xjtu.edu.cn/paper/cut_img.png',
                          reverse_y=True,
                          reverse_x=True,
                          aspect=0.55,
                          s=20)
To visualize the gene expression by labels:
sp.plot.plot_genes(label=0, vmax=99)
Attributes of the STMiner.SPFinder object
| Attribute | Type | Description |
|---|---|---|
| adata | AnnData | AnnData object holding the loaded spatial data |
| global_distance | pd.DataFrame | OT distance between each gene and the background |
| genes_labels | pd.DataFrame | Gene names and their pattern labels |
| genes_patterns | dict | GMM fitted for each gene |
| genes_distance_array | pd.DataFrame | Pairwise distances between the genes' GMMs |
| kmeans_fit_result | obj | Result of k-means clustering |
| mds_features | pd.DataFrame | Embedding features after MDS |