An automated cell type annotation algorithm for unmatched spatial transcriptomics data

Project description

SANNO

The official implementation for "SANNO".

Datasets
Installation
Usage
Tutorial
Citation

Datasets

We provide preprocessed datasets for easy reproduction.

Download datasets from: Dataset Link

Installation

To use SANNO, follow these steps:

Create a conda environment:

conda create -n SANNO python=3.7
conda activate SANNO

Install dependencies:
```
pip install -r requirements.txt
```

Install PyG and PyTorch builds that match your CUDA version. For example, with torch-1.13.1+cu117 (Ubuntu 20.04.4 LTS):

conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.7 -c pytorch -c nvidia
pip install torch_geometric==2.3.0 # must be this version
pip install pyg_lib torch_scatter torch_sparse torch_cluster torch_spline_conv -f https://data.pyg.org/whl/torch-1.13.1+cu117.html
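
After installation, it can help to confirm that the pinned versions are the ones Python actually imports. The following is a minimal sketch and not part of SANNO: the helper name `report_versions` and the `EXPECTED` table are illustrative, and the version prefixes are simply copied from the example commands above.

```python
import importlib

# Version prefixes taken from the example install commands; adjust them if
# you installed builds for a different CUDA version.
EXPECTED = {"torch": "1.13.1", "torch_geometric": "2.3.0"}


def report_versions(expected=EXPECTED):
    """Return {package: (installed_version_or_None, matches_expected)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = (None, False)
            continue
        version = getattr(module, "__version__", "unknown")
        # torch may report a local suffix such as "1.13.1+cu117".
        report[name] = (version, version.startswith(wanted))
    return report


if __name__ == "__main__":
    for name, (version, ok) in report_versions().items():
        print(f"{name}: {version or 'not installed'} ({'ok' if ok else 'check'})")
```

A package reported as "not installed" or "check" suggests the active environment differs from the one the install commands were run in.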

Usage

Data Preprocessing

To run SANNO, first create AnnData objects from the raw data.

SANNO requires two datasets: reference data and query data. Both should be provided in .h5ad format, with cells stored in obs and genes/features stored in var.

Reference Data

Format: .h5ad
Content:
- obs: Cell metadata, including a mandatory cell_type column indicating the true cell type labels.
- var: Gene/feature metadata.
- obsm: Spatial coordinates stored under the key pos, representing the relative positions of cells as a 2D numpy array (n_cells x 2).

Query Data

Format: .h5ad
Content:
- obs: Cell metadata (cell type labels are not required).
- var: Gene/feature metadata.
- obsm: Spatial coordinates stored under the key pos, representing the relative positions of cells as a 2D numpy array (n_cells x 2).
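
The requirements above can be checked before training. The sketch below is an illustration rather than SANNO code: the function name `validate_sanno_input` is hypothetical, and it only assumes the object exposes `obs.columns`, `obsm`, and `n_obs` the way an AnnData object does.

```python
def validate_sanno_input(adata, require_labels):
    """Return a list of problems found; an empty list means the layout looks right.

    `adata` is expected to behave like an AnnData object (obs, obsm, n_obs).
    Set `require_labels=True` for reference data and False for query data.
    """
    problems = []
    if require_labels and "cell_type" not in adata.obs.columns:
        problems.append("obs is missing the 'cell_type' column")
    if "pos" not in adata.obsm:
        problems.append("obsm is missing the 'pos' key")
    else:
        shape = tuple(getattr(adata.obsm["pos"], "shape", ()))
        if len(shape) != 2 or shape[1] != 2:
            problems.append(f"obsm['pos'] has shape {shape}, expected (n_cells, 2)")
        elif shape[0] != adata.n_obs:
            problems.append(
                f"obsm['pos'] has {shape[0]} rows but there are {adata.n_obs} cells"
            )
    return problems
```

With anndata installed, one might call `validate_sanno_input(anndata.read_h5ad("path/to/train_adata.h5ad"), require_labels=True)` for the reference data and pass `require_labels=False` for the query data.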

Cell Type Annotation

SANNO takes the processed reference and query data as input and produces embeddings and annotations that incorporate the spatial transcriptomics information in the reference:

cd SANNO/SANNO

# --gpu_index      GPU index
# --type           project type (st2st, st2sc, or sc2sc)
# --dataset        project name (project_name is a placeholder)
# --train_dataset  reference data
# --test_dataset   query data
# --log            log path
python main_xy_adj.py --gpu_index 3 \
                      --type st2st \
                      --dataset project_name \
                      --train_dataset path/to/train_adata.h5ad \
                      --test_dataset path/to/test_adata.h5ad \
                      --log log

The project type must be selected based on the nature of the reference and query datasets. The following modes are supported:

st2st – For cases where both the reference and query datasets are spatial transcriptomics.
st2sc – For cases where the reference dataset is spatial transcriptomics, and the query dataset is single-cell transcriptomics.
sc2sc – For cases where both the reference and query datasets are single-cell transcriptomics.
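
When launching runs from a script, the mode can be validated before the command is assembled. This is a minimal sketch under stated assumptions: `build_sanno_command` and `SUPPORTED_TYPES` are hypothetical names, the defaults for `log_path` and `gpu_index` are illustrative, and only the flags shown in the command above are used.

```python
import shlex

# The three modes listed above.
SUPPORTED_TYPES = ("st2st", "st2sc", "sc2sc")


def build_sanno_command(project_type, dataset, train_path, test_path,
                        log_path="log", gpu_index=0):
    """Return the argument list for main_xy_adj.py, rejecting unknown modes."""
    if project_type not in SUPPORTED_TYPES:
        raise ValueError(
            f"unknown project type {project_type!r}; expected one of {SUPPORTED_TYPES}"
        )
    return [
        "python", "main_xy_adj.py",
        "--gpu_index", str(gpu_index),
        "--type", project_type,
        "--dataset", dataset,
        "--train_dataset", str(train_path),
        "--test_dataset", str(test_path),
        "--log", str(log_path),
    ]


if __name__ == "__main__":
    cmd = build_sanno_command(
        "st2st", "my_project",
        "path/to/train_adata.h5ad", "path/to/test_adata.h5ad",
        gpu_index=3,
    )
    # Print a shell-safe version of the command; pass `cmd` to
    # subprocess.run(cmd, check=True, cwd="SANNO/SANNO") to launch it.
    print(" ".join(shlex.quote(part) for part in cmd))
```

Building the command as a list avoids shell quoting problems when a project name or path contains spaces.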

Running the above command will generate three output files in the output path:

acc.csv: Contains the overall accuracy of SANNO's predictions on the query data.
embedding.h5ad: An AnnData file storing the embeddings extracted by SANNO.
Reports: A set of logs recorded during the training process.
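
After a run, the outputs can be inspected with a few lines of Python. The sketch below is illustrative only: `summarize_outputs` is a hypothetical helper, and because the column layout of acc.csv is not described here, the rows are returned uninterpreted.

```python
import csv
from pathlib import Path


def summarize_outputs(output_dir):
    """Report which expected files exist and return the rows of acc.csv."""
    output_dir = Path(output_dir)
    found = {name: (output_dir / name).is_file()
             for name in ("acc.csv", "embedding.h5ad")}
    rows = []
    if found["acc.csv"]:
        # The column layout of acc.csv is not documented here, so the rows
        # are returned as plain lists of strings without interpretation.
        with open(output_dir / "acc.csv", newline="") as handle:
            rows = list(csv.reader(handle))
    return found, rows
```

embedding.h5ad can be opened with `anndata.read_h5ad`. Where SANNO stores the embeddings inside that file is not stated here, so inspect the loaded object to locate them.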

Tutorial

Tutorial 1: Cell annotations within samples (HubMap CL A & HubMap CL B)

Install the required environment according to Installation.
Download the datasets from HubMap CL.
Preprocess the datasets according to the Data Preprocessing standards.
For details on data preprocessing and training, run the tutorial notebook HubMap_CL_intra.ipynb.

Tutorial 2: Cell annotations across samples (Tonsil & BE)

Install the required environment according to Installation.
Download the datasets from Tonsil_BE.
Preprocess the datasets according to the Data Preprocessing standards.
For details on data preprocessing and training, run the tutorial notebook HubMap_CL_intra.ipynb.

Citation

If you use SANNO in your research, please cite:

@article{yourcitation,
  title={{Your Paper Title}},
  author={Your Name, Coauthor Name},
  journal={Journal Name},
  volume={00},
  pages={1--10},
  year={2024}
}

Project details

Release history

0.1.2 (this version): Mar 2, 2025
0.1.1: Mar 2, 2025
0.1: Mar 2, 2025

