
PHASE: PHenotype prediction with Attention mechanisms for Single-cell Exploring


PHASE utilizes an attention-based neural network framework to predict clinical phenotypes from scRNA-seq data while providing interpretability of key features linked to phenotypic outcomes at both the gene and cell levels. PHASE consists of several components:

  • A data-preprocessing procedure
  • A gene feature embedding module
  • A self-attention (SA) module for cell embedding learning
  • An attention-based deep multiple instance learning (AMIL) module for aggregating all single-cell information within a sample
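
The aggregation step of the AMIL module can be illustrated with a minimal, standard-library sketch of attention-based multiple instance pooling. It is not PHASE's implementation: the function name `amil_pool`, the parameter matrices `V` and `w`, and the toy values are all assumptions made for illustration. Each cell embedding gets a score, the scores are normalized with a softmax, and the sample embedding is the attention-weighted sum of the cell embeddings.

```python
import math


def amil_pool(cell_embeddings, V, w):
    """Attention-weighted pooling of cell embeddings into one sample embedding.

    cell_embeddings: list of cells, each a list of floats (length = dim)
    V: hypothetical projection matrix, shape (hidden, dim)
    w: hypothetical scoring vector, length = hidden
    """
    # Unnormalized score for each cell: w . tanh(V h)
    scores = []
    for h in cell_embeddings:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * ui for wi, ui in zip(w, hidden)))

    # Numerically stable softmax over the cells of this sample
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    attention = [e / total for e in exps]

    # Sample embedding: attention-weighted sum of the cell embeddings
    dim = len(cell_embeddings[0])
    sample = [
        sum(a * h[j] for a, h in zip(attention, cell_embeddings))
        for j in range(dim)
    ]
    return sample, attention


# Toy sample with three cells and three-dimensional embeddings (made-up values)
cells = [[1.0, 0.0, 2.0], [0.5, 1.5, 0.0], [0.0, 0.2, 0.1]]
V = [[0.3, -0.1, 0.2], [0.0, 0.4, -0.2]]
w = [1.0, -0.5]

sample_embedding, attention = amil_pool(cells, V, w)
print("attention per cell:", attention)
print("sample embedding:", sample_embedding)
```

Because the attention weights sum to one over the cells of a sample, they can be read as each cell's relative contribution to the sample-level representation.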

A preprint of the manuscript is available on bioRxiv:

Qinhua Wu, Junxiang Ding, Ruikun He, Lijian Hui, Junwei Liu, Yixue Li. Exploring phenotype-related single-cells through attention-enhanced representation learning. bioRxiv (2024). https://doi.org/10.1101/2024.10.31.619327

[Figure: PHASE architecture diagram]

Installation

Installing PHASE package

PHASE is written in Python and can be installed using pip:

pip install phase-sc

Requirements

PHASE should run in any environment where Python is available, and it uses PyTorch for computation. Training can run on CPUs alone or with GPU acceleration. Before using PHASE, make sure the following packages are installed:

scanpy>=1.10.2  
anndata>=0.10.8  
torch>=2.4.0  
tqdm>=4.66.4  
numpy>=1.23.5  
pandas>=1.5.3  
scipy>=1.11.4  
seaborn>=0.13.2  
matplotlib==3.6.3  
captum==0.7.0  
scikit-learn>=1.5.1  

To install these dependencies, you can run the following command using pip:

pip install "scanpy>=1.10.2" "anndata>=0.10.8" "torch>=2.4.0" "tqdm>=4.66.4" "numpy>=1.23.5" "pandas>=1.5.3" "scipy>=1.11.4" "seaborn>=0.13.2" "matplotlib==3.6.3" "captum==0.7.0" "scikit-learn>=1.5.1"

Alternatively, add the requirement lines above to a requirements.txt file and install them with:

pip install -r requirements.txt
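
To see which of the listed packages are present in the current environment, a small check along these lines may help. It is only a sketch that uses the standard-library `importlib.metadata` module. The helper name `installed_versions` is hypothetical, and the script reports versions without comparing them against the minimum versions above.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names taken from the requirements list above
PACKAGES = [
    "scanpy", "anndata", "torch", "tqdm", "numpy", "pandas",
    "scipy", "seaborn", "matplotlib", "captum", "scikit-learn",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(PACKAGES).items():
    print(f"{name:<14}{ver or 'not installed'}")
```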

The PHASE pipeline

  1. Predict clinical phenotypes from scRNA-seq data

    • 1.1 Data preprocessing: Encode the data into a format that can be read by PHASE.
    • 1.2 Gene feature embedding: Extract and represent gene features.
    • 1.3 Self-attention (SA): Learn cell embeddings.
    • 1.4 Attention-based deep multiple instance learning (AMIL): Aggregate all single-cell information within a sample.
  2. Provide interpretability of key phenotype-related features

    • 2.1 Attribution analysis: Use Integrated Gradients (IG) to link genes to phenotypes via attribution scores.
    • 2.2 Attention analysis: Use AMIL attention scores to relate individual cells to the phenotype.
    • 2.3 Conjoint analysis: Correlate top genes' expression levels with cells' attention scores to reveal gene-cell contributions to the phenotype.
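
The attribution step (2.1) can be illustrated with a standard-library sketch of Integrated Gradients on a toy logistic model. This is not PHASE's network or its attribution code: the model, its weights, the all-zero baseline, and the function names are assumptions chosen for illustration. For PyTorch models, libraries such as captum (listed in the requirements) provide an Integrated Gradients implementation. The sketch approximates the path integral of the gradient from the baseline to the input with a midpoint Riemann sum.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def model(x, weights, bias):
    """Toy stand-in for a phenotype predictor: logistic regression on gene values."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def model_gradient(x, weights, bias):
    """Analytic gradient of the toy model's output with respect to each input."""
    p = model(x, weights, bias)
    return [p * (1.0 - p) * w for w in weights]


def integrated_gradients(x, baseline, weights, bias, steps=200):
    """Approximate IG_i = (x_i - b_i) * integral over alpha of dF/dx_i along the path."""
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th sub-interval
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grad = model_gradient(point, weights, bias)
        total = [t + g for t, g in zip(total, grad)]
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, total)]


# Made-up expression values for three genes and made-up model parameters
x = [2.0, 0.0, 1.0]
baseline = [0.0, 0.0, 0.0]
weights = [1.5, -2.0, 0.1]
bias = -0.5

attributions = integrated_gradients(x, baseline, weights, bias)
print("attribution per gene:", attributions)
# Completeness check: attributions should sum to F(x) - F(baseline)
print("sum of attributions:", sum(attributions))
print("output difference:  ", model(x, weights, bias) - model(baseline, weights, bias))
```

The attributions sum approximately to the difference between the model output at the input and at the baseline, which is a convenient sanity check on any Integrated Gradients computation.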

Usage

Command Line Arguments

The following table lists the command line arguments available for training the model:

Abbreviation  Parameter       Description
-t            --type          Type of task: classification or regression.
-p            --path          Path to the dataset.
-r            --result        Path to the directory where results will be saved.
-e            --epoch         Number of training epochs (default: 100).
-l            --learningrate  Learning rate for the optimizer (default: 0.00001).
-d            --devices       List of GPU device IDs to use for training (default: first GPU).

Each argument is required unless a default value is specified.
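
As a rough illustration of how these options fit together, the table can be mirrored with the standard-library `argparse` module. This is a sketch, not PHASE's actual parser. The argument types, the use of `nargs="+"` for device IDs, and the interpretation of "first GPU" as device 0 are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented PHASEtrain options."""
    parser = argparse.ArgumentParser(prog="PHASEtrain")
    parser.add_argument("-t", "--type", required=True,
                        choices=["classification", "regression"],
                        help="Type of task")
    parser.add_argument("-p", "--path", required=True,
                        help="Path to the dataset")
    parser.add_argument("-r", "--result", required=True,
                        help="Directory where results will be saved")
    parser.add_argument("-e", "--epoch", type=int, default=100,
                        help="Number of training epochs")
    parser.add_argument("-l", "--learningrate", type=float, default=0.00001,
                        help="Learning rate for the optimizer")
    # Assumption: "first GPU" is taken to mean device ID 0
    parser.add_argument("-d", "--devices", type=int, nargs="+", default=[0],
                        help="GPU device IDs to use for training")
    return parser


# Only the required arguments are given, so the defaults fill in the rest
args = build_parser().parse_args(
    ["-t", "classification", "-p", "demo_covid.h5ad", "-r", "result"]
)
print(args.type, args.epoch, args.learningrate, args.devices)
```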

Example

PHASEtrain -t classification -p /home/user/PHASE/demo_covid.h5ad -r /home/user/PHASE/result -e 100 -l 0.00001 -d 2
