CellDrift: A Python Package to Infer Temporal Patterns of Perturbation Effects in Single Cell Data

CellDrift

CellDrift: temporal perturbation effects for single cell data

Perturbation effects on gene programs are commonly investigated in single-cell experiments. Existing models measure perturbation responses independently at each time point, disregarding the temporal consistency of specific gene programs. We introduce CellDrift, a functional data analysis approach based on generalized linear models, to investigate temporal gene patterns in response to perturbations.

Reference

CellDrift: Inferring Perturbation Responses in Temporally-Sampled Single Cell Data. bioRxiv, Apr 2022. https://www.biorxiv.org/content/10.1101/2022.04.13.488194v1

Prerequisites

# Creating a new conda environment is recommended (Python 3.7 is recommended)
conda create -n celldrift_py python=3.7
# Activate the celldrift environment
conda activate celldrift_py
# Install the prerequisite package scikit-fda (development version)
pip install git+https://github.com/GAA-UAM/scikit-fda.git

Installation

git clone https://github.com/KANG-BIOINFO/CellDrift.git
cd CellDrift
pip install .
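
After installation, it can help to confirm that the packages are importable before starting the tutorial. The sketch below uses only the standard library. The helper name `check_importable` is hypothetical; the module names `skfda` (the import name of scikit-fda), `scanpy` and `CellDrift` come from the install steps and the imports used in this README.

```python
import importlib.util

def check_importable(module_names):
    """Return {module_name: True/False}, True when the top-level module can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in module_names}

# Report which of the tutorial's dependencies are visible in the active environment.
for name, found in check_importable(["skfda", "scanpy", "CellDrift"]).items():
    print(f"{name}: {'found' if found else 'MISSING'}")
```

A `MISSING` entry usually means the wrong conda environment is active or an install step was skipped.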

Tutorial

Quick Start

import numpy as np
import pandas as pd
import scanpy as sc
import CellDrift as ct

1. Load and prepare data

adata = sc.read("example.h5ad")
# total counts per cell; flattening handles both dense and sparse adata.X
adata.obs['size_factor'] = np.asarray(adata.X.sum(axis = 1)).ravel()
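
The size factor here is the total count per cell. When `adata.X` is a SciPy sparse matrix, summing along an axis can return a 2-D result rather than a flat array, so it is safer to flatten it before storing it in `adata.obs`. A minimal sketch on a toy matrix, where `total_counts_per_cell` is a hypothetical helper and not part of CellDrift:

```python
import numpy as np
from scipy import sparse

def total_counts_per_cell(X):
    """Sum counts across genes (axis 1) and return a flat 1-D float array."""
    return np.asarray(X.sum(axis=1), dtype=float).ravel()

# Toy count matrix: 2 cells x 3 genes (made-up values).
dense = np.array([[1, 0, 3],
                  [0, 2, 5]])
as_sparse = sparse.csr_matrix(dense)

print(total_counts_per_cell(dense))      # [4. 7.]
print(total_counts_per_cell(as_sparse))  # [4. 7.]
```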

2. Set up CellDrift object

adata = ct.setup_celldrift(
    adata, 
    cell_type_key = 'cell_type',
    perturb_key = 'perturb', 
    time_key = 'time', # name of the time covariate; its values must be numeric
    control_name = 'Control', 
    perturb_name = None, 
    size_factor_key = 'size_factor', 
    batch_key = 'batch', 
    n_reps = 3,
    n_cells_perBlock = 100,
    use_pseudotime = False,
    min_cells_perGene = 0
)
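
Because the time covariate must be numeric, time points stored as text labels need converting before setup. The sketch below shows one way to do that with pandas. The column name `time_label`, the `"6h"` label format and the helper `to_numeric_time` are assumptions made for illustration; adapt them to how time is recorded in your own `adata.obs`.

```python
import pandas as pd

def to_numeric_time(labels):
    """Extract the first number from labels such as '6h' and return a float Series."""
    as_text = pd.Series(list(labels)).astype(str)
    extracted = as_text.str.extract(r"(\d+(?:\.\d+)?)", expand=False)
    if extracted.isna().any():
        raise ValueError("some time labels contain no number")
    return extracted.astype(float)

# Stand-in for adata.obs with a text time column (hypothetical values).
obs = pd.DataFrame({"time_label": ["0h", "6h", "12h", "6h"]})
obs["time"] = to_numeric_time(obs["time_label"]).to_numpy()

print(pd.api.types.is_numeric_dtype(obs["time"]))
```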

3. Run GLM model

adata = ct.model_timescale(
    adata, 
    n_processes = 16, # number of processes for multiprocessing
    chunksize = 100, # number of genes in each chunk
    pairwise_contrast_only = True, 
    adjust_batch = False
)
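
To get a feel for `chunksize`, the sketch below shows how a gene list splits into chunks of a given size. How CellDrift actually distributes chunks across processes is not described here; `split_into_chunks` is a hypothetical helper and the gene names are made up.

```python
def split_into_chunks(items, chunksize):
    """Split a sequence into consecutive chunks of at most `chunksize` items."""
    if chunksize < 1:
        raise ValueError("chunksize must be at least 1")
    return [items[i:i + chunksize] for i in range(0, len(items), chunksize)]

genes = [f"gene_{i}" for i in range(2050)]
chunks = split_into_chunks(genes, chunksize=100)

# 2050 genes in chunks of 100: 20 full chunks plus one chunk of 50.
print(len(chunks), len(chunks[-1]))
```

With many more chunks than processes, work tends to balance out better across workers; very small chunks add scheduling overhead instead.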

4. Set up FDA object

df_zscore = pd.read_csv('Temporal_CellDrift/Contrast_Coefficients_combined_zscores_.txt', sep = '\t', header = 0, index_col = 0) # CellDrift z scores
df_meta = pd.read_csv('Temporal_CellDrift/Contrast_Coefficients_combined_metadata_.txt', sep = '\t', header = 0, index_col = 0) # metadata of contrast comparisons

fda = ct.FDA(df_zscore, df_meta)

5. Temporal clustering

fd, genes = fda.create_fd_genes(genes = df_zscore.index.values, cell_type = 'Type_0', perturbation = 'Perturb_0')
df_cluster = ct.fda_cluster(fd, genes, n_clusters = 3)
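
As a rough picture of what clustering temporal curves means, the sketch below runs a plain k-means on toy curves sampled at a few time points. It is a generic illustration only: `ct.fda_cluster` may use a different algorithm (the `clusters_fuzzy` key used in step 6 hints at fuzzy clustering), and the function `kmeans_curves` and the toy data are assumptions.

```python
import numpy as np

def kmeans_curves(curves, n_clusters, n_iter=50):
    """Plain k-means on curves given as rows of a (n_curves, n_timepoints) array."""
    curves = np.asarray(curves, dtype=float)
    # Deterministic farthest-point initialisation, starting from the first curve.
    centers = [curves[0]]
    while len(centers) < n_clusters:
        nearest = np.min([np.linalg.norm(curves - c, axis=1) for c in centers], axis=0)
        centers.append(curves[np.argmax(nearest)])
    centers = np.array(centers)
    for _ in range(n_iter):
        # Distance from every curve to every center, then assign to the closest.
        dists = np.linalg.norm(curves[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            curves[labels == k].mean(axis=0) if np.any(labels == k) else centers[k]
            for k in range(n_clusters)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Toy z-score curves: three rising, three falling, three flat, plus small noise.
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 6)
shapes = [3 * t, -3 * t, np.zeros_like(t)]
toy_curves = np.array([s + rng.normal(0, 0.1, t.size) for s in shapes for _ in range(3)])

toy_labels, toy_centers = kmeans_curves(toy_curves, n_clusters=3)
print(toy_labels)
```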

6. Visualization for each temporal cluster

ct.draw_smoothing_clusters(
    fd, 
    df_cluster, 
    n_neighbors = 2, 
    bandwidth = 1,
    cluster_key = 'clusters_fuzzy', 
    output_folder = 'Temporal_CellDrift/cluster_fuzzy/'
)
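
The `bandwidth` argument suggests that curves are kernel-smoothed before plotting, although what CellDrift does internally with `n_neighbors` and `bandwidth` is not stated here. As a generic illustration of what a bandwidth controls, the sketch below implements a Nadaraya-Watson smoother with a Gaussian kernel in NumPy. The function name and the toy z-scores are assumptions.

```python
import numpy as np

def nadaraya_watson(t_obs, y_obs, t_eval, bandwidth):
    """Gaussian-kernel weighted average of y_obs, evaluated at each point in t_eval."""
    t_obs = np.asarray(t_obs, dtype=float)
    y_obs = np.asarray(y_obs, dtype=float)
    t_eval = np.asarray(t_eval, dtype=float)
    # weights[i, j]: influence of observation j on evaluation point i
    scaled = (t_eval[:, None] - t_obs[None, :]) / bandwidth
    weights = np.exp(-0.5 * scaled ** 2)
    return (weights @ y_obs) / weights.sum(axis=1)

time_points = np.array([0, 1, 2, 3, 4, 5], dtype=float)
z_scores = np.array([0.1, 0.8, 2.5, 2.1, 0.9, 0.2])  # made-up values

smooth_narrow = nadaraya_watson(time_points, z_scores, time_points, bandwidth=0.5)
smooth_wide = nadaraya_watson(time_points, z_scores, time_points, bandwidth=3.0)

print(np.round(smooth_narrow, 2))
print(np.round(smooth_wide, 2))
```

A small bandwidth follows the observed values closely, while a large one flattens the curve toward the overall mean.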

Project details

Release history

0.1.3 (this version): Jul 8, 2022
0.1.0: Jul 7, 2022

Download files

Source Distribution

CellDrift-0.1.3.tar.gz (2.3 kB)

Uploaded Jul 8, 2022 Source

Built Distribution

CellDrift-0.1.3-py3-none-any.whl (2.7 kB)

Uploaded Jul 8, 2022 Python 3

File details

Details for the file CellDrift-0.1.3.tar.gz.

File metadata

Download URL: CellDrift-0.1.3.tar.gz
Upload date: Jul 8, 2022
Size: 2.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.1 CPython/3.7.10

File hashes

Hashes for CellDrift-0.1.3.tar.gz
Algorithm	Hash digest
SHA256	`54b7a369f1ac97baec398b82d5292a68d43d6533a787c5b53a9d7f0677401346`
MD5	`b6bd2a8e1284c73008d7f997074476bb`
BLAKE2b-256	`1235d141fb284184265af08784148f17a7e00d50b5f159087dc3979a6bba32bd`

File details

Details for the file CellDrift-0.1.3-py3-none-any.whl.

File metadata

Download URL: CellDrift-0.1.3-py3-none-any.whl
Upload date: Jul 8, 2022
Size: 2.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.1 CPython/3.7.10

File hashes

Hashes for CellDrift-0.1.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`18b65ee9c51329a825e14b516b9cc2ea39eca2a7429d347b98e69a14cf275442`
MD5	`1f96b1ea5e6403b4e2bcfab9a444cdf4`
BLAKE2b-256	`fba1c885c40e1a55928885eaa951e687218be832798ff4cb2ffbd54a7430b966`