GEARS
GEARS: Predicting transcriptional outcomes of novel multi-gene perturbations
This repository hosts the official implementation of GEARS, a method that predicts the transcriptional response to both single- and multi-gene perturbations using single-cell RNA-sequencing data from perturbational screens.
Installation
Install PyG, and then run `pip install cell-gears`.
[New] Updates in v0.1.1
- Fixed training breakpoint bug from v0.1.0
- Preprocessed dataloader now available for Replogle 2022 RPE1 and K562 essential datasets
- Added custom split, fixed no-test split
Core API Interface
Using the API, you can (1) reproduce the results in our paper and (2) train GEARS on your perturbation dataset using a few lines of code.
```python
from gears import PertData, GEARS

# get data
pert_data = PertData('./data')
# load a dataset from the paper: 'norman', 'adamson', or 'dixit'
pert_data.load(data_name = 'norman')
# specify data split
pert_data.prepare_split(split = 'simulation', seed = 1)
# get dataloader with batch size
pert_data.get_dataloader(batch_size = 32, test_batch_size = 128)

# set up and train a model
# (change the device string to a GPU available on your machine, e.g. 'cuda:0')
gears_model = GEARS(pert_data, device = 'cuda:8')
gears_model.model_initialize(hidden_size = 64)
gears_model.train(epochs = 20)

# save/load model
gears_model.save_model('gears')
gears_model.load_pretrained('gears')

# predict the outcome of a two-gene and a single-gene perturbation
gears_model.predict([['CBL', 'CNN1'], ['FEV']])
# predict the genetic interaction (GI) for a gene pair
gears_model.GI_predict(['CBL', 'CNN1'], GI_genes_file=None)
```
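The `predict` call above takes a list of perturbations, each of which is itself a list of gene names. As a minimal sketch, a query list covering every single gene and every gene pair from a candidate set could be built with the standard library. `build_perturbation_queries` is a hypothetical helper, not part of the GEARS API, and it assumes at most two genes per perturbation, matching the example call.

```python
from itertools import combinations


def build_perturbation_queries(genes, max_size=2):
    """Return every combination of 1..max_size genes as a list of lists.

    Hypothetical helper (not part of GEARS): it only formats the input
    for a call such as gears_model.predict(...).
    """
    unique_genes = sorted(set(genes))  # drop duplicates and fix the order
    queries = []
    for size in range(1, max_size + 1):
        for combo in combinations(unique_genes, size):
            queries.append(list(combo))
    return queries


queries = build_perturbation_queries(['CBL', 'CNN1', 'FEV'])
# The result can then be passed to a trained model, e.g.:
# gears_model.predict(queries)
```

Sorting the gene names keeps the query order reproducible across runs, which makes it easier to line predictions up with their perturbations afterwards.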
To use your own dataset, create a scanpy `adata` object with a `gene_name` column in `adata.var` and two columns, `condition` and `cell_type`, in `adata.obs`. Then run:
```python
# process the custom dataset
pert_data.new_data_process(dataset_name = 'XXX', adata = adata)
# to load the processed data
pert_data.load(data_path = './data/XXX')
```
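Before calling `new_data_process`, it can help to confirm that the required columns are present. The sketch below uses a hypothetical helper, `missing_gears_columns`, which is not part of GEARS. It only compares column names and assumes nothing about the values stored in them.

```python
REQUIRED_VAR_COLUMNS = ('gene_name',)
REQUIRED_OBS_COLUMNS = ('condition', 'cell_type')


def missing_gears_columns(var_columns, obs_columns):
    """Return the required column names that are absent.

    Hypothetical helper: accepts any iterables of column names, such as
    adata.var.columns and adata.obs.columns from an AnnData object.
    """
    var_present = set(var_columns)
    obs_present = set(obs_columns)
    return {
        'var': [c for c in REQUIRED_VAR_COLUMNS if c not in var_present],
        'obs': [c for c in REQUIRED_OBS_COLUMNS if c not in obs_present],
    }


# Plain lists stand in here for adata.var.columns and adata.obs.columns.
report = missing_gears_columns(['gene_name'], ['condition', 'batch'])
```

With a real dataset the call would be `missing_gears_columns(adata.var.columns, adata.obs.columns)`; two empty lists in the result mean all three columns exist.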
Demos
Name | Description |
---|---|
Dataset Tutorial | Tutorial on how to use the dataset loader and read customized data |
Model Tutorial | Tutorial on how to train GEARS |
Plot top 20 DE genes | Tutorial on how to plot the top 20 DE genes |
Uncertainty | Tutorial on how to train an uncertainty-aware GEARS model |
Colab
Name | Description |
---|---|
Using Trained Model | Use a model trained on Norman et al. 2019 to make predictions (Needs Colab Pro) |
Cite Us
@article{roohani2023predicting,
title={Predicting transcriptional outcomes of novel multigene perturbations with {GEARS}},
author={Roohani, Yusuf and Huang, Kexin and Leskovec, Jure},
journal={Nature Biotechnology},
year={2023},
publisher={Nature Publishing Group US New York}
}
Paper: Link
Code for reproducing figures: Link
File details
Details for the file `cell-gears-0.1.2.tar.gz`.
File metadata
- Download URL: cell-gears-0.1.2.tar.gz
- Upload date:
- Size: 28.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.8.10
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | 11b8a180b0af7fa797999c0272a5f1c39f5e89bb6d09b784df5cd2b892507959 |
MD5 | d3c7932ef18cc2fa6dedb85a8bd549a5 |
BLAKE2b-256 | 3734bb7c4a418fbb3c1f7216bc1d6396b328ef9c338631c616295da7d444d6b3 |
File details
Details for the file `cell_gears-0.1.2-py3-none-any.whl`.
File metadata
- Download URL: cell_gears-0.1.2-py3-none-any.whl
- Upload date:
- Size: 31.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.8.10
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | 4db11fd69ce4825cf9a3e4a533c7e864ad0de3d60260bc9e88fc5d6a3c7b095a |
MD5 | ac42ce67baa90be9499d97e090856a3d |
BLAKE2b-256 | ac557732f2dfc51c9685da00c6b824c2f8ad734a6ec4fcf54db6890c8bfc1ff1 |
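The published digests can be used to check a manually downloaded file. The following is a minimal sketch using only Python's standard `hashlib` module. The helper names `sha256_of_file` and `matches_published_hash` are illustrative, and the local file path is an assumption about where the download was saved.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, 'rb') as handle:
        # iter() with a sentinel stops when read() returns empty bytes
        for chunk in iter(lambda: handle.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Return True if the file's SHA256 digest equals the published one."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For the wheel listed above, assuming it was saved in the current directory, the check would be `matches_published_hash('cell_gears-0.1.2-py3-none-any.whl', '4db11fd69ce4825cf9a3e4a533c7e864ad0de3d60260bc9e88fc5d6a3c7b095a')`.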