GEARS

GEARS: Predicting transcriptional outcomes of novel multi-gene perturbations

This repository hosts the official implementation of GEARS, a method that can predict transcriptional response to both single and multi-gene perturbations using single-cell RNA-sequencing data from perturbational screens.

Installation

Install PyG (PyTorch Geometric) first, then run pip install cell-gears.
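To confirm that both installation steps succeeded, a quick check along these lines may help. It is a minimal sketch, not part of GEARS: the helper name missing_packages is hypothetical, and the import names are assumptions (PyG is normally imported as torch_geometric, and cell-gears as gears, as in the API example in this README).

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the import names from `names` that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Assumed import names: PyTorch -> torch, PyG -> torch_geometric, cell-gears -> gears.
    missing = missing_packages(["torch", "torch_geometric", "gears"])
    print("missing:", missing or "none")
```

An empty result means all three packages are importable. Anything listed still needs to be installed.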

[New] Updates in v0.1.1

  • Fixed training breakpoint bug from v0.1.0
  • Preprocessed dataloader now available for Replogle 2022 RPE1 and K562 essential datasets
  • Added custom split, fixed no-test split

Core API Interface

Using the API, you can (1) reproduce the results in our paper and (2) train GEARS on your perturbation dataset using a few lines of code.

from gears import PertData, GEARS

# get data
pert_data = PertData('./data')
# load dataset in paper: norman, adamson, dixit.
pert_data.load(data_name = 'norman')
# specify data split
pert_data.prepare_split(split = 'simulation', seed = 1)
# get dataloader with batch size
pert_data.get_dataloader(batch_size = 32, test_batch_size = 128)

# set up and train a model (set device to one that exists on your machine, e.g. 'cuda:0')
gears_model = GEARS(pert_data, device = 'cuda:8')
gears_model.model_initialize(hidden_size = 64)
gears_model.train(epochs = 20)

# save/load model
gears_model.save_model('gears')
gears_model.load_pretrained('gears')

# predict
gears_model.predict([['CBL', 'CNN1'], ['FEV']])
gears_model.GI_predict(['CBL', 'CNN1'], GI_genes_file=None)
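The device string in the example above refers to a specific GPU index that many machines will not have. The sketch below shows one way to fall back to an available device. The helper pick_device is hypothetical and not part of GEARS. It uses only torch.cuda.is_available() and torch.cuda.device_count(), and it assumes that returning 'cpu' is acceptable when no GPU is present.

```python
def pick_device(preferred: str = "cuda:0") -> str:
    """Return `preferred` if it looks usable, otherwise a fallback device string."""
    # Non-CUDA requests (e.g. 'cpu') are passed through unchanged.
    if not preferred.startswith("cuda"):
        return preferred
    try:
        import torch
    except ImportError:
        # Without PyTorch there is no way to query GPUs; fall back to CPU.
        return "cpu"
    if not torch.cuda.is_available():
        return "cpu"
    # 'cuda' alone means the default GPU; 'cuda:N' requires N < number of GPUs.
    _, _, index = preferred.partition(":")
    if index and int(index) >= torch.cuda.device_count():
        return "cuda:0"
    return preferred


if __name__ == "__main__":
    print(pick_device("cuda:8"))
```

The result can then be passed to the constructor, for example GEARS(pert_data, device = pick_device('cuda:8')).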

To use your own dataset, create a Scanpy AnnData object (adata) that has a gene_name column in adata.var and two columns, condition and cell_type, in adata.obs. Then run:

pert_data.new_data_process(dataset_name = 'XXX', adata = adata)
# to load the processed data
pert_data.load(data_path = './data/XXX')
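Before calling new_data_process, it can be useful to check that the required columns are present. The following is a minimal sketch, not part of GEARS: the helper missing_columns is hypothetical, and it only assumes that adata.var and adata.obs expose a columns attribute, as the pandas DataFrames inside an AnnData object do. The demo uses a stand-in object so the snippet runs without Scanpy installed.

```python
from types import SimpleNamespace

# Column names required by the instructions above.
REQUIRED_VAR_COLUMNS = ("gene_name",)
REQUIRED_OBS_COLUMNS = ("condition", "cell_type")


def missing_columns(adata):
    """Return a dict listing the required columns absent from adata.var and adata.obs."""
    return {
        "var": [c for c in REQUIRED_VAR_COLUMNS if c not in adata.var.columns],
        "obs": [c for c in REQUIRED_OBS_COLUMNS if c not in adata.obs.columns],
    }


if __name__ == "__main__":
    # Stand-in for illustration only; in practice pass a real AnnData object.
    fake_adata = SimpleNamespace(
        var=SimpleNamespace(columns=["gene_name"]),
        obs=SimpleNamespace(columns=["condition"]),
    )
    print(missing_columns(fake_adata))  # {'var': [], 'obs': ['cell_type']}
```

If both lists are empty, the object has the columns named above. This check does not validate the values stored in those columns.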

Demos

Name | Description
Dataset Tutorial | Tutorial on how to use the dataset loader and read customized data
Model Tutorial | Tutorial on how to train GEARS
Plot top 20 DE genes | Tutorial on how to plot the top 20 DE genes
Uncertainty | Tutorial on how to train an uncertainty-aware GEARS model

Colab

Name | Description
Using Trained Model | Use a model trained on Norman et al. 2019 to make predictions (needs Colab Pro)

Cite Us

@article{roohani2023predicting,
  title={Predicting transcriptional outcomes of novel multigene perturbations with {GEARS}},
  author={Roohani, Yusuf and Huang, Kexin and Leskovec, Jure},
  journal={Nature Biotechnology},
  year={2023},
  publisher={Nature Publishing Group US New York}
}

Paper: Link

Code for reproducing figures: Link
