Single-cell perturbation modeling toolkit
A Multi-modal LLM-Knowledge Fusion Framework for Predicting Single-cell Genetic Perturbation Effects
scPert is a multi-modal transformer framework that integrates large language model embeddings with structured biological knowledge to predict single-cell transcriptomic responses to genetic perturbations. By hierarchically fusing knowledge graph representations, contextual embeddings from foundation models, and gene-specific encodings, scPert achieves significant performance improvements over existing methods on both single-gene and combinatorial perturbations. In cancer-relevant applications, scPert reveals p53 pathway dynamics and immune checkpoint regulatory mechanisms. A systematic evaluation on 42 cancer dependency genes shows that scPert can identify potential therapeutic targets. The framework provides a computational foundation for virtual cell construction and can accelerate drug target discovery.
Installation
Install PyG (PyTorch Geometric) first, then install scPert with: pip install scpert
Requirements
- anndata==0.9.2
- scanpy==1.9.8
- torch==2.3.0
- torch-geometric==2.6.1
- scvi-tools==0.20.3
- pandas==2.0.3
- numpy==1.24.4
- scipy==1.10.1
- cell-gears==0.0.2
- nvidia-cublas-cu12==12.1.3.1
- nvidia-cudnn-cu12==8.9.2.26
- flash_attn==0.2.8
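Because the pins above are exact, it can help to compare them against the active environment before training. The following is a minimal sketch that uses only the standard library; check_pins is a hypothetical helper and not part of scpert, and the PINNED dictionary copies only a subset of the list above.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the requirements list above (a subset, for brevity).
PINNED = {
    "anndata": "0.9.2",
    "scanpy": "1.9.8",
    "torch": "2.3.0",
    "torch-geometric": "2.6.1",
    "scvi-tools": "0.20.3",
    "pandas": "2.0.3",
    "numpy": "1.24.4",
    "scipy": "1.10.1",
    "cell-gears": "0.0.2",
}


def check_pins(pinned, get_version=version):
    """Return {package: (expected, found)} for each missing or mismatched pin.

    `found` is None when the package is not installed. This helper is
    hypothetical; it is not provided by scpert.
    """
    problems = {}
    for name, expected in pinned.items():
        try:
            found = get_version(name)
        except PackageNotFoundError:
            found = None
        # A local build tag such as "2.3.0+cu121" still counts as a match.
        if found is None or found.split("+")[0] != expected:
            problems[name] = (expected, found)
    return problems


if __name__ == "__main__":
    for name, (expected, found) in check_pins(PINNED).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

An empty result means every listed pin matches. The get_version parameter exists only so the check can be exercised without touching the real environment.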
Usage
- embedding_dir: Directory containing gene embedding files (.npy)
- data_path: Base directory for perturbation datasets
- model_path: Pretrained model directory containing model.pt
- pert_file: CSV file specifying perturbation pairs, with columns gene1,gene2
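The four inputs above can be prepared and sanity-checked before running any script. The sketch below writes a pert_file with the gene1,gene2 header and checks that the other paths look as described. Both write_pert_file and check_inputs are hypothetical helpers, not scpert functions, and the model.pt check only matters when a pretrained model is being loaded.

```python
import csv
from pathlib import Path


def write_pert_file(pairs, pert_file):
    """Write (gene1, gene2) pairs to a CSV with the header gene1,gene2."""
    with open(pert_file, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["gene1", "gene2"])
        writer.writerows(pairs)


def check_inputs(embedding_dir, data_path, model_path, pert_file):
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []

    emb = Path(embedding_dir)
    if not emb.is_dir() or not any(emb.glob("*.npy")):
        problems.append(f"no .npy embedding files found in {emb}")

    if not Path(data_path).is_dir():
        problems.append(f"data_path is not a directory: {data_path}")

    if not (Path(model_path) / "model.pt").is_file():
        problems.append(f"model.pt not found in {model_path}")

    pert = Path(pert_file)
    if not pert.is_file():
        problems.append(f"pert_file not found: {pert}")
    else:
        # Only the header is inspected; gene names are not validated.
        with pert.open(newline="") as fh:
            header = next(csv.reader(fh), [])
        if [h.strip() for h in header] != ["gene1", "gene2"]:
            problems.append(f"unexpected pert_file header: {header}")

    return problems
```

For example, write_pert_file([("CBL", "CNN1")], "perts.csv") produces a two-line file, and check_inputs(...) can then be called with the four paths to list anything that is missing.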
You can train scPert on your own perturbation dataset by running the training script:
python ./scripts/train.py
You can use scPert to predict single-gene or gene-pair perturbations by running the inference script:
python ./scripts/infer.py
Using the Python API:
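from scpert import ProcePertdata, scPert

# data_path and model_path are the directories described under Usage
pertData = ProcePertdata(data_path)
pertData.load(DataName='norman')

# Train a model
SCPert = scPert(pertData, device='cuda:0')
SCPert.model_initialize(hidden_size=64)
SCPert.train(epochs=20)

# Save the trained model, or load a pretrained one instead
SCPert.save_model(model_path)
SCPert.load_pretrained(model_path)

# Predict the response to a gene pair ('A+B') and to a single gene
SCPert.predict([['PRDM1+CBFA2T3'], ['FEV']])
# Predict the genetic interaction (GI) for a pair of genes
SCPert.GI_predict(['CBL', 'CNN1'])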
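The predict call above takes a list of one-element lists, where a gene pair is written as two names joined by '+'. A minimal sketch of building that structure from a pert_file follows. The load_queries helper is hypothetical and not part of scpert, and it assumes that a single-gene perturbation is stored as a row with an empty gene2 column, which the package does not document.

```python
import csv


def load_queries(pert_file):
    """Turn gene1,gene2 rows into queries shaped like the predict example.

    Hypothetical helper. Assumes an empty gene2 cell means a single-gene
    perturbation; rows with no gene names are skipped.
    """
    queries = []
    with open(pert_file, newline="") as fh:
        for row in csv.DictReader(fh):
            genes = [
                g.strip()
                for g in (row.get("gene1"), row.get("gene2"))
                if g and g.strip()
            ]
            if genes:
                # One string per query: 'A+B' for a pair, 'A' for a single gene.
                queries.append(["+".join(genes)])
    return queries
```

The returned list could then be passed as SCPert.predict(load_queries(pert_file)), provided the model was trained on a dataset that contains those gene names.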
Cite Us
This work is currently under peer review.
File details
Details for the file scpert-0.1.0.tar.gz.
File metadata
- Download URL: scpert-0.1.0.tar.gz
- Upload date:
- Size: 36.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.8.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | feacd4f2c6a64c594ad8b49f2d21bf4997bb9b9451af7e53a1967cced4fd2633 |
| MD5 | 99001525a4d3efb00abace99550387a2 |
| BLAKE2b-256 | 3b5a2c6fc9151b4af902819bbb6243f64359ce2053d3401dd04a2ae0bcb3e5d2 |
File details
Details for the file scpert-0.1.0-py3-none-any.whl.
File metadata
- Download URL: scpert-0.1.0-py3-none-any.whl
- Upload date:
- Size: 39.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.8.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4df77eb59b86a31d6c156f38642abff26458e1d52b3c36449c43c9dd671b63fa |
| MD5 | 312ee13d53eaa0038c55c66bd777a4f1 |
| BLAKE2b-256 | 52d0d2531293f244cceeb080cbc2ce3c53f78bbd706e3f611341441462796d5f |