
Quantifying the predictability of gene expression from histology images

Expression Copilot

We introduce two metrics, EPS (Expression Predictability Score) and SPS (Slice Predictability Score), to quantify how predictable gene expression is from histology images. The Python package expression_copilot computes these metrics efficiently. It also provides several baseline models, such as an MLP and linear regression, that predict gene expression from image embeddings.

Installation

PyPI

Important: Requires Python >= 3.10

We recommend installing expression_copilot in a new conda environment:

conda create -n eps python=3.11 -y && conda activate eps
pip install expression_copilot

(Optional) If you have a CUDA-enabled GPU, you can install cuml and cupy to accelerate KNN building, and torch to accelerate MLP baseline training:

conda create -n eps_cuda -c conda-forge -c rapidsai -c nvidia python=3.11 rapids=25.06 'cuda-version>=12.0,<=12.8' -y && conda activate eps_cuda
pip install 'expression_copilot[torch]'
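
After installing, it can be useful to confirm which of the optional packages mentioned above are importable in the active environment. The following is a minimal sketch that uses only the Python standard library; the helper name `available_optional_packages` is hypothetical and not part of expression_copilot. It only reports whether each package is installed, not whether a GPU is actually usable.

```python
from importlib.util import find_spec

# Optional packages named in the install notes above.
OPTIONAL_PACKAGES = ("cuml", "cupy", "torch")


def available_optional_packages(names=OPTIONAL_PACKAGES):
    """Map each top-level package name to True if it can be found for import."""
    return {name: find_spec(name) is not None for name in names}


for name, found in available_optional_packages().items():
    print(f"{name}: {'installed' if found else 'not installed'}")
```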

Docker

You can also use our pre-built Docker image directly:

# GPU version
docker run --gpus all -it --rm huhansan666666/expression_copilot:latest

# CPU version
docker run -it --rm huhansan666666/expression_copilot:latest

Documentation

Quick Start

The following code snippet shows how to calculate EPS and SPS with the expression_copilot package. We assume you have already preprocessed your spatial transcriptomics data into an AnnData object (adata), where adata.X stores raw counts and adata.obsm['IMAGE_KEY_NAME'] stores the image embeddings of the spots. (The preprocessing steps are described in detail in the Advanced Tutorial.)

import scanpy as sc
import numpy as np
from expression_copilot import ExpressionCopilotModel

# Load data
# adata.X is raw counts
# adata.obsm['X_uni'] stores image embeddings of spots
url = 'https://drive.google.com/uc?id=10WD9vFgsoMoTt6g3017XxNK_bq8qp3oM'
adata = sc.read('./adata_with_image_emb.h5ad', backup_url=url)

# Init model
model = ExpressionCopilotModel(adata, image_key='X_uni')

# Calculate EPS and SPS
eps = model.calc_metrics_per_gene()
sps = eps.mean()

# Run a baseline model (supported methods: 'ridge', 'linear', 'ensemble', 'mlp')
baseline_metrics_per_gene, _ = model.calc_baseline_metrics(method='mlp')
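
To make the baseline idea concrete, here is a minimal, self-contained sketch of predicting gene expression from image embeddings with ridge regression and scoring each gene. It is not the package's implementation. The data are synthetic, the helpers `fit_ridge` and `pearson_per_gene` are hypothetical, and using per-gene Pearson correlation on held-out spots is an illustrative assumption about what a per-gene baseline metric could look like.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins (assumptions): 200 spots, 16-dim embeddings, 5 genes.
n_spots, n_dims, n_genes = 200, 16, 5
embeddings = rng.normal(size=(n_spots, n_dims))
true_weights = rng.normal(size=(n_dims, n_genes))
noise = rng.normal(scale=0.5, size=(n_spots, n_genes))
expression = embeddings @ true_weights + noise

# Fit on the first 150 spots, evaluate on the remaining 50.
train, test = slice(0, 150), slice(150, None)


def fit_ridge(x, y, alpha=1.0):
    """Closed-form ridge regression (no intercept): (X^T X + alpha I) W = X^T Y."""
    gram = x.T @ x + alpha * np.eye(x.shape[1])
    return np.linalg.solve(gram, x.T @ y)


def pearson_per_gene(y_true, y_pred):
    """Pearson correlation between truth and prediction, one value per column."""
    yt = y_true - y_true.mean(axis=0)
    yp = y_pred - y_pred.mean(axis=0)
    denom = np.sqrt((yt**2).sum(axis=0) * (yp**2).sum(axis=0))
    return (yt * yp).sum(axis=0) / denom


weights = fit_ridge(embeddings[train], expression[train])
pcc = pearson_per_gene(expression[test], embeddings[test] @ weights)
print(np.round(pcc, 3))
```

On real data, the embeddings would come from adata.obsm and the targets from a normalized version of adata.X; the package's own baselines may differ in preprocessing, splitting, and metrics.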

Notebook tutorials

We provide several tutorials in the resource/tutorials folder. You can also run them directly in Google Colab:

| Name | Description | Colab |
| --- | --- | --- |
| Basic Tutorial | Basic tutorial on calculating EPS | Open In Colab |
| Advanced Tutorial | Start from scratch with 10x Space Ranger output | Open In Colab |
| Multi-omics Tutorial | Calculating EPS and SPS on single-cell multi-omics data | Open In Colab |

Citation

Coming soon.

To reproduce the results in the manuscript, please see the experiments folder.

FAQ

Please open a new GitHub issue if you have any questions.

  1. numba-related bugs

We use numba to speed up computation (up to 12x). However, it may have compatibility issues with some Python/numpy versions. We tested the latest version of numba (0.61.2), and it works with Python 3.11/3.12 and numpy 1.26.
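
If you suspect a version mismatch, a quick report of the installed versions is helpful to include in a GitHub issue. The following is a small sketch that uses only the standard library; the helper name `installed_version` is hypothetical and not part of expression_copilot.

```python
import platform
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Collect the interpreter version and the packages relevant to numba issues.
report = {"python": platform.python_version()}
for package in ("numba", "numpy", "expression_copilot"):
    report[package] = installed_version(package)

for name, ver in report.items():
    print(f"{name}: {ver or 'not installed'}")
```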

Acknowledgement

We thank the following great open-source projects for their help or inspiration:
