Clean pure-Python reimplementation of scDRS — single-cell disease-relevance scoring from GWAS gene sets.
py-scdrs
pyscdrs is a clean, pure-Python reimplementation of scDRS (single-cell disease-relevance score) for the omicverse project.
scDRS (Zhang*, Hou*, et al. & Price, Nature Genetics 2022, 54:1572-1580) scores individual cells in scRNA-seq data for their relevance to a disease or complex trait, using a polygenic gene set derived from GWAS summary statistics (typically via MAGMA). The score is a covariate-corrected, technical-variance-weighted average of disease-gene expression, calibrated against Monte-Carlo control gene sets matched on the gene-level mean-variance relationship.
This package is a faithful rewrite: given the same random_seed, it
reproduces the original scdrs package's control gene sets, scores, and
p-values bit-for-bit on the bundled toy data.
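To make the raw score described above concrete, here is a minimal NumPy sketch of a weighted average of disease-gene expression. It is not the package's implementation: the function name `raw_disease_score`, its arguments, and the choice to multiply the GWAS weight by an inverse-square-root technical-variance weight are illustrative assumptions, and covariate correction and control-set calibration are left out.

```python
import numpy as np

def raw_disease_score(expr, gene_idx, gwas_weight, tech_var):
    """Per-cell weighted average of disease-gene expression (illustrative only).

    expr        : (n_cell, n_gene) array of normalized, log1p-transformed expression
    gene_idx    : column indices of the disease genes
    gwas_weight : one GWAS-derived weight per disease gene
    tech_var    : (n_gene,) estimated technical variance per gene
    """
    expr = np.asarray(expr, dtype=float)
    tech_var = np.asarray(tech_var, dtype=float)
    # Assumption: GWAS weight times 1/sqrt(technical variance), then normalized to sum to 1.
    w = np.asarray(gwas_weight, dtype=float) / np.sqrt(tech_var[gene_idx])
    w = w / w.sum()
    return expr[:, gene_idx] @ w

# Toy data: 2 cells x 3 genes; genes 0 and 2 form the disease gene set.
toy_expr = [[1.0, 2.0, 3.0],
            [0.0, 1.0, 0.0]]
toy_scores = raw_disease_score(toy_expr, [0, 2], [1.0, 1.0], [1.0, 1.0, 4.0])
```

In this toy case the noisier gene (technical variance 4) receives half the weight of the other disease gene before normalization.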
Install
```bash
pip install pyscdrs
```
Pure Python — depends only on numpy, scipy, pandas, anndata,
scanpy, scikit-misc, statsmodels and tqdm. No R / rpy2.
Quick start
```python
import anndata
import pandas as pd
import scanpy as sc
import pyscdrs as scdrs

# Load size-factor-normalized, log1p-transformed single-cell data
adata = anndata.read_h5ad("toydata_mouse.h5ad")
df_cov = pd.read_csv("toydata_mouse.cov", sep="\t", index_col=0)

# 1. Preprocess: covariate correction + gene/cell stats + mean-var bins
scdrs.preprocess(adata, cov=df_cov)

# 2. Load a MAGMA-style .gs gene set and score cells
dict_gs = scdrs.load_gs("toydata_mouse.gs")
genes, weights = dict_gs["toydata_gs_mouse"]
df_res = scdrs.score_cell(
    adata, genes, gene_weight=weights, n_ctrl=1000, random_seed=0,
    return_ctrl_norm_score=True,
)
print(df_res[["raw_score", "norm_score", "pval", "zscore"]].head())

# 3. Downstream: cell-group association + heterogeneity
sc.pp.neighbors(adata, n_neighbors=15, n_pcs=20)
dict_group = scdrs.downstream_group_analysis(
    adata, df_res, group_cols=["cell_type"]
)
```
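For readers who have not seen a `.gs` file, the sketch below parses one from a string. It assumes the layout used by upstream scDRS: a tab-separated file with a `TRAIT` and a `GENESET` column, where `GENESET` is a comma-separated list of `gene:weight` pairs. The helper `parse_gs_text` is hypothetical and is not how `load_gs` is implemented; it only mirrors the `(genes, weights)` tuple shape used in the quick start.

```python
def parse_gs_text(text):
    """Parse .gs-style text into {trait: (genes, weights)} (assumed format)."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if lines[0].split("\t") != ["TRAIT", "GENESET"]:
        raise ValueError("expected a 'TRAIT<TAB>GENESET' header line")
    parsed = {}
    for ln in lines[1:]:
        trait, geneset = ln.split("\t")
        genes, weights = [], []
        for item in geneset.split(","):
            gene, _, weight = item.partition(":")
            genes.append(gene)
            # Assumption: a gene listed without a weight counts as weight 1.0.
            weights.append(float(weight) if weight else 1.0)
        parsed[trait] = (genes, weights)
    return parsed

# Made-up trait and gene names, for illustration only.
example_gs = "TRAIT\tGENESET\ntoy_trait\tGene1:2.5,Gene2:1.0,Gene3:0.5\n"
toy_gs = parse_gs_text(example_gs)
toy_genes, toy_weights = toy_gs["toy_trait"]
```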
Public API
| Stage | Function |
|---|---|
| Preprocessing | preprocess, compute_stats, reg_out, category2dummy |
| Scoring | score_cell |
| Downstream | downstream_group_analysis, downstream_corr_analysis, downstream_gene_analysis, test_gearysc, gearys_c |
| Gene-set / I/O | load_gs, save_gs, munge_gs, load_h5ad, load_homolog_mapping, convert_species_name, zsc2pval, pval2zsc |
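The `zsc2pval` and `pval2zsc` helpers convert between z-scores and p-values. A minimal sketch of such a conversion with SciPy follows; it assumes a one-sided (upper-tail) convention, which should be checked against the package itself. The names `z_to_p` and `p_to_z` are hypothetical stand-ins, not the package's functions.

```python
from scipy.stats import norm

def z_to_p(z):
    """One-sided p-value for a z-score: P(Z >= z) under a standard normal."""
    return norm.sf(z)

def p_to_z(p):
    """Inverse of z_to_p: the z-score whose upper-tail probability is p."""
    return norm.isf(p)

p_half = z_to_p(0.0)               # upper-tail probability at z = 0
z_back = p_to_z(z_to_p(1.96))      # round trip through both conversions
```

Using `sf` and `isf` rather than `1 - cdf` and `ppf(1 - p)` keeps precision for very small p-values.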
score_cell options
- weight_opt: raw-score weighting. One of uniform, vs (1/sqrt technical variance, default), inv_std (1/std), od (overdispersion score).
- ctrl_match_key: gene statistic for matching control genes (default mean_var).
- n_ctrl: number of Monte-Carlo control gene sets (default 1000).
- random_seed: governs the control gene sets; the same seed gives the same results as the original scdrs.
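The interplay of ctrl_match_key, n_ctrl and random_seed can be illustrated with a small sampling sketch: each control gene set draws, from every mean-variance bin, as many genes as the disease set has in that bin. This is an illustration under stated assumptions, not the package's code. The function `sample_matched_controls` and the `bin_of_gene` mapping are hypothetical, controls are drawn from all genes in a bin, and NumPy's `default_rng` is used here, so the draws will not match those of scdrs or pyscdrs for the same seed.

```python
from collections import Counter

import numpy as np

def sample_matched_controls(bin_of_gene, disease_genes, n_ctrl=1000, random_seed=0):
    """Draw n_ctrl control gene sets with the same per-bin counts as the disease set."""
    rng = np.random.default_rng(random_seed)
    genes = np.array(sorted(bin_of_gene))
    bins = np.array([bin_of_gene[g] for g in genes])
    # Number of disease genes that fall into each matching bin.
    need = Counter(bin_of_gene[g] for g in disease_genes)
    ctrl_sets = []
    for _ in range(n_ctrl):
        picked = []
        for b in sorted(need):
            pool = genes[bins == b]
            # Sample without replacement within the bin.
            picked.extend(str(g) for g in rng.choice(pool, size=need[b], replace=False))
        ctrl_sets.append(picked)
    return ctrl_sets

toy_bins = {"g1": 0, "g2": 0, "g3": 0, "g4": 1, "g5": 1, "g6": 1}
ctrl_a = sample_matched_controls(toy_bins, ["g1", "g4"], n_ctrl=3, random_seed=0)
ctrl_b = sample_matched_controls(toy_bins, ["g1", "g4"], n_ctrl=3, random_seed=0)
```

Because the generator is seeded once per call, repeating the call with the same seed yields identical control sets, which is the property the random_seed option relies on.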
The output is a per-cell DataFrame with raw_score, norm_score,
mc_pval (per-cell Monte-Carlo p-value), pval (pooled empirical
p-value), nlog10_pval and zscore.
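The difference between mc_pval and pval is easiest to see in code. The sketch below assumes each cell has a normalized score plus normalized scores for its control gene sets: the per-cell Monte-Carlo p-value compares a cell only with its own controls, while the pooled p-value compares it with the controls of all cells. The function name and the "+1" correction in numerator and denominator are assumptions for illustration and may differ from the package's exact convention.

```python
import numpy as np

def mc_and_pooled_pvals(norm_score, ctrl_norm_score):
    """Return (mc_pval, pval, nlog10_pval) from normalized scores (illustrative)."""
    norm_score = np.asarray(norm_score, dtype=float)   # shape (n_cell,)
    ctrl = np.asarray(ctrl_norm_score, dtype=float)    # shape (n_cell, n_ctrl)
    n_ctrl = ctrl.shape[1]
    # Per-cell: fraction of that cell's own control scores at least as large.
    mc_pval = (1 + (ctrl >= norm_score[:, None]).sum(axis=1)) / (1 + n_ctrl)
    # Pooled: compare each cell with the control scores of every cell.
    pooled = np.sort(ctrl.ravel())
    n_ge = pooled.size - np.searchsorted(pooled, norm_score, side="left")
    pval = (1 + n_ge) / (1 + pooled.size)
    return mc_pval, pval, -np.log10(pval)

toy_norm = [2.0, 0.0]
toy_ctrl = [[0.5, 1.0, 2.5],
            [-1.0, 0.0, 1.0]]
toy_mc, toy_pval, toy_nlog10 = mc_and_pooled_pvals(toy_norm, toy_ctrl)
```

Pooling gives a much finer p-value resolution (roughly 1 / (n_cell * n_ctrl)) than the per-cell estimate, whose smallest attainable value is 1 / (1 + n_ctrl).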
Command line
```bash
pyscdrs compute-score --h5ad-file data.h5ad --gs-file trait.gs \
    --cov-file cov.tsv --out-folder out/ --n-ctrl 1000 --flag-full-score

pyscdrs perform-downstream --h5ad-file data.h5ad \
    --full-score-file out/trait.full_score.gz --out-folder out/ \
    --group-analysis cell_type --gene-analysis
```
Parity with the original scDRS
tests/test_parity.py runs both pyscdrs and the upstream scdrs
package on scDRS's own bundled toy data with identical random_seed and
asserts agreement of preprocess (gene stats and mean-variance bins),
score_cell (raw_score, norm_score, pval, mc_pval, zscore) and
the downstream group / gene / correlation statistics. On the toy data the
agreement is bit-exact.
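A bit-exact comparison of two score tables can be sketched as follows. This is not the content of tests/test_parity.py; the helper `assert_bit_exact` and the toy frames are hypothetical, and the only assumption is that both packages return per-cell DataFrames sharing an index and column names.

```python
import numpy as np
import pandas as pd

def assert_bit_exact(df_new, df_ref, cols):
    """Fail unless every listed column is exactly equal, with no tolerance."""
    for col in cols:
        new = df_new[col].to_numpy()
        # Align the reference to the same cell order before comparing.
        ref = df_ref.loc[df_new.index, col].to_numpy()
        # assert_array_equal uses exact equality and treats NaNs in the
        # same positions as equal.
        np.testing.assert_array_equal(new, ref, err_msg=f"column {col!r} differs")

df_ref = pd.DataFrame(
    {"raw_score": [0.1, 0.2], "pval": [0.5, np.nan]}, index=["cell1", "cell2"]
)
df_new = df_ref.copy()
assert_bit_exact(df_new, df_ref, ["raw_score", "pval"])
```

Exact comparison is stricter than `np.allclose`: even a difference in the last floating-point digit fails, which is what a bit-for-bit reproducibility claim requires.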
License
MIT, same as the original scDRS. See LICENSE.
File details
Details for the file pyscdrs-0.1.0.tar.gz.
File metadata
- Download URL: pyscdrs-0.1.0.tar.gz
- Upload date:
- Size: 147.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 48d92796a25c72d7555f49663de3d124905b60c66fed65a038c7a55a7a19cc44 |
| MD5 | e8511d9b835ed0568f240de2b772ee3b |
| BLAKE2b-256 | 6f00252deb54f491b370c47abc3862b4971783301ef24140b36ca9cb8a089dc8 |
Provenance
The following attestation bundles were made for pyscdrs-0.1.0.tar.gz:
Publisher: publish.yml on omicverse/py-scdrs
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: pyscdrs-0.1.0.tar.gz
- Subject digest: 48d92796a25c72d7555f49663de3d124905b60c66fed65a038c7a55a7a19cc44
- Sigstore transparency entry: 1582830737
- Sigstore integration time:
- Permalink: omicverse/py-scdrs@1311dcda13872e594367e27d29930b4d20fd780e
- Branch / Tag: refs/heads/main
- Owner: https://github.com/omicverse
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@1311dcda13872e594367e27d29930b4d20fd780e
- Trigger Event: workflow_dispatch
File details
Details for the file pyscdrs-0.1.0-py3-none-any.whl.
File metadata
- Download URL: pyscdrs-0.1.0-py3-none-any.whl
- Upload date:
- Size: 143.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e8418b8a99953ae8e4a0313a64b54d17a3e0ba676838c1208b737d98a990c00e |
| MD5 | ff3e80016a98582ecc6c5551d2188191 |
| BLAKE2b-256 | 018e0663c28228c81ab7a0a4538e9cbd439eb8568c06c8dc98a235679bdf0039 |
Provenance
The following attestation bundles were made for pyscdrs-0.1.0-py3-none-any.whl:
Publisher: publish.yml on omicverse/py-scdrs
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: pyscdrs-0.1.0-py3-none-any.whl
- Subject digest: e8418b8a99953ae8e4a0313a64b54d17a3e0ba676838c1208b737d98a990c00e
- Sigstore transparency entry: 1582830826
- Sigstore integration time:
- Permalink: omicverse/py-scdrs@1311dcda13872e594367e27d29930b4d20fd780e
- Branch / Tag: refs/heads/main
- Owner: https://github.com/omicverse
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@1311dcda13872e594367e27d29930b4d20fd780e
- Trigger Event: workflow_dispatch