Fine-scale cellular deconvolution based on Generalized Cross Entropy
quipcell
A method for cellular deconvolution at a fine scale (i.e., at the single-cell or neighborhood level), using a generalization of maximum entropy that is also an efficient convex optimization problem.
Installation
pip install quipcell
Method
A preprint describing the method is available on bioRxiv.
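To give some intuition for entropy-based weighting, here is a minimal sketch of the classical maximum entropy problem: find weights on the reference cells that sum to one, reproduce the bulk feature vector as a weighted average, and are otherwise as uniform as possible. This is an illustration only, not quipcell's implementation and not the generalized formulation described in the preprint. The function name `maxent_weights` and the toy data are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import logsumexp, softmax


def maxent_weights(X_ref, x_bulk):
    """Maximum-entropy weights w on the simplex with X_ref.T @ w == x_bulk.

    Illustrative only: solves the convex dual, minimizing
    logsumexp(Xc @ lam) over lam with Xc = X_ref - x_bulk.
    The optimal weights are softmax(Xc @ lam).
    """
    # Centering turns the moment constraint into Xc.T @ w == 0
    Xc = X_ref - x_bulk

    def dual(lam):
        z = Xc @ lam
        # Return the dual value and its gradient (the constraint residual)
        return logsumexp(z), Xc.T @ softmax(z)

    res = minimize(dual, np.zeros(Xc.shape[1]), jac=True, method="BFGS")
    return softmax(Xc @ res.x)


# Toy data: 200 reference cells with 3 features, and one bulk sample
# constructed as a known mixture so that the constraint is feasible.
rng = np.random.default_rng(0)
X_ref = rng.normal(size=(200, 3))
w_true = rng.dirichlet(np.ones(200))
x_bulk = X_ref.T @ w_true

w = maxent_weights(X_ref, x_bulk)
print(w.sum(), np.abs(X_ref.T @ w - x_bulk).max())
```

The recovered `w` matches the bulk features but generally differs from `w_true`, because many mixtures fit the same features and the entropy criterion selects the most uniform one.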
Usage
The documentation includes a vignette and API reference.
The snippet below demonstrates how to obtain weights from two AnnData objects: `adata_ref`, containing the single-cell reference, and `adata_bulk`, containing the bulk samples to deconvolve. Both AnnData objects are assumed to contain raw counts and to have the same genes. Dimensionality reduction steps (PCA and LDA) are fit on the single-cell reference and then applied to the bulk data, after which quipcell estimates single-cell weights using a generalized maximum entropy method. Each resulting weight represents the probability that a random read from the bulk sample originated from "near" the corresponding reference cell.
import numpy as np
import scanpy as sc
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
import quipcell as qpc
# save the number of UMIs per cell
# (flattened to 1-D so this also works when X is a sparse matrix)
adata_ref.obs['n_umi'] = np.asarray(adata_ref.X.sum(axis=1)).ravel()
# Normalize the single cell reference
normalize_sum = 1e3
sc.pp.normalize_total(adata_ref, target_sum=normalize_sum)
# Compute low dimensional features via PCA and LDA on the single cells
sc.pp.pca(adata_ref, n_comps=100)
lda = LinearDiscriminantAnalysis(n_components=15)
lda.fit(adata_ref.obsm['X_pca'], adata_ref.obs['celltype'])
adata_ref.obsm['X_lda'] = lda.transform(adata_ref.obsm['X_pca'])
# Normalize the bulk samples
sc.pp.normalize_total(adata_bulk, target_sum=normalize_sum)
# Apply the PCA rotation to the bulk samples. Note that the reference
# gene means need to be subtracted before rotation
X = adata_bulk.X - np.squeeze(np.asarray(adata_ref.X.mean(axis=0)))
X = np.asarray(X @ adata_ref.varm['PCs'])
adata_bulk.obsm['X_pca'] = X
# Apply the LDA rotation to the bulk samples
adata_bulk.obsm['X_lda'] = lda.transform(X)
# Compute the weights. Rows are reference single cells, columns are
# bulk samples. The entries represent the probability that a random
# read from the bulk sample originated near the respective reference
# cell.
w_reads = qpc.estimate_weights_multisample(adata_ref.obsm['X_lda'],
                                           adata_bulk.obsm['X_lda'])
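A common next step is to summarize the per-cell weights by cell type. The sketch below assumes, as described above, that the weight matrix has one row per reference cell and one column per bulk sample, with each column summing to one. It uses toy stand-ins rather than real quipcell output. The helper names `celltype_proportions` and `reads_to_cells` are hypothetical, and the read-to-cell conversion (dividing by each cell's UMI count and renormalizing) is an assumed interpretation of the saved `n_umi` values, not a documented quipcell call.

```python
import numpy as np
import pandas as pd


def celltype_proportions(w, celltypes, sample_names):
    """Sum per-cell weights within each cell type (rows: types, columns: samples)."""
    df = pd.DataFrame(w, columns=sample_names)
    return df.groupby(np.asarray(celltypes)).sum()


def reads_to_cells(w_reads, n_umi):
    """Assumed conversion from read-level to cell-level weights: divide each
    cell's weight by its UMI count, then renormalize every sample column."""
    w = w_reads / np.asarray(n_umi, dtype=float)[:, None]
    return w / w.sum(axis=0, keepdims=True)


# Toy stand-ins for w_reads, adata_ref.obs['celltype'] and adata_ref.obs['n_umi']
rng = np.random.default_rng(0)
w_reads = rng.dirichlet(np.ones(6), size=2).T  # 6 cells x 2 samples
celltypes = ["T", "T", "B", "B", "NK", "NK"]
n_umi = np.array([1000, 2000, 500, 500, 4000, 1000])
samples = ["bulk1", "bulk2"]

# Fraction of reads attributed to each cell type, per bulk sample
read_props = celltype_proportions(w_reads, celltypes, samples)
# Fraction of cells attributed to each cell type, under the assumed conversion
cell_props = celltype_proportions(reads_to_cells(w_reads, n_umi), celltypes, samples)
print(read_props)
print(cell_props)
```

With real data, `celltypes` would be `adata_ref.obs['celltype']` and `sample_names` would be `adata_bulk.obs_names`.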
Note
This project has been set up using PyScaffold 4.5. For details and usage information on PyScaffold see https://pyscaffold.org/.