Run embedding comparisons for single-cell data
Comparing embeddings for single-cell and spatial data
Single-cell RNA-sequencing (scRNA-seq) measures gene expression in individual cells and generates large datasets. Typically, these datasets consist of several samples, each corresponding to a combination of covariates (e.g. patient, time point, disease status, technology). Analyzing these datasets (often millions of cells profiled across thousands of genes) is facilitated by data integration approaches, which learn lower-dimensional representations that remove the effects of unwanted covariates (such as experimental batch or the chip the data was run on).
Here, we use slurm_sweep to efficiently parallelize and track different data integration approaches, and we compare their performance in terms of scIB metrics. For each data integration method, we compute a shared latent space, quantify integration performance in terms of batch correction and bio conservation, visualize the latent space with UMAP, store the model and embedding coordinates, and log all relevant data to wandb so that it can be retrieved after the sweep.
scembed consists of shallow wrappers around commonly used integration tools, a class to facilitate scIB comparisons, and another class to retrieve and aggregate sweep results.
Methods included
- GPU-based methods: scVI, scANVI, scPoli, ResolVI, scVIVA
- CPU-based methods: Harmony, LIGER, Scanorama, HVG, Pre-computed embeddings
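Because the two groups need different hardware, a sweep will usually route them to different resources. The sketch below is a hypothetical helper, not part of the scembed API, that splits a list of requested methods according to the grouping listed above. The lowercase keys (including `precomputed` for pre-computed embeddings) are assumptions made for illustration.

```python
# Grouping taken from the lists above; the lowercase keys are illustrative
# and not necessarily the identifiers scembed itself uses.
GPU_METHODS = {"scvi", "scanvi", "scpoli", "resolvi", "scviva"}
CPU_METHODS = {"harmony", "liger", "scanorama", "hvg", "precomputed"}


def split_by_device(methods):
    """Split method names into (gpu, cpu) lists, preserving input order."""
    gpu, cpu = [], []
    for name in methods:
        key = name.lower()
        if key in GPU_METHODS:
            gpu.append(name)
        elif key in CPU_METHODS:
            cpu.append(name)
        else:
            raise ValueError(f"Unknown integration method: {name!r}")
    return gpu, cpu


gpu_jobs, cpu_jobs = split_by_device(["scVI", "Harmony", "scANVI", "Scanorama"])
print("GPU:", gpu_jobs)
print("CPU:", cpu_jobs)
```

Raising on unknown names, rather than silently dropping them, makes a typo in a sweep configuration fail early instead of producing an incomplete comparison.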
Evaluation
- scIB metrics: Standardized benchmarking for integration quality
- UMAP visualization: Visual assessment of integration
- Artifact tracking: Models and embeddings stored in wandb
Outputs
Per Method
- Integration embedding: Stored in wandb as table
- scIB metrics: Comprehensive benchmarking scores
- UMAP plots: Visualization by cell type and batch
- Model weights: For deep learning methods
Summary Metrics
- scib_total_score: Overall integration quality
- scib_bio_conservation: Preservation of biological signal
- scib_batch_correction: Removal of batch effects
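To make the relationship between the three summary metrics concrete, here is a minimal sketch. It assumes the weighting used by the scIB benchmark and by scib-metrics (0.6 for bio conservation, 0.4 for batch correction); whether scembed logs exactly this combination is an assumption. The function name and the scores are illustrative placeholders, not real results.

```python
def total_score(bio_conservation, batch_correction, bio_weight=0.6):
    """Weighted mean of the two aggregate scores (both expected in [0, 1])."""
    if not 0.0 <= bio_weight <= 1.0:
        raise ValueError("bio_weight must lie in [0, 1]")
    return bio_weight * bio_conservation + (1.0 - bio_weight) * batch_correction


# Placeholder scores for two hypothetical runs (not real benchmark results).
runs = {
    "run_a": {"scib_bio_conservation": 0.70, "scib_batch_correction": 0.50},
    "run_b": {"scib_bio_conservation": 0.60, "scib_batch_correction": 0.80},
}

# Rank runs from best to worst by their combined score.
ranked = sorted(
    runs,
    key=lambda name: total_score(
        runs[name]["scib_bio_conservation"],
        runs[name]["scib_batch_correction"],
    ),
    reverse=True,
)
print(ranked)
```

A run with weaker bio conservation can still rank first if its batch correction is sufficiently better, so it is worth inspecting both component scores and not only the total.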
Getting started
Please refer to the documentation, in particular, the API documentation.
Installation
You need to have Python 3.10 or newer installed on your system. If you don't have Python installed, we recommend installing uv.
There are several alternative options to install scembed:
- Install the latest development version:
pip install git+https://github.com/quadbio/scembed.git@main
Note: If you encounter C++ compilation errors (e.g., with louvain or annoy), install those packages via conda first:
mamba install louvain python-annoy
Dependency Groups
The package uses optional dependency groups to minimize installation overhead:
- Base: Core functionality (scanpy, scib-metrics, wandb)
- [cpu]: CPU-based methods (e.g. Harmony, LIGER, Scanorama)
- [gpu]: GPU-based methods (e.g. scVI, scANVI, scPoli)
- [fast_metrics]: Accelerated evaluation with faiss and RAPIDS
- [all]: All optional dependencies
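Since the optional groups are installed separately, it can help to check which of them are present in the current environment before launching a sweep. The following sketch uses only the standard library. The mapping from group to a representative import name (`harmonypy`, `scvi`, `faiss`) is an assumption for illustration; the package metadata is the authoritative list.

```python
from importlib.util import find_spec

# Assumed mapping from dependency group to one representative import name.
# These module names are guesses for illustration only.
GROUP_PROBES = {
    "cpu": "harmonypy",
    "gpu": "scvi",
    "fast_metrics": "faiss",
}


def installed_groups(probes=GROUP_PROBES):
    """Return {group: bool}; True if the probe module can be located."""
    return {
        group: find_spec(module) is not None
        for group, module in probes.items()
    }


for group, available in installed_groups().items():
    print(f"[{group}] {'available' if available else 'missing'}")
```

`find_spec` only locates a module without importing it, so the check stays fast even when heavy packages are installed.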
Release notes
See the changelog.
Contact
For questions and help requests, you can reach out in the scverse discourse. If you found a bug, please use the issue tracker.
Citation
t.b.a