Run embedding comparisons for single-cell data

🧬 Comparing embeddings for single-cell and spatial data

Single-cell RNA-sequencing (scRNA-seq) 🧪 measures gene expression in individual cells and generates large datasets. Typically, these datasets consist of several samples, each corresponding to a combination of covariates (e.g. patient, time point, disease status, or technology). Analyzing these datasets, which often contain millions of cells and thousands of genes, is facilitated by data integration approaches: these learn lower-dimensional representations that remove the effects of unwanted covariates (such as experimental batch or the chip the data was run on).

🎯 Overview

Here, we use slurm_sweep to efficiently parallelize and track different data integration approaches, and we compare their performance using scIB metrics (Luecken et al., 2022). For each data integration method, we compute a shared latent space, quantify integration performance in terms of batch correction and bio conservation, visualize the latent space with UMAP, save the model and embedding coordinates, and log all relevant data to wandb so that it can be retrieved after the sweep.
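To make the comparison step concrete, here is a minimal sketch of how per-method scores could be combined into a single ranking. It assumes the 40/60 weighting of batch correction and bio conservation described by Luecken et al. (2022); whether scembed aggregates scores exactly this way is not stated here. The helpers total_score and rank_methods are hypothetical, and the method names and numbers in TOY_RESULTS are made-up placeholders, not benchmark results.

```python
from statistics import mean

# Weights from Luecken et al. (2022): 40% batch correction, 60% bio conservation.
BATCH_WEIGHT = 0.4
BIO_WEIGHT = 0.6


def total_score(batch_metrics, bio_metrics):
    """Combine two groups of metrics (each scaled to [0, 1]) into one score."""
    batch = mean(batch_metrics.values())
    bio = mean(bio_metrics.values())
    return BATCH_WEIGHT * batch + BIO_WEIGHT * bio


def rank_methods(results):
    """Return (method, total score) pairs, best method first."""
    totals = {
        name: total_score(scores["batch"], scores["bio"])
        for name, scores in results.items()
    }
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Placeholder values for illustration only; these are NOT real results.
TOY_RESULTS = {
    "method_a": {"batch": {"ilisi": 0.50, "kbet": 0.50}, "bio": {"nmi": 1.00}},
    "method_b": {"batch": {"ilisi": 1.00, "kbet": 1.00}, "bio": {"nmi": 0.50}},
}

for method, score in rank_methods(TOY_RESULTS):
    print(f"{method}: {score:.2f}")
```

Because bio conservation carries the larger weight, a method that mixes batches perfectly but blurs cell identities can still rank below one that preserves biology with moderate batch mixing.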

scembed consists of shallow wrappers around commonly used integration tools, a class to facilitate scIB comparisons, and another class to retrieve and aggregate sweep results.

🚀 Getting started

Please refer to the documentation, in particular, the API documentation.

📦 Installation

You need to have Python 3.10 or newer installed on your system. If you don't have Python installed, we recommend installing uv.

There are several alternative options to install scembed:

  1. Install the latest release of scembed from PyPI:

     pip install scembed

  2. Install the latest development version:

     pip install git+https://github.com/quadbio/scembed.git@main

🎯 Dependency Groups

The package uses optional dependency groups to minimize installation overhead:

  • Base: Core functionality (scanpy, scib-metrics, wandb)
  • [cpu]: CPU-based methods (e.g. Harmony, LIGER, Scanorama)
  • [gpu]: GPU-based methods (e.g. scVI, scANVI, scPoli)
  • [fast_metrics]: Accelerated evaluation with faiss and RAPIDS
  • [all]: All optional dependencies
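Optional groups are selected with the standard package[extra] requirement syntax, for example pip install "scembed[cpu]". The sketch below builds such a requirement string and the matching pip command without running it. It assumes the bracketed group names listed above are the extras defined by the package; requirement and pip_command are hypothetical helpers for illustration, not part of scembed.

```python
import sys

# Assumption: these are the extras names, taken from the list above.
KNOWN_EXTRAS = {"cpu", "gpu", "fast_metrics", "all"}


def requirement(extras=()):
    """Build a pip requirement string such as 'scembed[cpu,gpu]'."""
    selected = sorted(set(extras))
    unknown = [name for name in selected if name not in KNOWN_EXTRAS]
    if unknown:
        raise ValueError(f"Unknown dependency group(s): {unknown}")
    if not selected:
        return "scembed"  # base install only
    return f"scembed[{','.join(selected)}]"


def pip_command(extras=()):
    """Return the pip invocation as an argument list (it is not executed here)."""
    return [sys.executable, "-m", "pip", "install", requirement(extras)]


print(" ".join(pip_command(["cpu", "fast_metrics"])))
```

Passing the returned list to subprocess.run would perform the installation with the same interpreter that runs the script, which avoids installing into the wrong environment.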

⚠️ Note: If you encounter C++ compilation errors (e.g., with louvain or annoy), install those packages via conda/mamba first:

mamba install louvain python-annoy
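After installing through conda/mamba, a quick way to confirm that the compiled packages are visible to the interpreter is to look them up without importing them. This is a minimal sketch, assuming the packages expose the top-level modules louvain and annoy; missing_modules is a hypothetical helper, not part of scembed.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


# Assumption: louvain and python-annoy provide the modules `louvain` and `annoy`.
COMPILED_DEPS = ("louvain", "annoy")

missing = missing_modules(COMPILED_DEPS)
if missing:
    print("Not found:", ", ".join(missing))
else:
    print("All compiled dependencies are available.")
```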

📝 Release notes

See the changelog.

💬 Contact

For questions and help requests, you can reach out in the scverse discourse. If you found a bug, please use the issue tracker.

📖 Citation

Please use our Zenodo entry to cite this software.
