Running single-cell analysis on Nvidia GPUs

Project description

rapids-singlecell

Background

This repository offers tools that speed up the analysis of single-cell datasets by running them on the GPU. The functions are GPU counterparts of functions found in scanpy from the Theis lab or in rapids-single-cell-examples created by the Nvidia RAPIDS team. Most functions are kept close to the original code to ensure compatibility. My aim with this repository is to combine the speedup that GPU computing offers with the ease of use of scanpy.

Installation

Conda

The easiest way to install rapids-singlecell is to use one of the YAML files provided in the conda folder. These YAML files install everything needed to run the example notebooks and get you started.

conda env create -f conda/rsc_rapids_22.12.yml
# or
mamba env create -f conda/rsc_rapids_23.02a.yml

PyPI

As of version 0.4.0, rapids-singlecell is available on PyPI.

pip install rapids-singlecell

The default installation does not include RAPIDS or CuPy. Information on how to install RAPIDS and CuPy can be found here.

If you want to use the new RAPIDS PyPI packages, the whole library with all dependencies can be installed with:

pip install 'rapids-singlecell[rapids]' --extra-index-url=https://pypi.ngc.nvidia.com

Please note that the RAPIDS PyPI packages are still considered experimental. It is important to ensure that the CUDA environment is set up correctly so that RAPIDS and CuPy can locate the necessary libraries.

For a full guide on how to set up a fully functional, GPU-accelerated single-cell conda environment, visit GPU_SingleCell_Setup.
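Whichever installation route you choose, a quick check of whether the GPU libraries can be found by Python can save some debugging. The following is a minimal sketch that uses only the standard library. The list of import names is an assumption of this example and not taken from the package metadata, and finding a package does not prove that CUDA itself is configured correctly.

```python
import importlib.util

# Import names of the GPU libraries discussed above. Which of them a given
# rapids-singlecell version actually requires is an assumption of this sketch.
GPU_PACKAGES = ("cupy", "cudf", "cuml", "cugraph")


def find_missing(packages=GPU_PACKAGES):
    """Return the names of the packages that Python cannot locate."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


missing = find_missing()
if missing:
    print("Not importable:", ", ".join(missing))
else:
    print("All listed GPU packages were found.")
```

An empty result only means the packages are installed; a mismatch between the CUDA runtime and the driver would still surface later, at import or at first use.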

Citation

If you use this code, please cite:

Please also consider citing rapids-single-cell-examples and scanpy.

In addition, please cite the original research articles for the methods you use, which are listed in the scanpy documentation.

If you use the accelerated decoupler functions, please cite decoupler.

Functionality

cunnData

The preprocessing of the single-cell data is performed with cunnData, a replacement for the AnnData object used by scanpy. The cunnData object is a cut-down version of an AnnData object. At its core lies a sparse matrix (.X) held in GPU memory. .obs and .var are pandas data frames and .uns is a dictionary. It also supports .layers, .varm and .obsm. .layers are stored on the GPU, while .obsm and .varm are not. Since version 0.3.0 you can use cunnData for spatial transcriptomics datasets.
cunnData includes methods for:

__getitem__ to filter the object based on .obs and .var
__repr__
transforming a cunnData object into an AnnData object
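To illustrate what filtering through __getitem__ means for such a container, here is a minimal CPU-only sketch. MiniData is a hypothetical stand-in written for this example: it uses a dense NumPy array instead of a sparse GPU matrix and is not the cunnData implementation.

```python
import numpy as np
import pandas as pd


class MiniData:
    """Hypothetical CPU stand-in for a cut-down AnnData-like container."""

    def __init__(self, X, obs, var):
        # One row of X per cell (.obs) and one column per gene (.var).
        assert X.shape == (len(obs), len(var))
        self.X, self.obs, self.var = X, obs, var

    def __getitem__(self, index):
        # Expect a (cell_mask, gene_mask) pair of boolean masks.
        cell_mask, gene_mask = index
        cell_mask = np.asarray(cell_mask, dtype=bool)
        gene_mask = np.asarray(gene_mask, dtype=bool)
        return MiniData(
            self.X[cell_mask][:, gene_mask],
            self.obs[cell_mask],
            self.var[gene_mask],
        )

    def __repr__(self):
        n_obs, n_vars = self.X.shape
        return f"MiniData with n_obs x n_vars = {n_obs} x {n_vars}"


X = np.arange(12, dtype=np.float32).reshape(3, 4)
obs = pd.DataFrame({"batch": ["a", "b", "a"]}, index=["c0", "c1", "c2"])
var = pd.DataFrame({"is_mt": [False, True, False, False]},
                   index=["g0", "g1", "g2", "g3"])
data = MiniData(X, obs, var)

# Keep the cells from batch "a" and drop the gene flagged as mitochondrial.
subset = data[data.obs["batch"] == "a", ~data.var["is_mt"]]
print(subset)
```

The point of the sketch is that the matrix and both annotation tables are subset together, so rows of .obs and columns of .var always stay aligned with .X.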

cunnData_funcs or pp

Most preprocessing functions of scanpy are reimplemented for the cunnData class. I tried to keep the inputs as close to the original scanpy implementation as possible. Please have a look at the notebooks to assess the functionality. I tried to write informative docstrings for each function.
cunnData_funcs includes functions for:

filter genes based on the number of cells expressing them
filter cells based on a multitude of parameters (e.g. number of expressed genes, mitochondrial content)
caluclate_qc (based on scanpy's pp.calculate_qc_metrics)
normalize_total
normalize based on pearson_residuals
log1p
highly_variable_genes
- seurat
- cellranger
- seurat_v3
- pearson_residuals
- poisson_gene_selection (adapted from scvi)
regress_out
scale
PCA (PCA / incremental PCA / truncated SVD)
some plotting functions for QC parameters
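For readers new to these steps, the following NumPy sketch shows what total-count normalization followed by log1p computes on a tiny dense matrix. It is a CPU illustration of the idea only. The function below and its target_sum default are assumptions of this example, not the signature of the GPU functions in this repository.

```python
import numpy as np


def normalize_total(counts, target_sum=1e4):
    """Scale each cell (row) so that its counts sum to target_sum."""
    totals = counts.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0  # cells without counts stay all zero
    return counts / totals * target_sum


# Two cells (rows) and three genes (columns) with raw counts.
counts = np.array([[1.0, 3.0, 0.0],
                   [10.0, 0.0, 10.0]])

normalized = normalize_total(counts)
logged = np.log1p(normalized)  # log(1 + x), keeps zeros at zero

print(normalized.sum(axis=1))
print(logged)
```

After normalization every cell has the same total, so differences in sequencing depth no longer dominate downstream steps such as highly variable gene selection or PCA.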

scanpy_gpu or tl

scanpy_gpu contains functions that work directly with an AnnData object and replace their scanpy counterparts by running on the GPU. Scanpy already supports GPU versions of pp.neighbors and tl.umap using RAPIDS.
scanpy_gpu includes additional functions for:

PCA (PCA / incremental PCA / truncated SVD)
Leiden Clustering
Louvain Clustering
TSNE
Kmeans Clustering
Kernel Density
Harmony Integration (gpu port of harmonypy)
Diffusion Maps
PyMDE (adapted from scvi)
Force Atlas 2 (draw_graph)
rank_genes_groups with logistic regression
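As a small illustration of the first entry in this list, the sketch below computes a PCA through a singular value decomposition with NumPy on the CPU. It only shows the underlying linear algebra. The helper name pca_svd is hypothetical, and the GPU functions in this repository are not claimed to be implemented this way.

```python
import numpy as np


def pca_svd(X, n_components=2):
    """PCA via SVD: center the data, decompose it, keep the leading components."""
    X_centered = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]  # cell embeddings
    components = Vt[:n_components]                   # gene loadings
    explained_variance = S[:n_components] ** 2 / (X.shape[0] - 1)
    return scores, components, explained_variance


rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))  # 50 cells, 5 genes of synthetic data
scores, components, variance = pca_svd(X, n_components=2)
print(scores.shape, components.shape)
```

A truncated SVD follows the same pattern but conventionally skips the centering step, which is what makes it attractive for large sparse matrices.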

decoupler_gpu

Decoupler is an amazing toolkit that contains different statistical methods to extract biological activities from omics data within a unified framework. So far I have reimplemented run_mlm and run_wsum to run on the GPU. As always, I tried to keep the syntax as close to the original as possible. decoupler_gpu also works with the same models as decoupler. For a closer look, please check out demo_gpu.ipynb in notebooks.
decoupler_gpu includes additional functions for:

run_mlm
run_wsum
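The core idea behind a weighted-sum score can be sketched in a few lines of NumPy. This is an illustration under simplifying assumptions, not the run_wsum implementation: the matrices are made up, the helper name weighted_sum is hypothetical, and any additional statistics the real function may report (for example permutation-based normalization) are left out.

```python
import numpy as np

# Expression matrix: 2 samples (rows) x 3 genes (columns), made-up values.
mat = np.array([[1.0, 2.0, 0.0],
                [0.0, 1.0, 4.0]])

# Network weights: 3 genes (rows) x 2 sources (columns), e.g. regulators.
# A zero means the gene is not a target of that source.
net = np.array([[1.0, 0.0],
                [0.5, -1.0],
                [0.0, 2.0]])


def weighted_sum(mat, net):
    """Activity per sample and source: sum of expression times weight."""
    return mat @ net


scores = weighted_sum(mat, net)
print(scores)
```

Because the score is a single matrix product, it maps naturally onto the GPU, which is where the speedup for this kind of method comes from.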

Notebooks

To show the capability of these functions, I created two example notebooks that evaluate the same workflow on the CPU and on the GPU. These notebooks should run in the environment that is described in Requirements. First, run the data_downloader notebook to create the AnnData object for the analysis. If you run both demo_cpu and demo_gpu, you should see a big speedup when running the analyses on the GPU.

Benchmarks

Here are some benchmarks. I ran the notebook on the CPU with as many cores as were available where possible.

Step	CPU (Ryzen 5950x, 32 Cores, 64GB RAM)	GPU (RTX 3090)	CPU (AMD Epyc Rome, 30 Cores, 500GB RAM)	GPU (Quadro RTX 6000)	GPU (A100 80GB)
Whole notebook	728 s	43 s	917 s	67 s	57 s
Preprocessing	75 s	21 s	40 s	34 s	30 s
Clustering and Visualization	423 s	18 s	524 s	27 s	21 s
Normalize_total	252 ms	< 1 ms	425 ms	1 ms	1 ms
Highly Variable Genes	3.2 s	2.6 s	4.1 s	2.7 s	3.7 s
Regress_out	63 s	2 s	24 s	2 s	2 s
Scale	1.3 s	299 ms	2 s	2 s	359 ms
PCA	26 s	1.8 s	23 s	3.6 s	2.6 s
Neighbors	10 s	5 s	16.8 s	8.1 s	6 s
UMAP	30 s	659 ms	66 s	1 s	783 ms
Louvain	16 s	121 ms	20 s	214 ms	201 ms
Leiden	11 s	102 ms	20 s	175 ms	152 ms
TSNE	240 s	1.4 s	319 s	1.8 s	1.4 s
Logistic_Regression	74 s	4 s	45 s	5 s	3.4 s
Diffusion Map	715 ms	259 ms	747 ms	431 ms	826 ms
Force Atlas 2	207 s	236 ms	300 s	298 ms	353 ms

I also observed that the first GPU run in a new environment is slower than the runs after that (with a restarted kernel) (RTX 6000).
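To put the table into perspective, the speedup of a step is simply the CPU time divided by the GPU time. The short sketch below does this arithmetic for a few rows, using the Ryzen 5950x and RTX 3090 columns. The selection of rows is arbitrary and the helper name speedup is introduced only for this example.

```python
# (CPU seconds, GPU seconds) taken from the table above:
# Ryzen 5950x column versus RTX 3090 column.
timings = {
    "whole notebook": (728, 43),
    "Regress_out": (63, 2),
    "PCA": (26, 1.8),
    "TSNE": (240, 1.4),
}


def speedup(cpu_seconds, gpu_seconds):
    """How many times faster the GPU run is than the CPU run."""
    return cpu_seconds / gpu_seconds


for step, (cpu, gpu) in timings.items():
    print(f"{step}: {speedup(cpu, gpu):.1f}x")
```

For the whole notebook this gives roughly a 17-fold speedup, while individual steps such as TSNE gain far more than the average.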
