k-NN-based mapping of cells across representations
CellMapper
k-NN-based mapping of cells across representations to transfer labels, embeddings and expression values. Works for millions of cells, on CPU and GPU, across molecular modalities, between spatial and non-spatial data, for arbitrary query and reference datasets. Using faiss to compute k-NN graphs, CellMapper takes about 30 seconds to transfer cell type labels from 1.5M cells to 1.5M cells on a single RTX 4090 with 60 GB CPU memory.
Inspired by previous tools, including scanpy's ingest and the HNOCA-tools package. Check out the 📚 docs to learn more, in particular our tutorials.
✨ Key use cases
- 🧬 Transfer cell type labels and expression values from dissociated to spatial datasets.
- ↔️ Transfer embeddings between arbitrary query and reference datasets.
- 📊 Compute presence scores for query datasets in large reference atlases.
- 🗺️ Identify niches in spatial datasets by contextualizing latent spaces in spatial coordinates.
- 📈 Evaluate the results of transferring labels, embeddings and feature spaces using a variety of metrics.
The core idea of CellMapper is to separate the method (k-NN graph with some kernel applied to get a mapping matrix) from the application (mapping across arbitrary representations), to be flexible and fast. The tool currently supports pynndescent, sklearn, faiss and rapids for neighborhood search, implements a variety of graph kernels, and is closely integrated with AnnData objects.
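To make the "k-NN graph plus kernel gives a mapping matrix" idea concrete, here is a minimal NumPy-only sketch. It is not CellMapper's implementation: the function name `knn_mapping_matrix`, the brute-force distance computation, and the Gaussian kernel with a per-cell bandwidth are all assumptions chosen for illustration. A real run on millions of cells would use an approximate k-NN backend and a sparse matrix.

```python
import numpy as np


def knn_mapping_matrix(query_emb, ref_emb, k=3, bandwidth=None):
    """Return a row-normalized (n_query x n_ref) mapping matrix.

    Illustrative only: brute-force distances and a Gaussian kernel stand in
    for the approximate k-NN backends and kernels a real tool would offer.
    """
    # Pairwise squared Euclidean distances between query and reference cells.
    diff = query_emb[:, None, :] - ref_emb[None, :, :]
    sq_dist = np.einsum("qrd,qrd->qr", diff, diff)

    # Indices of the k nearest reference cells for each query cell.
    nn_idx = np.argpartition(sq_dist, kth=k - 1, axis=1)[:, :k]
    nn_sq_dist = np.take_along_axis(sq_dist, nn_idx, axis=1)

    # Gaussian kernel; default bandwidth is the per-cell mean neighbor distance.
    if bandwidth is None:
        bandwidth = np.sqrt(nn_sq_dist).mean(axis=1, keepdims=True) + 1e-12
    weights = np.exp(-nn_sq_dist / (2.0 * bandwidth**2))

    # Scatter the weights into a dense matrix and normalize rows to sum to 1.
    mapping = np.zeros_like(sq_dist)
    np.put_along_axis(mapping, nn_idx, weights, axis=1)
    mapping /= mapping.sum(axis=1, keepdims=True)
    return mapping


# Hypothetical joint embedding: 50 reference and 10 query cells in 8 dimensions.
rng = np.random.default_rng(0)
ref_emb = rng.normal(size=(50, 8))
query_emb = rng.normal(size=(10, 8))
mapping = knn_mapping_matrix(query_emb, ref_emb, k=5)
```

Each row of `mapping` holds the weights of one query cell over its k reference neighbors, so the same matrix can be reused for any quantity stored on the reference cells.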
📦 Installation
You need to have 🐍 Python 3.11 or newer installed on your system. If you don't have Python installed, we recommend installing uv.
There are two alternative options to install cellmapper:
- 🚀 Install the latest release from PyPI:
  pip install cellmapper
- 🛠️ Install the latest development version:
  pip install git+https://github.com/quadbio/cellmapper.git@main
🏁 Getting started
This package assumes that you have query and reference AnnData objects with a joint embedding computed and stored in .obsm. We implement some baseline approaches to compute joint embeddings (PCA and a fast reimplementation of CCA), but we typically expect you to provide a pre-computed joint embedding from a task-specific representation learning tool, e.g. GimVI or ENVI for spatial mapping, GLUE, MIDAS and MOFA+ for modality translation, and scVI, scANVI and scArches for query-to-reference mapping. This is just a small selection!
With a joint embedding in .obsm["X_joint"] at hand, the simplest way to use CellMapper is as follows:
from cellmapper import CellMapper

cmap = CellMapper(query, reference).map(
    use_rep="X_joint", obs_keys="celltype", obsm_keys="X_umap", layer_key="X"
)
This will transfer data from the reference to the query dataset, including celltype labels stored in reference.obs, a UMAP embedding stored in reference.obsm, and expression values stored in reference.X.
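Conceptually, all three transfers can be expressed as a product with a row-normalized mapping matrix. The sketch below illustrates this with a hand-written toy matrix and hypothetical reference annotations. It does not call CellMapper and is not a description of its internals: categorical labels are one-hot encoded and resolved by taking the most probable class, while continuous data is averaged over neighbors.

```python
import numpy as np

# Toy mapping matrix: 2 query cells (rows) x 3 reference cells (columns).
# Each row sums to 1; the values are made up for illustration.
mapping = np.array([
    [0.7, 0.3, 0.0],
    [0.0, 0.2, 0.8],
])

# Hypothetical reference annotations.
ref_labels = np.array(["neuron", "neuron", "glia"])
ref_umap = np.array([[0.0, 0.0], [1.0, 0.0], [4.0, 2.0]])

# Categorical labels: one-hot encode, propagate, take the most probable class.
classes, codes = np.unique(ref_labels, return_inverse=True)
one_hot = np.eye(len(classes))[codes]
label_probs = mapping @ one_hot
query_labels = classes[label_probs.argmax(axis=1)]
query_confidence = label_probs.max(axis=1)

# Continuous data (embeddings, expression): weighted average over neighbors.
query_umap = mapping @ ref_umap

print(query_labels)      # ['neuron' 'glia']
print(query_confidence)  # [1.  0.8]
```

The per-cell maximum probability gives a simple confidence score for each transferred label, which is one natural input for evaluating a transfer.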
There are many ways to customize this, e.g. by choosing how k-NN graphs are computed and how they are turned into mapping matrices, and we implement a few methods to evaluate whether your k-NN transfer was successful. The tool also implements a self-mapping mode (only a query object, no reference), which is useful for spatial contextualization and data denoising. Check out the 📚 docs to learn more.
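As a rough illustration of why self-mapping helps with denoising, the following NumPy sketch builds a k-NN graph within a single dataset using spatial coordinates and replaces each cell's profile with its neighborhood mean. The data, the uniform kernel and all variable names are assumptions for this example; CellMapper's own self-mapping mode may differ in its details.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cells, n_genes, k = 200, 5, 10

# Hypothetical spatial coordinates and noisy expression for a single dataset.
coords = rng.uniform(0, 10, size=(n_cells, 2))
signal = np.sin(coords[:, :1]) + np.zeros((n_cells, n_genes))  # smooth in space
noisy = signal + rng.normal(scale=0.5, size=signal.shape)

# k nearest spatial neighbors of every cell (each cell is its own nearest neighbor).
sq_dist = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=-1)
nn_idx = np.argsort(sq_dist, axis=1)[:, :k]

# Uniform-kernel self-mapping: replace each profile by its neighborhood mean.
smoothed = noisy[nn_idx].mean(axis=1)

# Mean squared error against the noise-free signal, before and after smoothing.
err_before = np.mean((noisy - signal) ** 2)
err_after = np.mean((smoothed - signal) ** 2)
```

Averaging over spatial neighbors trades a small amount of spatial blurring for a large reduction in per-cell noise, so the choice of `k` controls that trade-off.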
📝 Release notes
See the changelog.
📬 Contact
If you found a bug, please use the issue tracker.
📖 Citation
Please use our Zenodo entry to cite this software.
File details
Details for the file cellmapper-0.2.5.tar.gz.
File metadata
- Download URL: cellmapper-0.2.5.tar.gz
- Upload date:
- Size: 10.7 MB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3af52a6dc4627a85ce71b29673df2e30fd72a0136ac5cfcd865413fcd6584a58 |
| MD5 | 7d21f2de9317fa7d7dd2ae4cf338ab63 |
| BLAKE2b-256 | 1165e049e796884e9c10925dbdc3f86e3e9d5c07a5296309225d4d662ae302ce |
Provenance
The following attestation bundles were made for cellmapper-0.2.5.tar.gz:
Publisher: release.yaml on quadbio/cellmapper
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: cellmapper-0.2.5.tar.gz
- Subject digest: 3af52a6dc4627a85ce71b29673df2e30fd72a0136ac5cfcd865413fcd6584a58
- Sigstore transparency entry: 831652564
- Sigstore integration time:
- Permalink: quadbio/cellmapper@a3fdfe9fdc816302a4e0573381ae5fec4849b118
- Branch / Tag: refs/tags/v0.2.5
- Owner: https://github.com/quadbio
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yaml@a3fdfe9fdc816302a4e0573381ae5fec4849b118
- Trigger Event: release
File details
Details for the file cellmapper-0.2.5-py3-none-any.whl.
File metadata
- Download URL: cellmapper-0.2.5-py3-none-any.whl
- Upload date:
- Size: 49.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | da442394ce83f3455a9596809309514817c0cb34acc38f244a4ca61ef722a24b |
| MD5 | 61a93c6b5ca2a611701bc2f425a058a0 |
| BLAKE2b-256 | 0d262e08b2f664d58499dc32fb8d82f305593b2bd66af35ed9ebbb25eb1d54a6 |
Provenance
The following attestation bundles were made for cellmapper-0.2.5-py3-none-any.whl:
Publisher: release.yaml on quadbio/cellmapper
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: cellmapper-0.2.5-py3-none-any.whl
- Subject digest: da442394ce83f3455a9596809309514817c0cb34acc38f244a4ca61ef722a24b
- Sigstore transparency entry: 831652575
- Sigstore integration time:
- Permalink: quadbio/cellmapper@a3fdfe9fdc816302a4e0573381ae5fec4849b118
- Branch / Tag: refs/tags/v0.2.5
- Owner: https://github.com/quadbio
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yaml@a3fdfe9fdc816302a4e0573381ae5fec4849b118
- Trigger Event: release