k-NN-based mapping of cells across representations

CellMapper

k-NN-based mapping of cells across representations to transfer labels, embeddings and expression values. Works for millions of cells, on CPU and GPU, across molecular modalities, between spatial and non-spatial data, for arbitrary query and reference datasets. Using faiss to compute k-NN graphs, CellMapper takes about 30 seconds to transfer cell type labels from 1.5M cells to 1.5M cells on a single RTX 4090 with 60 GB CPU memory.

Inspired by scanpy's ingest and the HNOCA-tools packages. Check out the docs to learn more, in particular our tutorials.

Key use cases

  • Transfer cell type labels and expression values from dissociated to spatial datasets.
  • Transfer embeddings between arbitrary query and reference datasets.
  • Compute presence scores for query datasets in large reference atlases.
  • Identify niches in spatial datasets by contextualizing latent spaces in spatial coordinates.
  • Evaluate the results of transferring labels, embeddings and feature spaces using a variety of metrics.
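
To make the last use case concrete, here is a minimal sketch of how a label transfer could be scored against held-out ground-truth labels. The function names `label_transfer_accuracy` and `per_class_recall` are hypothetical and are not part of CellMapper; the metrics that CellMapper actually provides are described in its docs.

```python
import numpy as np


def label_transfer_accuracy(true_labels, transferred_labels):
    """Fraction of query cells whose transferred label matches the held-out label."""
    true_labels = np.asarray(true_labels)
    transferred_labels = np.asarray(transferred_labels)
    return float((true_labels == transferred_labels).mean())


def per_class_recall(true_labels, transferred_labels):
    """For each true class, the fraction of its cells that received the right label."""
    true_labels = np.asarray(true_labels)
    transferred_labels = np.asarray(transferred_labels)
    return {
        str(label): float((transferred_labels[true_labels == label] == label).mean())
        for label in np.unique(true_labels)
    }


# Toy example: five query cells with known labels, one of them mislabeled.
true = ["T cell", "T cell", "B cell", "B cell", "NK cell"]
pred = ["T cell", "T cell", "B cell", "T cell", "NK cell"]

accuracy = label_transfer_accuracy(true, pred)
recall = per_class_recall(true, pred)
print(accuracy, recall)
```

Per-class recall is worth reporting next to overall accuracy because rare cell types can be mapped poorly without moving the global number much.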

The core idea of CellMapper is to separate the method (k-NN graph with some kernel applied to get a mapping matrix) from the application (mapping across arbitrary representations), to be flexible and fast. The tool currently supports pynndescent, sklearn, faiss and rapids for neighborhood search, implements a variety of graph kernels, and is closely integrated with AnnData objects.
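
The separation of method and application can be illustrated with a small NumPy sketch. It is not CellMapper's implementation: the brute-force neighbor search, the Gaussian kernel with a per-cell bandwidth, and the function names `knn_mapping_matrix` and `transfer_labels` are all assumptions chosen for illustration. The point is that once a row-normalized mapping matrix exists, the same matrix transfers labels and continuous values alike.

```python
import numpy as np


def knn_mapping_matrix(query_rep, ref_rep, k=3):
    """Return a row-normalized (n_query, n_ref) mapping matrix from a k-NN search."""
    # Brute-force pairwise Euclidean distances; large datasets need approximate search.
    diff = query_rep[:, None, :] - ref_rep[None, :, :]
    dist = np.sqrt((diff**2).sum(axis=-1))

    # Indices and distances of the k nearest reference cells for each query cell.
    idx = np.argsort(dist, axis=1)[:, :k]
    rows = np.arange(query_rep.shape[0])[:, None]
    knn_dist = dist[rows, idx]

    # Gaussian kernel; the bandwidth is the distance to the k-th neighbor (an assumption).
    sigma = np.maximum(knn_dist[:, -1:], 1e-12)
    weights = np.exp(-((knn_dist / sigma) ** 2))

    mapping = np.zeros_like(dist)
    mapping[rows, idx] = weights
    return mapping / mapping.sum(axis=1, keepdims=True)


def transfer_labels(mapping, ref_labels):
    """Transfer categorical labels by a weighted vote over reference neighbors."""
    categories, codes = np.unique(ref_labels, return_inverse=True)
    one_hot = np.eye(len(categories))[codes]
    scores = mapping @ one_hot  # shape (n_query, n_categories), rows sum to 1
    return categories[scores.argmax(axis=1)], scores.max(axis=1)


# Toy joint embedding with two well-separated groups of reference cells.
ref_rep = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
ref_labels = np.array(["A", "A", "A", "B", "B", "B"])
query_rep = np.array([[0.05, 0.05], [5.05, 5.05]])

mapping = knn_mapping_matrix(query_rep, ref_rep, k=3)
labels, confidence = transfer_labels(mapping, ref_labels)
transferred_rep = mapping @ ref_rep  # continuous values transfer as a weighted average
print(labels, confidence)
```

Swapping the neighbor search or the kernel only changes how `mapping` is built; the transfer step stays a matrix product.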

Installation

You need to have Python 3.10 or newer installed on your system. If you don't have Python installed, we recommend installing uv.

There are two alternative options to install cellmapper:

  • Install the latest release from PyPI:

    pip install cellmapper
    
  • Install the latest development version:

    pip install git+https://github.com/quadbio/cellmapper.git@main
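
After installing, a short standard-library check can confirm that the interpreter meets the Python 3.10 requirement and that the package is visible to it. The helper name `check_environment` is hypothetical; the distribution name `cellmapper` is taken from the `pip install` command above.

```python
import sys
from importlib import metadata


def check_environment(package="cellmapper", min_python=(3, 10)):
    """Report whether the interpreter is recent enough and the package is installed."""
    try:
        installed_version = metadata.version(package)
    except metadata.PackageNotFoundError:
        # The package is not installed in the environment running this script.
        installed_version = None
    return {
        "python_ok": sys.version_info >= min_python,
        "installed_version": installed_version,
    }


report = check_environment()
print(report)
```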
    

Getting started

This package assumes that you have query and reference AnnData objects, with a joint embedding computed and stored in .obsm. We explicitly do not compute this joint embedding, but there are plenty of methods you can use to get one, e.g. GimVI or ENVI for spatial mapping, GLUE, MIDAS and MOFA+ for modality translation, and scVI, scANVI and scArches for query-to-reference mapping. This is just a small selection!

With a joint embedding in .obsm["X_joint"] at hand, the simplest way to use CellMapper is as follows:

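from cellmapper import CellMapper

# `query` and `reference` are AnnData objects that both store the joint embedding in .obsm["X_joint"].
cmap = CellMapper(query, reference).map(
    use_rep="X_joint", obs_keys="celltype", obsm_keys="X_umap", layer_key="X"
)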

This will transfer data from the reference to the query dataset, including celltype labels stored in reference.obs, a UMAP embedding stored in reference.obsm, and expression values stored in reference.X.

There are many ways to customize this, e.g. by choosing a different method to compute the k-NN graph or to turn it into a mapping matrix, and we implement a few methods to evaluate whether your k-NN transfer was successful. The tool also implements a self-mapping mode (only a query object, no reference), which is useful for spatial contextualization. Check out the docs to learn more.
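
The idea behind spatial contextualization in a self-mapping setting can be sketched in plain NumPy: build the k-NN graph on spatial coordinates of a single dataset and average each cell's latent vector over its spatial neighborhood. This is a simplified illustration with uniform weights, not CellMapper's self-mapping API; the function name `spatial_context` and the toy data are assumptions.

```python
import numpy as np


def spatial_context(coords, latent, k=2):
    """Average each cell's latent vector over itself and its k nearest spatial neighbors."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff**2).sum(axis=-1))
    # Each cell is its own nearest neighbor (distance 0), so take k + 1 columns.
    idx = np.argsort(dist, axis=1)[:, : k + 1]
    return latent[idx].mean(axis=1)


# Toy tissue with two spatially separated groups of three cells each.
coords = np.array(
    [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [10.0, 10.0], [11.0, 10.0], [10.0, 11.0]]
)
# Toy latent space: the first group mixes two cell states, the second is homogeneous.
latent = np.array(
    [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]
)

context = spatial_context(coords, latent, k=2)
print(context)
```

Cells in the same spatial neighborhood end up with the same context vector even when their own latent states differ, which is what makes the contextualized representation a candidate input for niche identification, for example by clustering it.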

Release notes

See the changelog.

Contact

If you found a bug, please use the issue tracker.

Citation

Please use our Zenodo entry to cite this software.
