A method to demultiplex hashtagged single-cell data.
HTO DND - Demultiplex Hashtag Data
hto is a Python package designed for efficient and accurate demultiplexing of single-cell data labeled with hashtag oligonucleotides (HTOs).
It normalises HTO counts against the observed background signal, denoises them to remove batch effects and noise, and then demultiplexes the cells:
- Normalization: Normalize HTO data using background signal, inspired by the DSB method (see citation below).
- Denoising: Remove batch effects and noise from the data by regressing out cell-to-cell variation.
- Demultiplexing: Cluster and classify cells into singlets, doublets, or negatives using clustering methods like k-means or Gaussian Mixture Models (GMM).
The package supports command-line interface (CLI) usage and Python imports.
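To make the normalisation and demultiplexing steps above concrete, the following is a minimal standalone sketch, not the package's implementation. It assumes a simplified background normalisation (z-scoring log-transformed counts against empty-droplet counts, loosely in the spirit of DSB), skips the denoising step, and swaps a fixed threshold in for k-means or GMM clustering. The function names, the threshold of 3.0, and the toy counts are all hypothetical.

```python
import math
import statistics


def normalize_by_background(cell_counts, background_counts):
    """Z-score log1p counts of one HTO against empty-droplet background."""
    bg_log = [math.log1p(c) for c in background_counts]
    bg_mean = statistics.fmean(bg_log)
    bg_sd = statistics.stdev(bg_log)
    return [(math.log1p(c) - bg_mean) / bg_sd for c in cell_counts]


def classify_cell(normalized, hto_names, threshold=3.0):
    """Label one cell from its normalized values (one value per HTO).

    A fixed threshold stands in for the clustering step (assumption).
    """
    positive = [name for name, value in zip(hto_names, normalized) if value > threshold]
    if not positive:
        return "negative"
    if len(positive) == 1:
        return positive[0]  # singlet: report the single positive hashtag
    return "doublet"


# Toy data: raw counts per HTO for five background droplets and three cells.
hto_names = ["HTO_A", "HTO_B"]
background = {"HTO_A": [2, 3, 1, 4, 2], "HTO_B": [1, 2, 3, 2, 1]}
cells = {"HTO_A": [250, 3, 300], "HTO_B": [2, 2, 280]}

normalized = {h: normalize_by_background(cells[h], background[h]) for h in hto_names}
labels = [
    classify_cell([normalized[h][i] for h in hto_names], hto_names)
    for i in range(3)
]
print(labels)
```

In this toy example the first cell is high only for HTO_A, the second is at background level for both hashtags, and the third is high for both, which is the pattern the singlet, negative, and doublet classes are meant to capture.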
Installation
Using pip:
pip install hto
From source:
git clone https://github.com/sail-mskcc/hto_dnd.git
cd hto_dnd
pip install .
Usage
Python API
The Python API is built around AnnData. It is highly recommended to work with three AnnData objects:
- adata_hto: Filtered AnnData object with HTO data, containing only actual cells.
- adata_hto_raw: Raw AnnData object with HTO data, containing actual cells and background signal.
- adata_gex: Raw AnnData object with gene expression data. This is optional and can be used to construct a more informative background signal.
import hto
# get mockdata
mockdata = hto.data.generate_hto(n_cells=1000, n_htos=3, seed=10)
adata_hto = mockdata["filtered"]
adata_hto_raw = mockdata["raw"]
adata_gex = mockdata["gex"]
# denoise, normalize, and demultiplex
adata_demux = hto.demultiplex(
    adata_hto,
    adata_hto_raw,
    adata_gex=adata_gex,
)
# see results
adata_demux.obs[["hash_id", "doublet_info"]].head()
Command-Line Interface (CLI)
The CLI exposes the same workflow through the hto demultiplex command. Make sure to set --adata-out so that the output is saved.
hto demultiplex \
    --adata-hto /path/to/adata_hto.h5ad \
    --adata-hto-raw /path/to/adata_hto_raw.h5ad \
    --adata-gex /path/to/adata_gex.h5ad \
    --adata-out /path/to/output.h5ad
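When several samples need to be processed, the command above can be assembled programmatically. The sketch below is only an illustration: the helper name and the directory layout (one folder per sample holding adata_hto.h5ad, adata_hto_raw.h5ad, and adata_gex.h5ad) are assumptions, and only the flags shown above are used.

```python
import shlex
from pathlib import Path


def build_demultiplex_command(sample_dir, out_dir):
    """Build the argument list for one `hto demultiplex` call.

    Assumes a hypothetical layout with the three input files in sample_dir.
    """
    sample_dir = Path(sample_dir)
    return [
        "hto", "demultiplex",
        "--adata-hto", str(sample_dir / "adata_hto.h5ad"),
        "--adata-hto-raw", str(sample_dir / "adata_hto_raw.h5ad"),
        "--adata-gex", str(sample_dir / "adata_gex.h5ad"),
        "--adata-out", str(Path(out_dir) / f"{sample_dir.name}.h5ad"),
    ]


cmd = build_demultiplex_command("/data/sample1", "/results")
print(shlex.join(cmd))  # shell-safe rendering of the command for inspection
# To execute it: import subprocess; subprocess.run(cmd, check=True)
```

Passing the command as a list rather than a single string avoids shell quoting problems when paths contain spaces.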
Data Requirements
HTO-DND requires data from cell hashing experiments where samples are labeled with hashtagged antibodies:
- HTO data (adata_hto): Filtered cell × HTO count matrix in AnnData format.
- Raw HTO data (adata_hto_raw): Unfiltered barcode × HTO count matrix including empty droplets. Required for background estimation.
- Gene expression data (adata_gex, recommended): Cell × gene count matrix for improved background estimation.
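Before running the pipeline, it can help to confirm that the filtered and raw matrices fit together as described above. The following is a hypothetical pre-flight check, not part of the package. It assumes that every filtered barcode should also appear in the raw matrix, that the raw matrix should hold additional barcodes (the empty droplets), and that both matrices list the same HTOs in the same order.

```python
def check_hto_inputs(filtered_barcodes, raw_barcodes, filtered_htos, raw_htos):
    """Return a list of problems; an empty list means the inputs look consistent."""
    problems = []
    filtered_set, raw_set = set(filtered_barcodes), set(raw_barcodes)

    missing = filtered_set - raw_set
    if missing:
        problems.append(f"{len(missing)} filtered barcode(s) are absent from the raw matrix")

    # Without extra barcodes there are no empty droplets to estimate background from.
    if not raw_set - filtered_set:
        problems.append("raw matrix contains no barcodes beyond the filtered cells")

    if list(filtered_htos) != list(raw_htos):
        problems.append("HTO names differ between the filtered and raw matrices")

    return problems


ok = check_hto_inputs(
    ["AAAC", "AAAG"], ["AAAC", "AAAG", "TTTG"], ["HTO_A", "HTO_B"], ["HTO_A", "HTO_B"]
)
bad = check_hto_inputs(["AAAC", "CCCC"], ["AAAC"], ["HTO_A"], ["HTO_B"])
print(ok, bad)
```

With AnnData objects, the barcodes and HTO names would typically come from obs_names and var_names, for example check_hto_inputs(adata_hto.obs_names, adata_hto_raw.obs_names, adata_hto.var_names, adata_hto_raw.var_names).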
File details
Details for the file hto-1.1.7a0.tar.gz.
File metadata
- Download URL: hto-1.1.7a0.tar.gz
- Upload date:
- Size: 41.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.1.3 CPython/3.12.9 Linux/4.18.0-425.19.2.el8_7.x86_64
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 61532260151e7215ed76e976411ac41f03293799c7e52e219f43aa707ea6e61a |
| MD5 | 4a87906527a738a67769f144de6fe307 |
| BLAKE2b-256 | 81d94340add75e6739c486b997b285707bed4e5b79c490a399d0117ba276e3aa |
File details
Details for the file hto-1.1.7a0-py3-none-any.whl.
File metadata
- Download URL: hto-1.1.7a0-py3-none-any.whl
- Upload date:
- Size: 53.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.1.3 CPython/3.12.9 Linux/4.18.0-425.19.2.el8_7.x86_64
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4204bd572b83302cbe17c0cec51dc2a6723e3b4052da0c0890768e5a28d4f6f1 |
| MD5 | 8d0d286da6244a30971b749db56b62dd |
| BLAKE2b-256 | d59e37a36edf9335e8d751de17f18bd209d1f3648b77c70ace2b74b9389b34e0 |