Python bindings to the SingleR algorithm to annotate cell types from known references.
Project description
Tinder for single-cell data
Overview
This package provides Python bindings to the C++ implementation of the SingleR algorithm, originally developed by Aran et al. (2019). It is designed to annotate cell types by matching cells to known references based on their expression profiles. So kind of like Tinder, but for cells.
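To make the idea of "matching cells to known references based on their expression profiles" concrete, here is a toy sketch in pure Python. It assumes Spearman rank correlation as the similarity measure and uses made-up expression values over five genes. It is only an illustration of the general idea, not the package's implementation, which does considerably more (for example, it works on selected genes and many reference samples per label). The helper names `ranks`, `spearman` and `best_label` are hypothetical.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


def best_label(cell, reference_profiles):
    """Score a cell against each labelled profile and return the top label."""
    scores = {
        label: spearman(cell, profile)
        for label, profile in reference_profiles.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical expression values over the same five genes.
reference_profiles = {
    "B-cells": [9.0, 8.0, 1.0, 0.5, 2.0],
    "Monocytes": [1.0, 0.5, 9.0, 8.0, 3.0],
}
cell = [0.0, 1.0, 12.0, 7.0, 2.0]

label, scores = best_label(cell, reference_profiles)
print(label, scores)
```

Rank-based correlation is a natural choice for a sketch like this because it only depends on the ordering of genes within each profile, so differences in scale between the test and reference data matter less.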
Quick start
First, let's load the well-known PBMC 4k dataset from 10x Genomics:
import singlecellexperiment as sce
data = sce.read_tenx_h5("pbmc4k-tenx.h5", realize_assays=True)
mat = data.assay("counts")
features = [str(x) for x in data.row_data["name"]]
Or, if you are coming from the scverse ecosystem (i.e., AnnData), simply read the object as a SingleCellExperiment and extract the matrix and the features.
Read more in the SingleCellExperiment documentation.
import singlecellexperiment as sce
sce_adata = sce.SingleCellExperiment.from_anndata(adata)
# or from a h5ad file
sce_h5ad = sce.read_h5ad("tests/data/adata.h5ad")
Now, we fetch the Blueprint/ENCODE reference:
import celldex
ref_data = celldex.fetch_reference("blueprint_encode", "2024-02-26", realize_assays=True)
We can annotate each cell in mat with the reference:
import singler
results = singler.annotate_single(
    test_data = mat,
    test_features = features,
    ref_data = ref_data,
    ref_labels = "label.main",
)
The results data frame contains all of the assignments and the scores for each label:
results.column("best")
## ['Monocytes',
## 'Monocytes',
## 'Monocytes',
## 'CD8+ T-cells',
## 'CD4+ T-cells',
## 'CD8+ T-cells',
## 'Monocytes',
## 'Monocytes',
## 'B-cells',
## ...
## ]
results.column("scores").column("Macrophages")
## array([0.35935275, 0.40833545, 0.37430726, ..., 0.32135929, 0.29728435,
## 0.40208581])
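A common next step is to tally how many cells were assigned to each label. The sketch below uses only the standard library and a hard-coded list standing in for `results.column("best")`; the list simply repeats the first nine labels printed above, so the counts are illustrative rather than a summary of the full dataset.

```python
from collections import Counter

# Stand-in for results.column("best"): the first nine labels shown above.
best = [
    "Monocytes", "Monocytes", "Monocytes",
    "CD8+ T-cells", "CD4+ T-cells", "CD8+ T-cells",
    "Monocytes", "Monocytes", "B-cells",
]

# Count the cells per label and print them from most to least frequent.
counts = Counter(best)
for label, n in counts.most_common():
    print(f"{label}: {n}")
```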
Calling low-level functions
The annotate_single() function is a convenient wrapper around a number of lower-level functions in singler.
Advanced users may prefer to build the reference and run the classification separately.
This allows us to re-use the same reference for multiple datasets without repeating the build step.
built = singler.build_single_reference(
    ref_data=ref_data.assay("logcounts"),
    ref_labels=ref_data.col_data.column("label.main"),
    ref_features=ref_data.get_row_names(),
    restrict_to=features,
)
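The `restrict_to=features` argument limits the reference to features that are also available in the test dataset. As a rough illustration of that idea (not the library's internals), the sketch below finds which reference features also appear in the test data. The helper `shared_feature_indices` and the gene names are hypothetical.

```python
def shared_feature_indices(ref_features, test_features):
    """Indices into ref_features of the features also present in test_features."""
    wanted = set(test_features)  # set lookup keeps this linear in the input size
    return [i for i, name in enumerate(ref_features) if name in wanted]


# Hypothetical feature names for a reference and a test dataset.
ref_features = ["CD3E", "CD19", "CD14", "NKG7"]
test_features = ["CD14", "MS4A1", "CD3E"]

keep = shared_feature_indices(ref_features, test_features)
kept_names = [ref_features[i] for i in keep]
print(keep, kept_names)
```

The reference order is preserved here, so the returned indices can be used directly to subset the rows of a reference matrix.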
And finally, we apply the pre-built reference to the test dataset to obtain our label assignments.
This can be repeated with different datasets that have the same features or a superset of features.
output = singler.classify_single_reference(
    mat,
    test_features=features,
    ref_prebuilt=built,
)
## output
BiocFrame with 4340 rows and 3 columns
best scores delta
<list> <BiocFrame> <ndarray[float64]>
[0] Monocytes 0.33265560369962943:0.407117403330602... 0.40706830113982534
[1] Monocytes 0.4078771641637374:0.4783396310685646... 0.07000418564184802
[2] Monocytes 0.3517036021728629:0.4076971245524348... 0.30997293412307647
... ... ...
[4337] NK cells 0.3472631136865701:0.3937898240670208... 0.09640242155786138
[4338] B-cells 0.26974632191999887:0.334862058137758... 0.061215905058676856
[4339] Monocytes 0.39390119034537324:0.468867490667427... 0.06678168346812047
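The delta column can be used to flag assignments that may deserve a second look. The sketch below assumes that a larger delta indicates a clearer separation between the best label and the alternatives; that interpretation, the helper `flag_ambiguous`, and the cutoff of 0.08 are all assumptions made for illustration, so check the singler documentation before relying on them. The values are rounded from the six rows printed above.

```python
# Stand-ins for output.column("best") and output.column("delta"),
# rounded from the six rows shown above.
best = ["Monocytes", "Monocytes", "Monocytes", "NK cells", "B-cells", "Monocytes"]
delta = [0.4071, 0.0700, 0.3100, 0.0964, 0.0612, 0.0668]


def flag_ambiguous(labels, deltas, min_delta):
    """Replace labels whose delta is below min_delta with None."""
    return [
        label if d >= min_delta else None
        for label, d in zip(labels, deltas)
    ]


# 0.08 is an arbitrary, hypothetical cutoff chosen for this example.
pruned = flag_ambiguous(best, delta, min_delta=0.08)
print(pruned)
```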
Integrating labels across references
We can use annotations from multiple references through the annotate_integrated() function:
import singler
import celldex
blueprint_ref = celldex.fetch_reference("blueprint_encode", "2024-02-26", realize_assays=True)
immune_cell_ref = celldex.fetch_reference("dice", "2024-02-26", realize_assays=True)
single_results, integrated = singler.annotate_integrated(
    mat,
    features,
    ref_data_list = (blueprint_ref, immune_cell_ref),
    ref_labels_list = "label.main",
    num_threads = 6
)
This annotates the test dataset against each reference individually to obtain the best per-reference label, and then it compares across references to find the best label from all references. Both the single and integrated annotations are reported for diagnostics.
integrated.column("best_label")
## ['Monocytes',
## 'Monocytes',
## 'Monocytes',
## 'CD8+ T-cells',
## 'CD4+ T-cells',
## 'CD8+ T-cells',
## 'Monocytes',
## 'Monocytes',
## ...
## ]
integrated.column("best_reference")
## ['Blueprint',
## 'Blueprint',
## 'Blueprint',
## 'Blueprint',
## 'Blueprint',
## 'Blueprint',
## 'Blueprint',
## ...
##]
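The two-stage logic described above (a best label per reference, then a comparison across references) can be pictured with a deliberately simplified sketch. It assumes that each reference's best label comes with a score that is directly comparable across references, which is a simplification made for illustration and may not reflect how the package actually scores labels during integration. The helper `pick_best_reference`, the reference names, and the scores are hypothetical.

```python
def pick_best_reference(per_reference):
    """Pick the winning reference for one cell.

    per_reference maps a reference name to a (label, score) pair holding that
    reference's best label for the cell.
    """
    winner = max(per_reference, key=lambda name: per_reference[name][1])
    label, _score = per_reference[winner]
    return winner, label


# Hypothetical per-reference calls for a single cell.
cell_calls = {
    "Blueprint": ("Monocytes", 0.47),
    "DICE": ("Monocytes", 0.41),
}

best_reference, best_label_for_cell = pick_best_reference(cell_calls)
print(best_reference, best_label_for_cell)
```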
Developer notes
Build the shared object file:
python setup.py build_ext --inplace
For quick testing:
pytest
For more complex testing:
python setup.py build_ext --inplace && tox
To rebuild the ctypes bindings with cpptypes:
cpptypes src/singler/lib --py src/singler/_cpphelpers.py --cpp src/singler/lib/bindings.cpp --dll _core