Ensemble Multiple References for Single-cell RNA Sequencing Data Annotation and Unseen Cells Identification
Project description
Ensemble of multiple references for single-cell RNA sequencing data annotation and unseen cell-type identification
mtANN is a novel cell-type annotation framework that integrates ensemble learning and deep learning simultaneously, to annotate cells in a new query dataset with the help of multiple well-labeled reference datasets. It takes multiple well-labeled reference datasets and a query dataset that needs to be annotated as input. It begins with generating a series of subsets for each reference dataset by adopting various gene selection methods. Next, for each reference subset, a base classification model is trained based on neural networks. Then, mtANN annotates the cells in the query dataset by integrating the prediction results from all the base classification models. Finally, it identifies cells that may belong to cell types not observed in the reference datasets according to the uncertainty of the predictions.
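The last two steps (integrating base-model predictions, then flagging uncertain cells) can be illustrated with a minimal sketch. This is not the mtANN implementation: the function name `ensemble_annotate`, the use of averaged class probabilities, normalized entropy as the uncertainty measure, and the fixed `threshold` are all assumptions made for illustration, and the sketch assumes every base model outputs probabilities over the same ordered set of cell types.

```python
import numpy as np

def ensemble_annotate(prob_list, cell_types, threshold=0.5):
    """Combine base-model predictions and flag uncertain cells.

    prob_list  : list of (n_cells, n_types) arrays of class probabilities,
                 one per base model (hypothetical input format).
    cell_types : sequence of n_types cell-type names, in column order.
    threshold  : assumed cut-off on normalized entropy, in [0, 1].
    """
    # Average the probabilities over all base models.
    probs = np.mean(np.stack(prob_list, axis=0), axis=0)

    # Ensemble label: the cell type with the highest averaged probability.
    labels = np.asarray(cell_types)[probs.argmax(axis=1)]

    # Normalized Shannon entropy: 0 = fully confident, 1 = uniform.
    eps = 1e-12  # avoids log(0)
    entropy = -(probs * np.log(probs + eps)).sum(axis=1) / np.log(probs.shape[1])

    # Cells with high uncertainty are candidates for unseen cell types.
    unseen = entropy > threshold
    return labels, entropy, unseen
```

In this sketch a cell whose averaged probabilities are spread almost evenly across cell types receives an entropy close to 1 and is flagged, while a cell that all base models assign confidently to one type is not.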
System Requirements
Python support packages: pandas, numpy, scanpy, scipy, scikit-learn, torch, giniclust3, rpy2
R support packages: limma, Seurat, parallel
Versions the software has been tested on
Environment 1
- System: Ubuntu 18.04.5
- Python: 3.8.8
- Python packages: pandas = 1.2.3, numpy = 1.19.2, scanpy = 1.9.0, scipy = 1.6.1, scikit-learn = 0.24.1, torch = 1.9.1, giniclust3 = 1.1.0, rpy2 = 3.5.2
- R: 3.6.1
- R packages: limma = 3.42.2, Seurat = 3.1.1, parallel = 3.6.1
Environment 2
- System: Windows 10
- Python: 3.7.6
- Python packages: pandas = 1.3.5, numpy = 1.21.6, scanpy = 1.9.3, scipy = 1.7.3, scikit-learn = 1.0.2, torch = 1.13.0, giniclust3 = 1.1.2, rpy2 = 3.5.11
- R: 4.1.2
- R packages: limma = 3.50.3, Seurat = 4.2.0, parallel = 4.1.2
Installation
pip install mtANN==1.0
To use mtANN successfully, make sure that R is correctly installed and added to the environment variables. Additionally, add a new user variable named R_USER that points to the installation path of the Python package rpy2.
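As an alternative to editing the system settings, the variable can be set for the current Python process only. The sketch below is an assumption about one convenient way to do this, not part of mtANN: the helper names `rpy2_install_path` and `set_r_user` are hypothetical, and the variable should be set before rpy2 (or mtANN) is imported.

```python
import importlib.util
import os

def rpy2_install_path():
    """Return the directory of the installed rpy2 package, or None if absent."""
    spec = importlib.util.find_spec("rpy2")
    if spec is None or not spec.submodule_search_locations:
        return None
    return list(spec.submodule_search_locations)[0]

def set_r_user(path=None):
    """Set R_USER for this process (and its child processes) only."""
    path = path or rpy2_install_path()
    if path is None:
        raise RuntimeError("rpy2 is not installed; cannot determine R_USER")
    os.environ["R_USER"] = path
    return path

# Usage (hypothetical): call set_r_user() at the top of a script,
# before importing rpy2 or mtANN.
```

A value set this way does not persist after the process exits, so a permanent user variable is still needed for other tools.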
Usage
The mtANN repository includes the mtANN code files in the mtANN folder and a usage example, example, which shows the format of the input data and how to call the main function. The data used in the example can be downloaded at https://doi.org/10.5281/zenodo.7922657.
The current version of mtANN expects input data in csv format, where rows are samples (cells) and columns are features (genes). The cell type information for each dataset is stored in a separate csv file, named with the dataset name followed by _label.
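The naming convention above can be sketched with pandas. The helper `load_dataset` is hypothetical and not part of mtANN. It assumes the files are called `<name>.csv` and `<name>_label.csv`, that both sit in the same directory, and that the first column holds cell identifiers. Check the example in the repository for the exact layout.

```python
from pathlib import Path

import pandas as pd

def load_dataset(data_dir, name, index_col=0):
    """Load an expression matrix and its cell-type labels.

    Assumes <name>.csv has cells in rows and genes in columns, and that
    <name>_label.csv has one row per cell in the same order.
    """
    data_dir = Path(data_dir)
    expr = pd.read_csv(data_dir / f"{name}.csv", index_col=index_col)
    labels = pd.read_csv(data_dir / f"{name}_label.csv", index_col=index_col)

    # Every cell in the expression matrix needs exactly one label row.
    if len(expr) != len(labels):
        raise ValueError(
            f"{name}: {len(expr)} cells but {len(labels)} label rows"
        )
    return expr, labels
```

For a reference dataset stored as `ref1.csv` and `ref1_label.csv` (names chosen for illustration), `load_dataset("data", "ref1")` would return the expression matrix and its labels as two DataFrames.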
Contact
For questions about the contents or operation of the archive, please contact Miss Yi-Xuan Xiong (xyxuana@mails.ccnu.edu.cn) or Dr. Xiao-Fei Zhang (zhangxf@ccnu.edu.cn).
Download files
File details
Details for the file mtANN-1.0.tar.gz.
File metadata
- Download URL: mtANN-1.0.tar.gz
- Upload date:
- Size: 7.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.7.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | d073b0da059609576fb74e8d0b94bc0ed2dc1d035f68b8908b5380fd30de2936
MD5 | 97fea533b240eb58be49ffa842d4f606
BLAKE2b-256 | a8530cb625b6111202c9964e6ece9b7a8da3c71a78f1a6dad44f14ea7e579a35
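A downloaded file can be checked against the SHA256 digest listed above using only the Python standard library. This is a minimal sketch: the helper names `sha256_of` and `matches_published_hash` are hypothetical, and the local file path is whatever location the archive was saved to.

```python
import hashlib

# SHA256 digest published above for mtANN-1.0.tar.gz.
EXPECTED_SHA256 = "d073b0da059609576fb74e8d0b94bc0ed2dc1d035f68b8908b5380fd30de2936"

def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_published_hash(path, expected=EXPECTED_SHA256):
    """True if the file's SHA256 digest equals the expected one."""
    return sha256_of(path) == expected.lower()
```

Calling `matches_published_hash("mtANN-1.0.tar.gz")` on the downloaded source archive should return True if the file is intact.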
File details
Details for the file mtANN-1.0-py3-none-any.whl.
File metadata
- Download URL: mtANN-1.0-py3-none-any.whl
- Upload date:
- Size: 7.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.7.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | 1717fc2a693b3992b43a2114056c39ccaace7ef6bce4e154e828adc166eafb59
MD5 | 187fc3b61f4b177cdc368710075c2a33
BLAKE2b-256 | 57e0468f973e360a81aaaf8e53ccb97ec05a270f0bd1c5bec7ad6eab52dd9324