An alignment-free deep-learning model trained to classify human gut bacteria
Xlassify
Fast and accurate taxonomic classification of bacterial genomes is a key step in human gut microbiome analysis. Here we propose Xlassify, an alignment-free deep-learning model trained specifically to classify human gut bacteria.
Xlassify achieved 98% accuracy on the UHGG genome dataset and ~90% accuracy on an independent test set of 76 gut bacterial genomes isolated from healthy Chinese individuals. It is also lighter than alignment-based methods such as GTDB-Tk: Xlassify requires less than 4 GB of memory and classifies a genome in about thirty seconds on a single CPU.
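As a rough illustration of what "alignment-free" can mean in practice, the sketch below turns a DNA sequence into a fixed-length k-mer frequency vector, using k = 7 to match the default of the `-k` option. This is not Xlassify's actual feature code: the function name `kmer_frequency_vector`, the handling of ambiguous bases, and the lack of reverse-complement merging are all assumptions made for the example.

```python
from collections import Counter
from itertools import product


def kmer_frequency_vector(sequence, k=7):
    """Return normalized k-mer frequencies as a list of length 4**k.

    Hypothetical helper for illustration only; Xlassify's own k-mer
    computation may differ (e.g. in how it treats reverse complements).
    """
    seq = sequence.upper()
    valid = set("ACGT")
    counts = Counter()
    # Slide a window of width k over the sequence.
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Skip windows containing N or other ambiguity codes.
        if set(kmer) <= valid:
            counts[kmer] += 1
    total = sum(counts.values())
    # Enumerate all 4**k k-mers in a fixed lexicographic order so every
    # genome maps to a vector of the same length.
    all_kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return [counts[km] / total if total else 0.0 for km in all_kmers]
```

With the default k = 7 the vector has 4^7 = 16384 entries regardless of genome length, which is what lets a model consume genomes without aligning them to references.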
Architecture
16S model: (figure not available)
genome model: (figure not available)
Installation
Xlassify can be installed locally in three ways: via pip, conda, or Docker.
From pip:
pip install Xlassify
From conda:
conda install -c ai4drug Xlassify
From Docker:
docker pull SenseTime-Knowledge-Mining/Xlassify
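After a pip or conda install, a quick check from Python can confirm that the package and its command-line entry point are visible. This is a minimal sketch using only the standard library; the helper names are hypothetical, and the command name `xlassify` is taken from the usage text in this README. It does not apply to the Docker image.

```python
import shutil
from importlib import metadata


def installed_version(dist_name="Xlassify"):
    """Return the installed version string for dist_name, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def cli_on_path(command="xlassify"):
    """Return True if the command can be found on the current PATH."""
    return shutil.which(command) is not None


print("Xlassify version:", installed_version())
print("xlassify on PATH:", cli_on_path())
```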
Usage
usage: xlassify [-h] [-m MODEL_NAME] [-i INPUT_PATH]
                [-f INPUT_FILE_LST [INPUT_FILE_LST ...]] [-s SAVE_PATH]
                [-r SAVE_FILE] [--save_kmer SAVE_KMER] [-b BATCH] [-k K]
                [--nproc NPROC]

optional arguments:
  -h, --help            show this help message and exit
  -m MODEL_NAME, --model_name MODEL_NAME
                        Choose a model from {compute_kmer, species_genome,
                        genus_full, species_full}. Default: species_genome
  -i INPUT_PATH, --input_path INPUT_PATH
                        The path of input fasta file. Using testing data as
                        default.
  -f INPUT_FILE_LST [INPUT_FILE_LST ...], --input_file_lst INPUT_FILE_LST [INPUT_FILE_LST ...]
                        The list of input file.
  -s SAVE_PATH, --save_path SAVE_PATH
                        The path of save file. Default: ./Xlassify_results
  -r SAVE_FILE, --save_file SAVE_FILE
                        The path of results file. Default: res.csv
  --save_kmer SAVE_KMER
                        Save kmer or not {0,1}. Default: 1
  -b BATCH, --batch BATCH
                        The batch of prediction.
  -k K                  The k of kmer. Default: 7
  --nproc NPROC         The number of CPUs to use. Default: 1
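
The options above can also be driven from a script. The following is a minimal sketch that assembles an `xlassify` command line using only the flags and defaults listed in the help text, then runs it with `subprocess`. The function names and the example input path are hypothetical, and it assumes the `xlassify` executable is on `PATH`.

```python
import subprocess

# Model names as listed in the help text for -m/--model_name.
KNOWN_MODELS = {"compute_kmer", "species_genome", "genus_full", "species_full"}


def build_xlassify_command(input_path, save_path="./Xlassify_results",
                           save_file="res.csv", model_name="species_genome",
                           k=7, nproc=1):
    """Assemble an argument list from flags documented in the help text."""
    if model_name not in KNOWN_MODELS:
        raise ValueError(f"unknown model: {model_name!r}")
    return [
        "xlassify",
        "-m", model_name,
        "-i", str(input_path),
        "-s", str(save_path),
        "-r", save_file,
        "-k", str(k),
        "--nproc", str(nproc),
    ]


def run_xlassify(input_path, **options):
    """Run xlassify and raise CalledProcessError on a non-zero exit code."""
    cmd = build_xlassify_command(input_path, **options)
    return subprocess.run(cmd, check=True)
```

For example, `run_xlassify("genomes/sample.fasta", nproc=4)` would classify a (hypothetical) FASTA file with the default `species_genome` model on four CPUs. Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.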