An alignment-free deep-learning model trained to classify human gut bacteria

Project description

Xlassify

Fast and accurate taxonomic classification of bacterial genomes is a key step in human gut microbiome analysis. Here we propose Xlassify, an alignment-free deep-learning model trained specifically to classify human gut bacteria.

Xlassify achieved 98% accuracy on the UHGG genome dataset and ~90% accuracy on an independent test set of 76 gut bacterial genomes isolated from healthy Chinese individuals. Unlike alignment-based methods such as GTDB-Tk, Xlassify requires less than 4 GB of memory and reaches a speed of thirty seconds per genome on a single CPU.
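To illustrate what "alignment-free" can mean in practice, here is a minimal sketch of turning a DNA sequence into a fixed-length k-mer frequency vector. This is an assumption about the general technique, not Xlassify's actual feature code: the function name `kmer_vector` is hypothetical, and only the default k = 7 is taken from the `-k` option documented under Usage.

```python
from collections import Counter
from itertools import product
from typing import List

ALPHABET = "ACGT"


def kmer_vector(sequence: str, k: int = 7) -> List[float]:
    """Return normalized k-mer frequencies in lexicographic k-mer order.

    Hypothetical helper for illustration only; not part of Xlassify.
    Windows containing characters other than A, C, G, T (e.g. N) are ignored.
    """
    seq = sequence.upper()
    # Count every length-k window of the sequence.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Enumerate all 4**k possible k-mers so the vector has a fixed length.
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]
```

With k = 7 the vector has 4^7 = 16384 entries regardless of genome length, which is what lets a fixed-input model consume genomes of different sizes without aligning them to references.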

Architecture

16S model: [figure: 16s_model]

genome model: [figure: genome_model]

Installation

Xlassify can be installed locally in three ways: via pip, conda, or Docker.

From pip:

pip install Xlassify

From conda:

conda install -c ai4drug Xlassify

From Docker:

docker pull SenseTime-Knowledge-Mining/Xlassify

Usage

usage: xlassify [-h] [-m MODEL_NAME] [-i INPUT_PATH]
                [-f INPUT_FILE_LST [INPUT_FILE_LST ...]] [-s SAVE_PATH]
                [-r SAVE_FILE] [--save_kmer SAVE_KMER] [-b BATCH] [-k K]
                [--nproc NPROC]

optional arguments:
  -h, --help            show this help message and exit
  -m MODEL_NAME, --model_name MODEL_NAME
                        Choose a model from {compute_kmer, species_genome,
                        genus_full, species_full}. Default: species_genome
  -i INPUT_PATH, --input_path INPUT_PATH
                        The path of input fasta file. Using testing data as
                        default.
  -f INPUT_FILE_LST [INPUT_FILE_LST ...], --input_file_lst INPUT_FILE_LST [INPUT_FILE_LST ...]
                        The list of input file.
  -s SAVE_PATH, --save_path SAVE_PATH
                        The path of save file. Default: ./Xlassify_results
  -r SAVE_FILE, --save_file SAVE_FILE
                        The path of results file. Default: res.csv
  --save_kmer SAVE_KMER
                        Save kmer or not {0,1}. Default: 1
  -b BATCH, --batch BATCH
                        The batch of prediction.
  -k K                  The k of kmer. Default: 7
  --nproc NPROC         The number of CPUs to use. Default: 1
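
As a sketch of how the options above fit together, the following Python helper assembles an `xlassify` command line from the documented flags and their documented defaults. The helper name `build_xlassify_command` and the input file name `my_genome.fasta` are hypothetical; the flags and the model names are taken from the help text above. The command is only built and printed here, not executed.

```python
import shlex
from typing import List

# Model names as listed in the help text for -m/--model_name.
MODELS = {"compute_kmer", "species_genome", "genus_full", "species_full"}


def build_xlassify_command(
    input_path: str,
    model_name: str = "species_genome",
    save_path: str = "./Xlassify_results",
    save_file: str = "res.csv",
    k: int = 7,
    nproc: int = 1,
) -> List[str]:
    """Build an argument list for the xlassify CLI (hypothetical helper)."""
    if model_name not in MODELS:
        raise ValueError(f"unknown model: {model_name!r}")
    return [
        "xlassify",
        "-m", model_name,
        "-i", input_path,
        "-s", save_path,
        "-r", save_file,
        "-k", str(k),
        "--nproc", str(nproc),
    ]


if __name__ == "__main__":
    # my_genome.fasta is a placeholder input file name.
    cmd = build_xlassify_command("my_genome.fasta", nproc=4)
    print(shlex.join(cmd))
```

Passing the resulting list to `subprocess.run(cmd, check=True)` would run the classification, assuming Xlassify is installed and `xlassify` is on the PATH.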

