
Tool to monitor and characterize pathogens using Bloom filters.


XspecT - Acinetobacter Species Assignment Tool

XspecT is a Python-based tool to taxonomically classify sequence reads (or assembled genomes) at the species and/or sub-type level using [Bloom filters](https://en.wikipedia.org/wiki/Bloom_filter) and a [Support Vector Machine](https://en.wikipedia.org/wiki/Support-vector_machine). It also identifies existing [blaOxa genes](https://en.wikipedia.org/wiki/Beta-lactamase#OXA_beta-lactamases_(class_D)) and provides a list of relevant research papers for further information.

XspecT exploits the uniqueness of k-mers: it extracts k-mers from the input data and compares them against a reference database. Bloom filters make this lookup fast. For the final prediction, the results are classified using a Support Vector Machine.
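To illustrate the lookup idea, here is a minimal sketch of a Bloom filter queried with k-mers. It is not XspecT's implementation: the class, the function names, the hashing scheme, and the toy parameters (`K`, `num_bits`, `num_hashes`) are all assumptions chosen for readability, and the Support Vector Machine step is not shown.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter backed by a bytearray (illustrative only)."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Double hashing: derive all bit positions from two halves of one digest.
        digest = hashlib.sha256(item.encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # May return a false positive, never a false negative.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def kmers(sequence: str, k: int):
    """Yield every overlapping k-mer of the sequence."""
    seq = sequence.upper()
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def hit_fraction(sequence: str, bloom: BloomFilter, k: int) -> float:
    """Fraction of the sequence's k-mers that the filter reports as present."""
    total = hits = 0
    for kmer in kmers(sequence, k):
        total += 1
        hits += kmer in bloom
    return hits / total if total else 0.0


# Toy reference and query; real genomes and k values are far larger.
K = 5
reference = "ATGCGTACGTTAGCCGATAGCTAGGCTA"
bloom = BloomFilter(num_bits=4096, num_hashes=3)
for kmer in kmers(reference, K):
    bloom.add(kmer)

# Every k-mer of this query was inserted, so the fraction is 1.0.
print(hit_fraction(reference[3:20], bloom, K))
```

In a setup like this, one filter per reference species would yield one hit fraction per species, and a vector of such fractions is the kind of feature a classifier could take as input.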

Local extensions of the reference database are supported.

The tool is available as a web-based application and a smaller command line interface.

Installation

To install XspecT, please download the latest 64-bit Python version and install the package using pip:

pip install xspect

If you would like to train filters yourself, you need to install Jellyfish, which is used to count distinct k-mers in the assemblies. It can be installed using bioconda:

conda install -c bioconda jellyfish

On Apple Silicon, it is possible that this command installs an incorrect Jellyfish package. Please refer to the official Jellyfish project for installation guidance.
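For readers unfamiliar with k-mer counting, the following sketch shows what "counting distinct k-mers" means on a toy sequence. It is a plain-Python illustration, not Jellyfish and not XspecT's training code. Collapsing each k-mer with its reverse complement (the "canonical" form) is an assumption made here for illustration; the function names are hypothetical.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer: str) -> str:
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def count_kmers(sequence: str, k: int) -> Counter:
    """Count canonical k-mers, skipping k-mers that contain ambiguous bases."""
    seq = sequence.upper()
    counts: Counter = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[canonical(kmer)] += 1
    return counts


counts = count_kmers("ACGTACGT", 4)
print(len(counts))  # number of distinct canonical 4-mers: 3
```

A dedicated counter such as Jellyfish does the same job far more efficiently on whole assemblies.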

Usage

Get the Bloom filters

To download basic pre-trained filters, you can use the built-in command:

xspect download-filters

Additional species filters can be trained using:

xspect train your-ncbi-genus-name

How to run the web app

Run the following command in a console. A browser window opens automatically once the application has fully loaded.

xspect web

How to use the XspecT command line interface

Pass the desired configuration to xspect as arguments:

xspect classify your-genus path/to/your/input-set

For further instructions on how to use the command line interface, execute:

xspect --help

Input Data

XspecT accepts either raw sequence reads (FASTQ format, .fq/.fastq) or already assembled genomes (FASTA format, .fasta/.fna). Using sequence reads skips the assembly step, but high-quality reads with a low error rate are needed (e.g. Illumina reads).
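As a rough sketch of how a script could tell the two input types apart before calling XspecT, the helper below maps the file extensions listed above to an input type. The function name and the returned labels are hypothetical and not part of XspecT.

```python
from pathlib import Path

# Extensions taken from the description above.
FASTQ_EXTENSIONS = {".fq", ".fastq"}
FASTA_EXTENSIONS = {".fasta", ".fna"}


def detect_input_type(path: str) -> str:
    """Return 'reads' for FASTQ files and 'assembly' for FASTA files."""
    suffix = Path(path).suffix.lower()
    if suffix in FASTQ_EXTENSIONS:
        return "reads"
    if suffix in FASTA_EXTENSIONS:
        return "assembly"
    raise ValueError(f"Unsupported file extension: {suffix!r}")


print(detect_input_type("sample_R1.fastq"))  # reads
print(detect_input_type("genome.fna"))       # assembly
```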

When using sequence reads, the user has to set the number of reads to use. The minimum is 5000 reads for species classification and 500 reads for sub-type classification. The maximum number of reads is limited by the browser and is usually around 8 million. Using more reads increases the runtime (x sec. per 1 million reads).
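The minimum read counts can be checked before a run. The sketch below counts records in a FASTQ stream and compares the result to the thresholds stated above. It assumes the common four-lines-per-record FASTQ layout, and the names `MIN_READS`, `count_fastq_reads`, and `enough_reads` are hypothetical helpers, not XspecT functions.

```python
import io

# Minimum read counts as stated above.
MIN_READS = {"species": 5000, "sub-type": 500}


def count_fastq_reads(handle) -> int:
    """Count records, assuming each record spans exactly four lines."""
    return sum(1 for _ in handle) // 4


def enough_reads(num_reads: int, level: str) -> bool:
    """Check a read count against the minimum for the classification level."""
    return num_reads >= MIN_READS[level]


# Tiny in-memory FASTQ with two records, standing in for a real file.
fastq = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n")
n = count_fastq_reads(fastq)
print(n, enough_reads(n, "species"))  # 2 False
```

For a real file, pass an open file handle instead of the `io.StringIO` object.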
