A tool for classifying metagenomic data

Tiara

A deep-learning-based approach, powered by PyTorch, for identifying eukaryotic sequences in metagenomic data.

The sequences are classified in two stages:

  • In the first stage, each sequence is assigned to one of the classes: archaea, bacteria, prokarya, eukarya, organelle or unknown.
  • In the second stage, the sequences labeled as organelle in the first stage are classified as mitochondria, plastid or unknown.
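
The control flow of the two stages can be sketched as follows. This is only an illustration of the routing described above, not Tiara's implementation: the function classify_two_stage and the toy classifiers are hypothetical stand-ins for the real neural network models.

```python
from typing import Callable, Optional, Tuple

# Class names taken from the description above.
FIRST_STAGE_CLASSES = ("archaea", "bacteria", "prokarya", "eukarya", "organelle", "unknown")
SECOND_STAGE_CLASSES = ("mitochondria", "plastid", "unknown")


def classify_two_stage(
    sequence: str,
    first_stage: Callable[[str], str],
    second_stage: Callable[[str], str],
) -> Tuple[str, Optional[str]]:
    """Return (first_label, second_label).

    second_label is None unless the first stage says "organelle".
    Both classifier arguments are hypothetical callables.
    """
    first_label = first_stage(sequence)
    if first_label not in FIRST_STAGE_CLASSES:
        raise ValueError(f"unexpected first-stage label: {first_label}")
    if first_label != "organelle":
        # Only organelle sequences are passed on to the second stage.
        return first_label, None
    second_label = second_stage(sequence)
    if second_label not in SECOND_STAGE_CLASSES:
        raise ValueError(f"unexpected second-stage label: {second_label}")
    return first_label, second_label


# Toy stand-ins so the sketch runs; they have no biological meaning.
def toy_first_stage(sequence: str) -> str:
    return "organelle" if sequence.startswith("AT") else "bacteria"


def toy_second_stage(sequence: str) -> str:
    return "plastid"


print(classify_two_stage("ATGC", toy_first_stage, toy_second_stage))
print(classify_two_stage("GGCC", toy_first_stage, toy_second_stage))
```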

Requirements

  • Python >= 3.7
  • numpy, biopython, torch, skorch, tqdm, joblib, numba

Installation

More detailed installation instructions can be found here.

Using pip

Run pip install tiara, preferably in a fresh environment.

Using setup.py

Latest stable release
Latest developer version
git clone https://github.com/ibe-uw/tiara.git
cd tiara
python setup.py install

Testing the installation

After installing, run tiara-test to check that the installation was successful.

Usage

Basic usage:

tiara -i sample_input.fasta -o out.txt

The sequences in the FASTA file should be at least 3000 bases long (the default minimum length). We do not recommend classifying sequences shorter than 1000 base pairs.
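
To see in advance which records fall below the length threshold, a small pre-check along these lines may help. It is a minimal sketch in plain Python and is not part of Tiara: read_fasta and split_by_length are hypothetical helper names, and only the 3000-base default comes from the text above.

```python
from typing import Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence data may span several lines.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_by_length(
    records: Iterable[Tuple[str, str]], min_len: int = 3000
) -> Tuple[List[str], List[str]]:
    """Return headers of records meeting min_len and of those below it."""
    kept: List[str] = []
    too_short: List[str] = []
    for header, sequence in records:
        if len(sequence) >= min_len:
            kept.append(header)
        else:
            too_short.append(header)
    return kept, too_short


# Made-up records: one of exactly 3000 bases, one of 400 bases.
example = [">long", "A" * 3000, ">short", "ACGT" * 100]
kept, too_short = split_by_length(read_fasta(example))
print("long enough:", kept)
print("too short:", too_short)
```

For a real file, an open file handle can be passed to read_fasta in place of the example list.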

The basic command creates two files:

  • out.txt, a tab-separated file with header sequence id, first stage classification result, second stage classification result.
  • log_out.txt, containing model parameters and classification summary.
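
Because out.txt is tab-separated, it can be summarized with the standard csv module. The sketch below counts first-stage labels and reads columns by position, since the exact header names are not reproduced here. The function summarize_first_stage is a hypothetical helper, and the sample rows, including their header names and the empty second-stage fields, are made up for illustration.

```python
import csv
from collections import Counter
from typing import Iterable


def summarize_first_stage(lines: Iterable[str]) -> Counter:
    """Count first-stage labels in a tab-separated result file.

    Assumes one header line, then rows whose second column holds the
    first-stage classification result.
    """
    reader = csv.reader(lines, delimiter="\t")
    next(reader, None)  # skip the header line
    counts: Counter = Counter()
    for row in reader:
        if len(row) < 2:
            continue  # ignore blank or malformed rows
        counts[row[1]] += 1
    return counts


# Made-up sample in the three-column layout described above.
sample = [
    "sequence_id\tfirst_stage\tsecond_stage",
    "contig_1\tbacteria\t",
    "contig_2\torganelle\tplastid",
    "contig_3\tbacteria\t",
]
counts = summarize_first_stage(sample)
for label, n in counts.most_common():
    print(label, n)
```

For a real run, an open file handle for out.txt (opened with newline="") can be passed in place of the sample list.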

Advanced:

tiara -i sample_input.fasta -o out.txt --tf mit pla pro -t 4 -p 0.65 0.60 --probabilities

In addition to the files above, this command creates three files in the directory where tiara is run, containing the sequences from sample_input.fasta classified as mitochondria, plastid and prokarya (the --tf mit pla pro option).

The number of threads is set to 4 (-t 4), and the probability cutoffs in the first and second stage of classification are set to 0.65 and 0.60, respectively (-p 0.65 0.60).

The probabilities of belonging to individual classes are also written to out.txt, thanks to the --probabilities option.
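
When tiara is called from a script, the argument list can be assembled programmatically. The following is a minimal sketch: build_tiara_command is a hypothetical helper, not part of Tiara, and it uses only the options shown in the examples above (-i, -o, --tf, -t, -p, --probabilities).

```python
from typing import List, Optional, Sequence


def build_tiara_command(
    input_fasta: str,
    output_path: str,
    to_fasta: Sequence[str] = (),
    threads: Optional[int] = None,
    prob_cutoffs: Optional[Sequence[float]] = None,
    probabilities: bool = False,
) -> List[str]:
    """Assemble an argument list for the tiara command-line tool."""
    cmd = ["tiara", "-i", input_fasta, "-o", output_path]
    if to_fasta:
        # Classes whose sequences should be written to separate files.
        cmd += ["--tf", *to_fasta]
    if threads is not None:
        cmd += ["-t", str(threads)]
    if prob_cutoffs is not None:
        # First-stage cutoff, then second-stage cutoff.
        cmd += ["-p", *(str(c) for c in prob_cutoffs)]
    if probabilities:
        cmd.append("--probabilities")
    return cmd


cmd = build_tiara_command(
    "sample_input.fasta",
    "out.txt",
    to_fasta=("mit", "pla", "pro"),
    threads=4,
    prob_cutoffs=(0.65, 0.60),
    probabilities=True,
)
print(" ".join(cmd))
```

With tiara installed, the list can be passed to subprocess.run(cmd, check=True). Keeping the arguments as a list rather than a single string avoids shell quoting problems with unusual file names.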

For more usage examples, go here.

Citation

License

Tiara is released under the open-source MIT license.
