A tool for classifying metagenomic data
Tiara
A deep-learning-based approach for identifying eukaryotic sequences in metagenomic data, powered by PyTorch.
The sequences are classified in two stages:
- In the first stage, the sequences are assigned to one of six classes: archaea, bacteria, prokarya, eukarya, organelle or unknown.
- In the second stage, the sequences labeled as organelle in the first stage are classified as mitochondria, plastid or unknown.
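The two-stage flow above can be sketched in a few lines of Python. This is an illustration only, not Tiara's implementation: the function names are hypothetical, the class probabilities are passed in as plain dictionaries, and the rule that a sequence becomes unknown when its best probability falls below a cutoff is an assumption made for the sketch.

```python
from typing import Dict, Optional, Tuple


def pick_label(probs: Dict[str, float], cutoff: float) -> str:
    """Return the most probable class, or 'unknown' if it is below the cutoff.

    Assumption for this sketch: thresholding on the top probability.
    """
    label, prob = max(probs.items(), key=lambda item: item[1])
    return label if prob >= cutoff else "unknown"


def classify_two_stage(
    first_stage: Dict[str, float],
    second_stage: Dict[str, float],
    cutoffs: Tuple[float, float],
) -> Tuple[str, Optional[str]]:
    """Apply stage one, then stage two only to sequences labeled organelle."""
    first_label = pick_label(first_stage, cutoffs[0])
    if first_label != "organelle":
        # Non-organelle sequences never reach the second stage.
        return first_label, None
    return first_label, pick_label(second_stage, cutoffs[1])
```

For example, `classify_two_stage({"organelle": 0.8, "bacteria": 0.2}, {"mitochondria": 0.7, "plastid": 0.3}, (0.65, 0.6))` yields an organelle label refined to mitochondria.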
Requirements
Python >= 3.7
numpy, biopython, torch, skorch, tqdm, joblib, numba
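Before installing, it can help to check which of the requirements are already available in the current environment. The sketch below uses only the standard library; the helper names are hypothetical, and the import names are the usual ones for these packages (biopython is imported as `Bio`).

```python
import sys
from importlib.util import find_spec
from typing import Iterable, List, Tuple

# Import names of the packages listed above (biopython installs the "Bio" module).
REQUIRED_MODULES = ["numpy", "Bio", "torch", "skorch", "tqdm", "joblib", "numba"]


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


def python_ok(min_version: Tuple[int, int] = (3, 7)) -> bool:
    """Check that the running interpreter meets the minimum version."""
    return sys.version_info >= min_version


if __name__ == "__main__":
    print("Python version OK:", python_ok())
    print("Missing modules:", missing_modules(REQUIRED_MODULES))
```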
Installation
More detailed installation instructions can be found here.
Using pip
Run pip install tiara, preferably in a fresh environment.
Using setup.py
Latest stable release
- Download latest release from https://github.com/ibe-uw/tiara/releases.
- Unzip/untar the archive.
- Go to the directory.
- Run python setup.py install.
Latest developer version
git clone https://github.com/ibe-uw/tiara.git
cd tiara
python setup.py install
Testing the installation
After the installation, run tiara-test to see if the installation was successful.
Usage
Basic usage:
tiara -i sample_input.fasta -o out.txt
The sequences in the FASTA file should be at least 3000 bases long (the default minimum length). We do not recommend classifying sequences that are shorter than 1000 base pairs.
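To see in advance how many sequences fall below the length threshold, the input can be scanned with a small script. This is a minimal sketch using a hand-written FASTA reader from the standard library; the function names are hypothetical and it is independent of how Tiara itself handles short sequences.

```python
from typing import Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_by_length(
    records: Iterable[Tuple[str, str]], min_len: int = 3000
) -> Tuple[List[str], List[str]]:
    """Return headers of sequences that meet min_len and of those that do not."""
    kept, short = [], []
    for header, seq in records:
        (kept if len(seq) >= min_len else short).append(header)
    return kept, short
```

With an open file handle for sample_input.fasta, `split_by_length(read_fasta(handle))` returns the two lists of sequence headers.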
It creates two files:
- out.txt, a tab-separated file with the header: sequence id, first stage classification result, second stage classification result.
- log_out.txt, containing model parameters and classification summary.
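Because out.txt is tab-separated, it can be summarized with the standard csv module. The sketch below counts first-stage labels by column position rather than by column name, since the exact header strings are not given here; the function name is hypothetical.

```python
import csv
from collections import Counter
from typing import Iterable


def count_first_stage(lines: Iterable[str]) -> Counter:
    """Count first-stage labels in tab-separated output lines.

    Assumes the first line is a header and the second column holds the
    first-stage classification result.
    """
    reader = csv.reader(lines, delimiter="\t")
    next(reader, None)  # skip the header row
    counts: Counter = Counter()
    for row in reader:
        if len(row) >= 2:
            counts[row[1]] += 1
    return counts
```

Passing an open file handle for out.txt to `count_first_stage` gives a per-class tally that can be compared with the summary in log_out.txt.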
Advanced:
tiara -i sample_input.fasta -o out.txt --tf mit pla pro -t 4 -p 0.65 0.60 --probabilities
In addition to creating the files above, it creates, in the folder where tiara is run, three files containing sequences from sample_input.fasta classified as mitochondria, plastid and prokarya (the --tf mit pla pro option). The number of threads is set to 4 (-t 4), and the probability cutoffs in the first and second stage of classification are set to 0.65 and 0.6, respectively (-p 0.65 0.60). The probabilities of belonging to individual classes are also written to out.txt, thanks to the --probabilities option.
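When Tiara is called from a pipeline, the same command line can be assembled programmatically. The helper below is a hypothetical sketch that only builds the argument list from the options shown above (-i, -o, --tf, -t, -p, --probabilities); it does not run anything by itself.

```python
from typing import List, Optional, Sequence


def build_tiara_command(
    input_fasta: str,
    output: str,
    to_fasta: Sequence[str] = (),
    threads: Optional[int] = None,
    prob_cutoffs: Sequence[float] = (),
    probabilities: bool = False,
) -> List[str]:
    """Build a tiara argument list using only the options from the example."""
    cmd = ["tiara", "-i", input_fasta, "-o", output]
    if to_fasta:
        cmd += ["--tf", *to_fasta]  # classes to write out as FASTA files
    if threads is not None:
        cmd += ["-t", str(threads)]
    if prob_cutoffs:
        cmd += ["-p", *[str(p) for p in prob_cutoffs]]
    if probabilities:
        cmd.append("--probabilities")
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`, which avoids shell quoting issues with file names.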
For more usage examples, go here.
Citation
License
Tiara is released under the open-source MIT license.
Download files
File details
Details for the file tiara-1.0.1.tar.gz.
File metadata
- Download URL: tiara-1.0.1.tar.gz
- Upload date:
- Size: 102.4 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.3.0 pkginfo/1.7.0 requests/2.25.1 setuptools/50.3.1.post20201107 requests-toolbelt/0.9.1 tqdm/4.54.0 CPython/3.8.5
File hashes
Algorithm | Hash digest
---|---
SHA256 | f2ad90a8f4b09e0310e24999fdc7c921c43a3602f921d16095b4c654deced74f
MD5 | 35d73812af6064deff9cc7ac94b4602e
BLAKE2b-256 | a0a3aa6e3ead76459402c1f9449776250615c5178066603f52a92b165bae95ae
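A downloaded archive can be checked against the published SHA256 digest with the standard hashlib module. This is a minimal sketch with hypothetical helper names; the file is read in chunks because the distributions are roughly 100 MB.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For the source distribution, `matches_sha256("tiara-1.0.1.tar.gz", "f2ad90a8f4b09e0310e24999fdc7c921c43a3602f921d16095b4c654deced74f")` should return True for an intact download.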
File details
Details for the file tiara-1.0.1-py3.9.egg.
File metadata
- Download URL: tiara-1.0.1-py3.9.egg
- Upload date:
- Size: 102.7 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/3.9.0 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.54.1 CPython/3.9.2
File hashes
Algorithm | Hash digest
---|---
SHA256 | 6134ebe89eea213df68c2f3b6473d540c96d31d47b99e986d8dc69d620c37639
MD5 | e011e885d76e471be0b1510fbf136ac0
BLAKE2b-256 | 2228f0fc176eee2cf4c41a44deaa97c4b1ef11c0df9af77d648dfdb88a2817ef
File details
Details for the file tiara-1.0.1-py3-none-any.whl.
File metadata
- Download URL: tiara-1.0.1-py3-none-any.whl
- Upload date:
- Size: 102.7 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.3.0 pkginfo/1.7.0 requests/2.25.1 setuptools/50.3.1.post20201107 requests-toolbelt/0.9.1 tqdm/4.54.0 CPython/3.8.5
File hashes
Algorithm | Hash digest
---|---
SHA256 | 6f922790d01fa31392c012ada8fc615e7c8559b8e57fce7d58849579c2415d57
MD5 | 339381c4464ee6e92eb5abaea5a21bc8
BLAKE2b-256 | 2e151668ec53a767308a45138accf8dcfe63a35fb65c477d2daefbaed104819c