A rewrite/refactor of viralVerify for PyPI packaging and distribution, maintainability, and clarity.
NOTE: The BLAST+ search option has been removed. The results output table differs from that of the original viralVerify. The Naive Bayes classifier training script has not been ported yet.
It’s recommended that you use Conda to install the required software (Prodigal and HMMer3) and Python dependencies.
$ conda env create -f environment.yml
Pip
If Prodigal and HMMer3 are already installed and available on your $PATH, and you have Python 3.6 or greater, you can use pip to install viral_verify:
$ pip install viral_verify
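Before running a pip-installed viral_verify, it can help to confirm that the external tools are actually discoverable. The following is a minimal sketch, not part of viral_verify itself. It assumes the executables are named `prodigal` and `hmmsearch` (the latter is the HMMer3 program named in the help text), and the helper names `missing_tools` and `python_ok` are hypothetical.

```python
import shutil
import sys

# Assumed executable names: "prodigal" for Prodigal, "hmmsearch" for HMMer3.
REQUIRED_TOOLS = ("prodigal", "hmmsearch")
MIN_PYTHON = (3, 6)


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names of the tools that cannot be found on $PATH."""
    return [tool for tool in tools if which(tool) is None]


def python_ok(version=None, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= minimum


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on $PATH: " + ", ".join(missing))
    if not python_ok():
        print("Python 3.6 or greater is required")
    if not missing and python_ok():
        print("Prerequisites look OK")
```

If either tool is reported as missing, the Conda environment described above is likely the simpler route.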
Usage
$ viral_verify --help
Usage: viral_verify [OPTIONS]

  HMM and Naive Bayes classification of contig sequences as either viral,
  plasmid or chromosomal.

  Requires Prodigal for gene prediction and hmmsearch from HMMer3 for
  searching for Pfam HMM profiles.

Options:
  -i, --input-fasta PATH          Input fasta file  [required]
  -o, --outdir PATH               Output directory  [required]
  -H, --hmm-db PATH               Path to Pfam-A HMM database  [required]
  -t, --threads INTEGER           Number of threads (default=16)
  -p, --output-plasmids-separately
                                  Output predicted plasmids separately?
  --prefix TEXT                   Output file prefix (default: None)
  --uncertainty-threshold FLOAT   Uncertainty threshold (Natural log
                                  probability) (default=3.0)
  --naive-bayes-classifier-table PATH
                                  Table of protein domain frequencies to use
                                  for Naive Bayes classification (default="/ho
                                  me/pkruczkiewicz/repos/viral_verify/viral_ve
                                  rify/data/classifier_table.txt")
  -v, --verbose                   Logging verbosity
  --version                       Show the version and exit.
  --help                          Show this message and exit.
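The options above can also be assembled from a script, for example when processing many assemblies. The following is a sketch under stated assumptions: it uses only flags listed in the help text, the helper names `build_command` and `run_viral_verify` are hypothetical, and the file names `contigs.fasta`, `results` and `Pfam-A.hmm` are placeholders rather than files shipped with the package.

```python
import subprocess


def build_command(input_fasta, outdir, hmm_db, threads=16, prefix=None,
                  output_plasmids_separately=False):
    """Build the viral_verify argument list from the documented options."""
    cmd = [
        "viral_verify",
        "--input-fasta", str(input_fasta),
        "--outdir", str(outdir),
        "--hmm-db", str(hmm_db),
        "--threads", str(threads),
    ]
    # Optional flags are only added when explicitly requested.
    if prefix is not None:
        cmd += ["--prefix", str(prefix)]
    if output_plasmids_separately:
        cmd.append("--output-plasmids-separately")
    return cmd


def run_viral_verify(*args, **kwargs):
    """Run viral_verify; raises CalledProcessError on a non-zero exit code."""
    return subprocess.run(build_command(*args, **kwargs), check=True)


# Placeholder paths; substitute your own input, output directory and Pfam-A file.
print(" ".join(build_command("contigs.fasta", "results", "Pfam-A.hmm", threads=8)))
# prints: viral_verify --input-fasta contigs.fasta --outdir results --hmm-db Pfam-A.hmm --threads 8
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with paths that contain spaces.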
Credits
The original source code, design, and conception can be found at viralVerify. This is merely a rewrite that eases packaging via PyPI, adds some CI with Travis-CI, and organizes the code for maintainability and clarity.