Pyrodigal CLI optimized for metagenomic data
README
Introduction
This library is a simple wrapper around pyrodigal, a cythonized implementation of prodigal that is orders of magnitude faster.
Pyrodigal is written mostly with single genomes or single FASTA files in mind, so this tool was created to batch-process metagenomic-scale datasets. Metagenomic data usually consist of a large number of genome files for MAGs (metagenome-assembled genomes). Additionally, viral metagenomic datasets tend to store all single-scaffold viruses in one file, which is usually much larger than a typical single-genome FASTA file.
This tool parallelizes pyrodigal over large numbers of files (MAGs) or over FASTA files that contain a large number of scaffolds (viruses).
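The two parallelization strategies described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the tool's actual implementation: the names read_fasta, process_files, and process_scaffolds are hypothetical, a thread pool is assumed, and worker is a placeholder for whatever gene-calling function is applied to each file or scaffold.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Dict, Iterable, Iterator, Tuple


def read_fasta(path: Path) -> Iterator[Tuple[str, str]]:
    """Yield (scaffold_name, sequence) pairs from a FASTA file."""
    name, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if name is not None:
                    yield name, "".join(chunks)
                # Keep only the first whitespace-delimited token of the header.
                name, chunks = (line[1:].split() or [""])[0], []
            elif line and name is not None:
                chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def process_files(
    paths: Iterable[Path],
    worker: Callable[[Path], object],
    max_cpus: int = 1,
) -> Dict[Path, object]:
    """Many-files case (MAGs): run the hypothetical worker(path) once per file."""
    paths = list(paths)
    with ThreadPoolExecutor(max_workers=max_cpus) as pool:
        # pool.map preserves input order, so results line up with paths.
        return dict(zip(paths, pool.map(worker, paths)))


def process_scaffolds(
    path: Path,
    worker: Callable[[str], object],
    max_cpus: int = 1,
) -> Dict[str, object]:
    """One-large-file case (viruses): run worker(sequence) once per scaffold."""
    # Assumption for simplicity: the whole file fits in memory.
    records = list(read_fasta(path))
    with ThreadPoolExecutor(max_workers=max_cpus) as pool:
        results = pool.map(worker, (seq for _, seq in records))
        return {name: result for (name, _), result in zip(records, results)}
```

In the first function the unit of work is a whole file, while in the second it is a single scaffold, which is what keeps all threads busy when the input is one very large FASTA file.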
Installation
Install versioned releases
pip install metapyrodigal
Install from source
git clone https://github.com/cody-mar10/metapyrodigal.git
cd metapyrodigal
pip install .
Usage
Installing this tool overwrites the pyrodigal binary, so running pyrodigal invokes the metagenome-focused binary that I created.
The help page from pyrodigal -h looks like this:
usage: pyrodigal [-h] (-i FILE [FILE ...] | -d DIR) [-o DIR] [-c INT] [--genes]
                 [--virus-mode]

Find ORFs from query genomes using pyrodigal v3.5.2, the cythonized prodigal API

options:
  -h, --help            show this help message and exit
  -i FILE [FILE ...], --input FILE [FILE ...]
                        fasta file(s) of query genomes (can use unix wildcards)
  -d DIR, --input-dir DIR
                        directory of fasta files to process
  -o DIR, --outdir DIR  output directory (default: CWD)
  -c INT, --max-cpus INT
                        maximum number of threads to use (default: 1)
  --genes               use to also output the nucleotide genes .ffn file (default: False)
  --virus-mode          use pyrodigal-gv to activate the virus models (default: False)
  -x STR, --extension STR
                        genome FASTA file extension if using -d/--input-dir (default: fna)
-i and -d are mutually exclusive, but one of them must be provided.
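For readers curious how an "exactly one of -i or -d" rule can be expressed, here is a minimal argparse sketch reconstructed from the help page. It is an illustration only and is not taken from the tool's source: the function name build_parser is hypothetical, and the types and defaults are inferred from the help text.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the options listed on the help page."""
    parser = argparse.ArgumentParser(
        prog="pyrodigal",
        description="Find ORFs from query genomes",
    )
    # required=True means exactly one of -i / -d must be supplied.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument(
        "-i", "--input", nargs="+", metavar="FILE", type=Path,
        help="fasta file(s) of query genomes",
    )
    source.add_argument(
        "-d", "--input-dir", metavar="DIR", type=Path,
        help="directory of fasta files to process",
    )
    parser.add_argument(
        "-o", "--outdir", metavar="DIR", type=Path, default=Path.cwd(),
        help="output directory (default: CWD)",
    )
    parser.add_argument(
        "-c", "--max-cpus", metavar="INT", type=int, default=1,
        help="maximum number of threads to use",
    )
    parser.add_argument("--genes", action="store_true",
                        help="also output the nucleotide genes .ffn file")
    parser.add_argument("--virus-mode", action="store_true",
                        help="activate the virus models")
    parser.add_argument("-x", "--extension", metavar="STR", default="fna",
                        help="genome FASTA file extension if using -d")
    return parser
```

With a parser like this, passing both -i and -d, or neither, makes argparse print an error and exit before any genome is processed.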
The output files have the same basename as the input file. Protein FASTA files will have the extension .faa, and nucleotide gene FASTA files will have the extension .ffn. For example, pyrodigal -i GENOME.fna will output GENOME.faa.
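The naming rule can be illustrated with a short sketch. The helper output_paths is hypothetical and not part of the tool. It assumes that only the final extension of the input name is replaced and that outputs are placed in the directory given by -o/--outdir.

```python
from pathlib import Path
from typing import Dict, Union


def output_paths(
    input_file: Union[str, Path],
    outdir: Union[str, Path] = ".",
    genes: bool = False,
) -> Dict[str, Path]:
    """Derive output file names that share the input file's basename."""
    input_file = Path(input_file)
    outdir = Path(outdir)
    # Path.stem drops only the final extension: "GENOME.fna" -> "GENOME".
    outputs = {"proteins": outdir / f"{input_file.stem}.faa"}
    if genes:
        # The nucleotide gene file is written only when --genes is requested.
        outputs["genes"] = outdir / f"{input_file.stem}.ffn"
    return outputs


print(output_paths("GENOME.fna")["proteins"])  # GENOME.faa
```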