Map genes and genomes to the Global Microbial Gene Catalog (GMGC)
Command line tool to query the Global Microbial Gene Catalog (GMGC).
GMGC-mapper runs on Python 3.6-3.8 and requires prodigal to be available for genome mode.
The easiest way to install GMGC-mapper is through bioconda, which will ensure all dependencies (including prodigal) are installed automatically:
conda install -c bioconda gmgc-mapper
GMGC-mapper is also available from PyPI, so it can be installed with pip:
pip install GMGC-mapper
Note that this does not install prodigal, which is necessary for genome mode.
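When installing with pip, it can help to confirm that prodigal is reachable on the PATH before trying genome mode. The following is a minimal sketch using the POSIX `command -v` builtin; the helper name `have_tool` is hypothetical and not part of GMGC-mapper:

```shell
# Hypothetical helper: succeed if the named executable is found on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

if have_tool prodigal; then
    echo "prodigal found: genome mode (-i) should be usable"
else
    echo "prodigal not found: install it before using genome mode (-i)" >&2
fi
```

Gene mode (`--aa-genes`, optionally with `--nt-genes`) takes already-predicted genes as input, so this check matters mainly for genome input.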
Install from source
Finally, especially if you are retrieving the cutting-edge version from GitHub, you can install with the standard command:
python setup.py install
- Input is a genome sequence:
gmgc-mapper -i input.fasta -o output
- Input is DNA/protein gene sequences:
gmgc-mapper --nt-genes genes.fna --aa-genes genes.faa -o output
The nucleotide input is optional (but should be used if available so that the quality of the hits can be refined):
gmgc-mapper --aa-genes genes.faa -o output
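Because the nucleotide file is optional, a wrapper script may want to pass `--nt-genes` only when that file actually exists. The sketch below shows one way to do this; the function name `gene_mode_args` is hypothetical, and it assumes paths without spaces because the arguments are returned as a single string:

```shell
# Hypothetical helper: print the gene-mode arguments for gmgc-mapper.
# Usage: gene_mode_args AA_GENES NT_GENES OUTPUT_DIR
# --nt-genes is included only when NT_GENES names an existing file.
gene_mode_args() {
    aa_genes=$1
    nt_genes=$2
    out_dir=$3
    if [ -n "$nt_genes" ] && [ -f "$nt_genes" ]; then
        printf '%s\n' "--nt-genes $nt_genes --aa-genes $aa_genes -o $out_dir"
    else
        printf '%s\n' "--aa-genes $aa_genes -o $out_dir"
    fi
}

# Show the command that would be run for the example file names.
echo "gmgc-mapper $(gene_mode_args genes.faa genes.fna output)"
```

Once the printed command looks right, it can be run by replacing the final `echo` line with `gmgc-mapper $(gene_mode_args genes.faa genes.fna output)`.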
The output folder will contain:
- Outputs of gene prediction (prodigal).
- Complete data table, listing all the hits in GMGC, per gene.
- Complete table, listing all the genome bins (MAGs) that are found in the results.
- Human readable summary.
For more details, read the docs. A description of the outputs is also written to the output folder for convenience.
-i/--input: Path to the input genome file (.fasta/.gz/.bz2).
-o/--output: Output directory (will be created if non-existent).
--nt-genes: Path to the input DNA gene file (.fasta/.gz/.bz2).
--aa-genes: Path to the input protein gene file (.fasta/.gz/.bz2).
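The `-i` and `-o` parameters above can be combined in a loop to process several genomes, each with its own output directory. This is a sketch under assumptions: the `genomes/` directory, the `_gmgc` suffix, and the helper name `output_dir_for` are all hypothetical choices for illustration:

```shell
# Hypothetical helper: derive an output directory name from an input path
# by dropping the directory, any .gz/.bz2 suffix, and the .fasta extension.
output_dir_for() {
    name=$(basename "$1")
    name=${name%.gz}
    name=${name%.bz2}
    name=${name%.fasta}
    printf '%s\n' "${name}_gmgc"
}

# Run genome mode once per input file found in the assumed genomes/ directory.
for genome in genomes/*.fasta genomes/*.fasta.gz genomes/*.fasta.bz2; do
    [ -e "$genome" ] || continue   # skip glob patterns that matched nothing
    gmgc-mapper -i "$genome" -o "$(output_dir_for "$genome")"
done
```

Keeping one output directory per genome avoids mixing the result tables from different inputs.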