Metagenomics toolkit
[![Build Status](https://travis-ci.org/EBI-Metagenomics/emg-toolkit.svg?branch=master)](https://travis-ci.org/EBI-Metagenomics/emg-toolkit) [![PyPi package](https://badge.fury.io/py/mg-toolkit.svg)](https://badge.fury.io/py/mg-toolkit) [![Downloads](http://pepy.tech/badge/mg-toolkit)](http://pepy.tech/project/mg-toolkit)
Metagenomics toolkit enables scientists to download all of the sample
metadata for a given study or sequence to a single CSV file.
Install metagenomics toolkit
============================
pip install -U mg-toolkit
Usage
=====
::

    $ mg-toolkit -h
    usage: mg-toolkit [-h] [-V] [-d]
                      {original_metadata,sequence_search,bulk_download} ...

    Metagenomics toolkit
    --------------------

    positional arguments:
      {original_metadata,sequence_search,bulk_download}
        original_metadata   Download original metadata.
        sequence_search     Search non-redundant protein database using HMMER
        bulk_download       Download result files in bulks for an entire study.

    optional arguments:
      -h, --help            show this help message and exit
      -V, --version         print version information
      -d, --debug           print debugging information

Examples
========
Download metadata:
$ mg-toolkit original_metadata -a ERP001736
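Once the metadata file has been written, it can be inspected with the Python standard library. The following is a minimal sketch, assuming the output is a comma-separated file with a header row. The function name `summarise_metadata` and the path in the usage comment are hypothetical, since the output file name is not stated here.

```python
import csv
from collections import Counter
from pathlib import Path


def summarise_metadata(csv_path):
    """Return (row_count, {column: number of non-empty values}) for a CSV file."""
    filled = Counter()
    rows = 0
    with Path(csv_path).open(newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            rows += 1
            for column, value in row.items():
                # DictReader stores surplus fields under the key None; skip them.
                if column is None:
                    continue
                if value not in (None, ""):
                    filled[column] += 1
    return rows, dict(filled)


# Usage (hypothetical path, replace with the file that was actually written):
#   n_rows, filled = summarise_metadata("ERP001736.csv")
#   print(n_rows, "samples;", len(filled), "columns with at least one value")
```

Counting non-empty values per column is a quick way to see which metadata fields are populated across the samples of a study.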
Search non-redundant protein database using HMMER and fetch metadata:
$ mg-toolkit sequence_search -seq test.fasta -db full evalue -incE 0.02
Databases:
- full - Full length sequences (default)
- all - All sequences
- partial - Partial sequences
How to bulk download result files for an entire study?
::

    $ mg-toolkit bulk_download -h
    usage: mg-toolkit bulk_download [-h] -a ACCESSION [-o OUTPUT_PATH]
                                    [-p {1.0,2.0,3.0,4.0,4.1}]
                                    [-g {sequence_data,functional_analysis,taxonomic_analysis,taxonomic_analysis_ssu,taxonomic_analysis_lsu,stats,non_coding_rna}]

    optional arguments:
      -h, --help            show this help message and exit
      -a ACCESSION, --accession ACCESSION
                            Provide the study/project accession of your interest,
                            e.g. ERP001736, SRP000319. The study must be publicly
                            available in MGnify.
      -o OUTPUT_PATH, --output_path OUTPUT_PATH
                            Location of the output directory, where the
                            downloadable files are written to. DEFAULT: CWD
      -p {1.0,2.0,3.0,4.0,4.1}, --pipeline {1.0,2.0,3.0,4.0,4.1}
                            Specify the version of the pipeline you are interested
                            in. Lets say your study of interest has been analysed
                            with multiple version, but you are only interested in
                            a particular version then used this option to filter
                            down the results by the version you interested in.
                            DEFAULT: Downloads all versions
      -g {sequence_data,functional_annotations,taxonomic_annotations,taxonomic_annot_ssu,taxonomic_annot_lsu,stats,non_coding_rna}, --result_group {sequence_data,functional_annotations,taxonomic_annotations,taxonomic_annot_ssu,taxonomic_annot_lsu,stats,non_coding_rna}
                            Provide a single result group if needed. Supported
                            result groups are: [sequence_data (all version),
                            functional_annotations (all version),
                            taxonomic_annotations (1.0-3.0), taxonomic_annot_ssu
                            (>=4.0), taxonomic_annot_lsu (>=4.0), stats,
                            non_coding_rna (>=4.0) DEFAULT: Downloads all result
                            groups if not provided. (default: None).

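The pipeline and result-group constraints described in the help text can be checked before a command is run. The sketch below is illustrative only: `bulk_download_args` is a hypothetical helper, not part of mg-toolkit. It takes the group names and version ranges from the `-g` option description above. The usage line lists `*_analysis` names instead, so confirm the accepted values with `mg-toolkit bulk_download -h` on your installed version. It also assumes that `stats`, which carries no version note, is valid for every pipeline version.

```python
PIPELINE_VERSIONS = ("1.0", "2.0", "3.0", "4.0", "4.1")

# Result group -> pipeline versions listed for it in the -g description.
# "stats" has no version note there; it is ASSUMED to apply to all versions.
RESULT_GROUPS = {
    "sequence_data": PIPELINE_VERSIONS,
    "functional_annotations": PIPELINE_VERSIONS,
    "taxonomic_annotations": ("1.0", "2.0", "3.0"),
    "taxonomic_annot_ssu": ("4.0", "4.1"),
    "taxonomic_annot_lsu": ("4.0", "4.1"),
    "stats": PIPELINE_VERSIONS,
    "non_coding_rna": ("4.0", "4.1"),
}


def bulk_download_args(accession, output_path=None, pipeline=None, result_group=None):
    """Build the argument list for `mg-toolkit bulk_download` (hypothetical helper).

    `pipeline` must be one of the version strings, e.g. "4.0".
    Raises ValueError for combinations the help text rules out.
    """
    if not accession:
        raise ValueError("an accession such as ERP001736 is required")
    if pipeline is not None and pipeline not in PIPELINE_VERSIONS:
        raise ValueError(f"unknown pipeline version: {pipeline!r}")
    if result_group is not None:
        if result_group not in RESULT_GROUPS:
            raise ValueError(f"unknown result group: {result_group!r}")
        if pipeline is not None and pipeline not in RESULT_GROUPS[result_group]:
            raise ValueError(
                f"{result_group} is not listed for pipeline version {pipeline}"
            )

    args = ["mg-toolkit", "bulk_download", "-a", accession]
    if output_path is not None:
        args += ["-o", str(output_path)]
    if pipeline is not None:
        args += ["-p", pipeline]
    if result_group is not None:
        args += ["-g", result_group]
    return args
```

The returned list can be passed to `subprocess.run` unchanged. Building a list rather than a single string avoids shell quoting problems in output paths.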
How to download all files for a given study accession?
$ mg-toolkit -d bulk_download -a ERP009703
How to download results of a specific pipeline version for a given study accession?
$ mg-toolkit -d bulk_download -a ERP009703 -p 4.0
How to download specific result file groups (e.g. functional annotations only) for a given study accession?
$ mg-toolkit -d bulk_download -a ERP009703 -g functional_annotations
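To repeat a download for several studies, the command can be driven from a short script. This is a sketch under stated assumptions: `mg-toolkit` is installed and on `PATH`, and `download_studies` and its `runner` parameter are hypothetical names introduced for illustration (the `runner` hook exists only so the function can be exercised without network access).

```python
import subprocess


def download_studies(accessions, result_group=None, runner=subprocess.run):
    """Run `mg-toolkit bulk_download` once per accession.

    Returns {accession: exit code}. `runner` defaults to subprocess.run and
    must accept the same (command, check=...) call shape.
    """
    exit_codes = {}
    for accession in accessions:
        command = ["mg-toolkit", "bulk_download", "-a", accession]
        if result_group is not None:
            command += ["-g", result_group]
        # check=False: record the exit code instead of raising, so one
        # failing study does not stop the remaining downloads.
        completed = runner(command, check=False)
        exit_codes[accession] = completed.returncode
    return exit_codes


# Usage (performs real downloads into the current working directory):
#   codes = download_studies(["ERP009703", "ERP001736"],
#                            result_group="functional_annotations")
#   failed = [acc for acc, code in codes.items() if code != 0]
```

If `mg-toolkit` is not installed, `subprocess.run` raises `FileNotFoundError` on the first call rather than returning an exit code.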