Metagenomics toolkit

Metagenomics toolkit enables scientists to download all of the sample metadata for a given study or sequence into a single CSV file.

Install metagenomics toolkit

pip install -U mg-toolkit
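
To confirm that the install succeeded, the installed version can be looked up from Python. This is a minimal sketch using only the standard library (Python 3.8+); the helper name `installed_version` is hypothetical and not part of mg-toolkit.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str = "mg-toolkit") -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
```

A return value of `None` means pip has not installed the package into the current environment.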

Usage

$ mg-toolkit -h
usage: mg-toolkit [-h] [-V] [-d]
                  {original_metadata,sequence_search,bulk_download} ...

Metagenomics toolkit
--------------------

positional arguments:
  {original_metadata,sequence_search,bulk_download}
    original_metadata   Download original metadata.
    sequence_search     Search non-redundant protein database using HMMER
    bulk_download       Download result files in bulks for an entire study.

optional arguments:
  -h, --help            show this help message and exit
  -V, --version         print version information
  -d, --debug           print debugging information

Examples

Download metadata:

$ mg-toolkit original_metadata -a ERP001736
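
To fetch metadata for several studies, the same command can be driven from a short Python script. This is only a sketch: the helper names `metadata_command` and `download_metadata` are hypothetical, and running the command once per accession is an assumption that keeps to the single-accession form shown above.

```python
import shutil
import subprocess
from typing import List


def metadata_command(accession: str) -> List[str]:
    """Build the argument list for `mg-toolkit original_metadata -a <accession>`."""
    return ["mg-toolkit", "original_metadata", "-a", accession]


def download_metadata(accessions: List[str]) -> None:
    """Run the metadata download once per accession, stopping at the first failure."""
    if shutil.which("mg-toolkit") is None:
        raise RuntimeError("mg-toolkit was not found on PATH; install it with pip first")
    for accession in accessions:
        # check=True raises CalledProcessError if mg-toolkit exits with an error.
        subprocess.run(metadata_command(accession), check=True)
```

Calling `download_metadata(["ERP001736", "ERP009703"])` would then run the two downloads one after the other.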

Search non-redundant protein database using HMMER and fetch metadata:

$ mg-toolkit sequence_search -seq test.fasta -db full evalue -incE 0.02

Databases:
- full - Full length sequences (default)
- all - All sequences
- partial - Partial sequences

How to bulk download result files for an entire study?

$ mg-toolkit bulk_download -h
usage: mg-toolkit bulk_download [-h] -a ACCESSION [-o OUTPUT_PATH]
                                  [-p {1.0,2.0,3.0,4.0,4.1}]
                                  [-g {sequence_data,functional_analysis,taxonomic_analysis,taxonomic_analysis_ssu,taxonomic_analysis_lsu,stats,non_coding_rna}]

optional arguments:
  -h, --help            show this help message and exit
  -a ACCESSION, --accession ACCESSION
                        Provide the study/project accession of your interest,
                        e.g. ERP001736, SRP000319. The study must be publicly
                        available in MGnify.
  -o OUTPUT_PATH, --output_path OUTPUT_PATH
                        Location of the output directory, where the
                        downloadable files are written to. DEFAULT: CWD
  -p {1.0,2.0,3.0,4.0,4.1}, --pipeline {1.0,2.0,3.0,4.0,4.1}
                        Specify the version of the pipeline you are interested
                        in. Lets say your study of interest has been analysed
                        with multiple version, but you are only interested in
                        a particular version then used this option to filter
                        down the results by the version you interested in.
                        DEFAULT: Downloads all versions
  -g {sequence_data,functional_annotations,taxonomic_annotations,taxonomic_annot_ssu,taxonomic_annot_lsu,stats,non_coding_rna}, --result_group {sequence_data,functional_annotations,taxonomic_annotations,taxonomic_annot_ssu,taxonomic_annot_lsu,stats,non_coding_rna}
                        Provide a single result group if needed. Supported
                        result groups are: [sequence_data (all version),
                        functional_annotations (all version),
                        taxonomic_annotations (1.0-3.0), taxonomic_annot_ssu
                        (>=4.0), taxonomic_annot_lsu (>=4.0), stats,
                        non_coding_rna (>=4.0) DEFAULT: Downloads all result
                        groups if not provided. (default: None).
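
The help text above ties some result groups to particular pipeline versions. A small wrapper can check a combination before the command is run. This is a sketch, not part of mg-toolkit: `bulk_download_command` is a hypothetical helper, the group names follow the option description (which differs from the names in the usage line, so it is worth confirming them against `mg-toolkit bulk_download -h` for the installed version), and treating `stats` as available for every version is an assumption because the help gives no version note for it.

```python
from typing import List, Optional

PIPELINE_VERSIONS = ("1.0", "2.0", "3.0", "4.0", "4.1")

# Result group -> pipeline versions listed for it in the help text.
# Assumption: "stats" is treated as available for all versions.
GROUP_VERSIONS = {
    "sequence_data": PIPELINE_VERSIONS,
    "functional_annotations": PIPELINE_VERSIONS,
    "taxonomic_annotations": ("1.0", "2.0", "3.0"),
    "taxonomic_annot_ssu": ("4.0", "4.1"),
    "taxonomic_annot_lsu": ("4.0", "4.1"),
    "stats": PIPELINE_VERSIONS,
    "non_coding_rna": ("4.0", "4.1"),
}


def bulk_download_command(
    accession: str,
    output_path: Optional[str] = None,
    pipeline: Optional[str] = None,
    result_group: Optional[str] = None,
) -> List[str]:
    """Build a bulk_download argument list, rejecting combinations the help rules out."""
    if pipeline is not None and pipeline not in PIPELINE_VERSIONS:
        raise ValueError(f"unknown pipeline version: {pipeline}")
    if result_group is not None:
        if result_group not in GROUP_VERSIONS:
            raise ValueError(f"unknown result group: {result_group}")
        if pipeline is not None and pipeline not in GROUP_VERSIONS[result_group]:
            raise ValueError(f"{result_group} is not listed for pipeline {pipeline}")
    command = ["mg-toolkit", "bulk_download", "-a", accession]
    if output_path is not None:
        command += ["-o", output_path]
    if pipeline is not None:
        command += ["-p", pipeline]
    if result_group is not None:
        command += ["-g", result_group]
    return command
```

The returned list can be passed directly to `subprocess.run(command, check=True)`.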

How to download all files for a given study accession?

$ mg-toolkit -d bulk_download -a ERP009703

How to download results of a specific pipeline version for a given study accession?

$ mg-toolkit -d bulk_download -a ERP009703 -p 4.0

How to download a specific result file group (e.g. functional annotations only) for a given study accession?

$ mg-toolkit -d bulk_download -a ERP009703 -g functional_annotations
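
Because `-g` takes a single result group, downloading two or more groups (but not all of them) means running the command once per group. The sketch below builds those commands; `group_commands` is a hypothetical helper, and the group names are taken from the option description above.

```python
from typing import Iterable, List


def group_commands(accession: str, groups: Iterable[str]) -> List[List[str]]:
    """Build one `mg-toolkit -d bulk_download` argument list per result group."""
    return [
        ["mg-toolkit", "-d", "bulk_download", "-a", accession, "-g", group]
        for group in groups
    ]
```

Each returned list can be run in turn, for example with `subprocess.run(command, check=True)`.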
