Metagenomics toolkit.
Project description
Metagenomics toolkit enables scientists to download all of the sample metadata for a given study or sequence to a single CSV file.
Install metagenomics toolkit
pip install -U mg-toolkit
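To confirm from within Python that the install succeeded, one option is to query the installed distribution metadata. This is a minimal sketch using only the standard library (Python 3.8+); the helper name `installed_version` is made up for illustration and is not part of mg-toolkit.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name="mg-toolkit"):
    """Return the installed version string, or None if the distribution is missing."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    print(installed_version() or "mg-toolkit is not installed")
```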
Usage
$ mg-toolkit -h
usage: mg-toolkit [-h] [-V] [-d]
{original_metadata,sequence_search,bulk_download} ...
Metagenomics toolkit
--------------------
positional arguments:
{original_metadata,sequence_search,bulk_download}
original_metadata Download original metadata.
sequence_search Search non-redundant protein database using HMMER
bulk_download Download result files in bulks for an entire study.
optional arguments:
-h, --help show this help message and exit
-V, --version print version information
-d, --debug print debugging information
Examples
Download metadata:
$ mg-toolkit original_metadata -a ERP001736
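When metadata is needed for several studies, the command above can be driven from a script. The sketch below only wraps the CLI call shown above. `metadata_command` and `download_metadata` are hypothetical helper names, and the code assumes `mg-toolkit` is available on the PATH.

```python
import subprocess


def metadata_command(accession):
    """Build the argument list for `mg-toolkit original_metadata -a <accession>`."""
    return ["mg-toolkit", "original_metadata", "-a", accession]


def download_metadata(accessions):
    """Run the CLI once per accession; raises CalledProcessError if a run fails."""
    for accession in accessions:
        subprocess.run(metadata_command(accession), check=True)


# Example call (nothing runs on import):
# download_metadata(["ERP001736"])
```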
Search non-redundant protein database using HMMER and fetch metadata:
$ mg-toolkit sequence_search -seq test.fasta -db full evalue -incE 0.02
Databases:
- full - Full length sequences (default)
- all - All sequences
- partial - Partial sequences
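A script that builds the search command can check the database name against the three choices listed above before calling the tool. This is a sketch under that assumption; `sequence_search_command` is a hypothetical helper, and the flags are copied from the example command rather than from any documented Python API.

```python
# Database choices as listed above; "full" is the documented default.
DATABASES = ("full", "all", "partial")


def sequence_search_command(fasta_path, database="full", inc_e=0.02):
    """Build the argument list for a HMMER search with an inclusion E-value."""
    if database not in DATABASES:
        raise ValueError(f"unknown database {database!r}; expected one of {DATABASES}")
    return [
        "mg-toolkit", "sequence_search",
        "-seq", fasta_path,
        "-db", database,
        "evalue", "-incE", str(inc_e),
    ]
```

Rejecting an unknown name early gives a clearer error than waiting for the CLI to fail.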
How to bulk download result files for an entire study?
usage: mg-toolkit bulk_download [-h] -a ACCESSION [-o OUTPUT_PATH]
[-p {1.0,2.0,3.0,4.0,4.1,5.0}]
[-g {statistics,sequence_data,functional_analysis,taxonomic_analysis,taxonomic_analysis_ssu_rrna,taxonomic_analysis_lsu_rrna,non-coding_rnas,taxonomic_analysis_itsonedb,taxonomic_analysis_unite,taxonomic_analysis_motu,pathways_and_systems}]
optional arguments:
-h, --help show this help message and exit
-a ACCESSION, --accession ACCESSION
Provide the study/project accession of your interest, e.g. ERP001736, SRP000319. The study must be publicly available in MGnify.
-o OUTPUT_PATH, --output_path OUTPUT_PATH
Location of the output directory, where the downloadable files are written to.
DEFAULT: CWD
-p {1.0,2.0,3.0,4.0,4.1,5.0}, --pipeline {1.0,2.0,3.0,4.0,4.1,5.0}
Specify the version of the pipeline you are interested in.
Say your study of interest has been analysed with
multiple pipeline versions, but you are only interested in one
of them. Use this option to filter the results down to
that version.
DEFAULT: Downloads all versions
-g {statistics,sequence_data,functional_analysis,taxonomic_analysis,taxonomic_analysis_ssu_rrna,taxonomic_analysis_lsu_rrna,non-coding_rnas,taxonomic_analysis_itsonedb,taxonomic_analysis_unite,taxonomic_analysis_motu,pathways_and_systems}, --result_group {statistics,sequence_data,functional_analysis,taxonomic_analysis,taxonomic_analysis_ssu_rrna,taxonomic_analysis_lsu_rrna,non-coding_rnas,taxonomic_analysis_itsonedb,taxonomic_analysis_unite,taxonomic_analysis_motu,pathways_and_systems}
Provide a single result group if needed.
Supported result groups are:
- statistics
- sequence_data (all versions)
- functional_analysis (all versions)
- taxonomic_analysis (1.0-3.0)
- taxonomic_analysis_ssu_rrna (>=4.0)
- taxonomic_analysis_lsu_rrna (>=4.0)
- non-coding_rnas (>=4.0)
- taxonomic_analysis_itsonedb (>= 5.0)
- taxonomic_analysis_unite (>= 5.0)
- taxonomic_analysis_motu (>= 5.0)
- pathways_and_systems (>= 5.0)
DEFAULT: Downloads all result groups if not provided.
(default: None).
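The version notes in the result group list above can be captured as a small lookup table, which makes it easy to check whether a `-g` value makes sense together with a `-p` value before starting a download. This is an illustrative sketch: `RESULT_GROUP_VERSIONS` and `group_available` are hypothetical names, and treating `statistics` as available for every version is an assumption, since the help text gives no version note for it.

```python
# (lowest, highest) pipeline version per result group, transcribed from the
# help text above. None means "no bound". The entry for "statistics" is an
# assumption because the help text does not state a version range for it.
RESULT_GROUP_VERSIONS = {
    "statistics": (None, None),
    "sequence_data": (None, None),
    "functional_analysis": (None, None),
    "taxonomic_analysis": (1.0, 3.0),
    "taxonomic_analysis_ssu_rrna": (4.0, None),
    "taxonomic_analysis_lsu_rrna": (4.0, None),
    "non-coding_rnas": (4.0, None),
    "taxonomic_analysis_itsonedb": (5.0, None),
    "taxonomic_analysis_unite": (5.0, None),
    "taxonomic_analysis_motu": (5.0, None),
    "pathways_and_systems": (5.0, None),
}


def group_available(group, pipeline_version):
    """Return True if the result group is listed for the given pipeline version."""
    low, high = RESULT_GROUP_VERSIONS[group]
    version = float(pipeline_version)
    return (low is None or version >= low) and (high is None or version <= high)
```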
How to download all files for a given study accession?
$ mg-toolkit -d bulk_download -a ERP009703
How to download results of a specific pipeline version for a given study accession?
$ mg-toolkit -d bulk_download -a ERP009703 -p 4.0
How to download a specific result file group (e.g. functional analysis only) for a given study accession?
$ mg-toolkit -d bulk_download -a ERP009703 -g functional_analysis
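The optional flags from the `bulk_download` usage text (`-o`, `-p`, `-g`) can be assembled programmatically, for example when a script downloads different subsets per study. The sketch below only builds the argument list; `bulk_download_command` is a hypothetical helper, not part of mg-toolkit.

```python
# Pipeline versions accepted by -p, copied from the usage text.
PIPELINE_VERSIONS = ("1.0", "2.0", "3.0", "4.0", "4.1", "5.0")


def bulk_download_command(accession, output_path=None, pipeline=None,
                          result_group=None, debug=False):
    """Build the argument list for `mg-toolkit bulk_download`."""
    cmd = ["mg-toolkit"]
    if debug:
        cmd.append("-d")  # global flag, so it precedes the subcommand
    cmd += ["bulk_download", "-a", accession]
    if output_path is not None:
        cmd += ["-o", output_path]
    if pipeline is not None:
        if pipeline not in PIPELINE_VERSIONS:
            raise ValueError(f"unsupported pipeline version {pipeline!r}")
        cmd += ["-p", pipeline]
    if result_group is not None:
        cmd += ["-g", result_group]
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Omitting `pipeline` and `result_group` leaves the CLI defaults in place, which download all versions and all result groups.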
Contributors
Thanks go to these wonderful people (emoji key):
Ola Tarkowska 💻📖
Maxim Scheremetjew 💻📖
Martin Beracochea 💻
This project follows the all-contributors specification. Contributions of any kind welcome!
Contact
If the documentation does not answer your questions, please contact us.
File details
Details for the file mg-toolkit-0.8.0.tar.gz.
File metadata
- Download URL: mg-toolkit-0.8.0.tar.gz
- Upload date:
- Size: 17.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/50.3.0 requests-toolbelt/0.9.1 tqdm/4.49.0 CPython/3.6.12
File hashes
Algorithm | Hash digest
---|---
SHA256 | 198c56dc5296775d0e0110873ea88ebd2bda6723570ae5e21bb34c05bb1055e7
MD5 | 663c931469abceb0c05f257d13f152f6
BLAKE2b-256 | 28113193fed069341811a8eda94a8ad58af1c76c5daa7d5eb704e67874982051
File details
Details for the file mg_toolkit-0.8.0-py3-none-any.whl.
File metadata
- Download URL: mg_toolkit-0.8.0-py3-none-any.whl
- Upload date:
- Size: 20.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/50.3.0 requests-toolbelt/0.9.1 tqdm/4.49.0 CPython/3.6.12
File hashes
Algorithm | Hash digest
---|---
SHA256 | 69ef3fb695d039d702a3df6d2d04af603774f36b5e7666a56b9983a461b5d292
MD5 | 4f033fde5163cf16b6ffd4a69b77a5d6
BLAKE2b-256 | 433c4a2d40e66f62ad7dfc118972fe01e54a0a991d369bfaef5ca8b6be4eae0b
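A manually downloaded file can be checked against the SHA256 digests listed above. The following is a minimal sketch using the standard library; `sha256_of` and `matches_published_hash` are illustrative helper names, and the local file path is whatever location the file was saved to.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()


# Example call, assuming the source distribution was saved to the current directory:
# matches_published_hash(
#     "mg-toolkit-0.8.0.tar.gz",
#     "198c56dc5296775d0e0110873ea88ebd2bda6723570ae5e21bb34c05bb1055e7",
# )
```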