Metagenomics toolkit.

Development Status
- 4 - Beta
Intended Audience
- Science/Research
License
- OSI Approved :: Apache Software License
Operating System
- OS Independent
Programming Language
Topic
- Scientific/Engineering :: Bio-Informatics

Project description

Metagenomics toolkit enables scientists to download all of the sample metadata for a given study or sequence to a single CSV file.

Install metagenomics toolkit

Through pip

pip install -U mg-toolkit

Or using conda

conda install -c bioconda mg-toolkit

Usage

$ mg-toolkit -h
usage: mg-toolkit [-h] [-V] [-d]
                  {original_metadata,sequence_search,bulk_download} ...

Metagenomics toolkit
--------------------

positional arguments:
  {original_metadata,sequence_search,bulk_download}
    original_metadata   Download original metadata.
    sequence_search     Search non-redundant protein database using HMMER
    bulk_download       Download result files in bulks for an entire study.

optional arguments:
  -h, --help            show this help message and exit
  -V, --version         print version information
  -d, --debug           print debugging information

Examples

Download metadata:

$ mg-toolkit original_metadata -a ERP001736
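
Once the metadata has been written to a CSV file, a quick sanity check is to count the rows and list the columns. The following is a minimal sketch using only the standard library. The output file name is not stated here, so the path is supplied by the caller, and the helper name summarize_metadata_csv is hypothetical.

```python
import csv
from pathlib import Path


def summarize_metadata_csv(path):
    """Return (row_count, column_names) for a metadata CSV file.

    The path is an assumption: pass whatever file the toolkit wrote.
    """
    with Path(path).open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        rows = list(reader)
        # fieldnames is None for an empty file, so fall back to an empty list.
        columns = list(reader.fieldnames or [])
    return len(rows), columns
```

Calling summarize_metadata_csv on the downloaded file returns the number of data rows and the header names, which is enough to confirm that the download produced usable content.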

Search non-redundant protein database using HMMER and fetch metadata:

$ mg-toolkit sequence_search -seq test.fasta -out test.csv -db full evalue -incE 0.02

Databases:
- full - Full length sequences (default)
- all - All sequences
- partial - Partial sequences
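
When the search is scripted, it can help to validate the database name before launching the command. The sketch below builds the argument list from the flags shown in the example above (-seq, -out, -db) and the three database names listed. The function name sequence_search_command is hypothetical, and the threshold options are deliberately left out.

```python
DATABASES = ("full", "all", "partial")


def sequence_search_command(fasta, out_csv, database="full"):
    """Build the argument list for `mg-toolkit sequence_search`.

    Only flags that appear in the README example are used here.
    """
    if database not in DATABASES:
        raise ValueError(
            f"unknown database {database!r}; expected one of {DATABASES}"
        )
    return [
        "mg-toolkit", "sequence_search",
        "-seq", str(fasta),
        "-out", str(out_csv),
        "-db", database,
    ]
```

The returned list can be passed to subprocess.run(cmd, check=True), which avoids shell quoting problems with file names.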

How to bulk download result files for an entire study?

usage: mg-toolkit bulk_download [-h] -a ACCESSION [-o OUTPUT_PATH]
                                [-p {1.0,2.0,3.0,4.0,4.1,5.0}]
                                [-g {statistics,sequence_data,functional_analysis,taxonomic_analysis,taxonomic_analysis_ssu_rrna,taxonomic_analysis_lsu_rrna,non-coding_rnas,taxonomic_analysis_itsonedb,taxonomic_analysis_unite,taxonomic_analysis_motupathways_and_systems}]

optional arguments:
-h, --help            show this help message and exit
-a ACCESSION, --accession ACCESSION
                        Provide the study/project accession of your interest, e.g. ERP001736, SRP000319. The study must be publicly available in MGnify.
-o OUTPUT_PATH, --output_path OUTPUT_PATH
                        Location of the output directory, where the downloadable files are written to.
                        DEFAULT: CWD
-p {1.0,2.0,3.0,4.0,4.1,5.0}, --pipeline {1.0,2.0,3.0,4.0,4.1,5.0}
                        Specify the version of the pipeline you are interested in.
                        Lets say your study of interest has been analysed with
                        multiple version, but you are only interested in a particular
                        version then used this option to filter down the results by
                        the version you interested in.
                        DEFAULT: Downloads all versions
-g {statistics,sequence_data,functional_analysis,taxonomic_analysis,taxonomic_analysis_ssu_rrna,taxonomic_analysis_lsu_rrna,non-coding_rnas,taxonomic_analysis_itsonedb,taxonomic_analysis_unite,taxonomic_analysis_motupathways_and_systems}, --result_group {statistics,sequence_data,functional_analysis,taxonomic_analysis,taxonomic_analysis_ssu_rrna,taxonomic_analysis_lsu_rrna,non-coding_rnas,taxonomic_analysis_itsonedb,taxonomic_analysis_unite,taxonomic_analysis_motupathways_and_systems}
                        Provide a single result group if needed.
                        Supported result groups are:
                        - statistics
                        - sequence_data (all versions)
                        - functional_analysis (all versions)
                        - taxonomic_analysis (1.0-3.0)
                        - taxonomic_analysis_ssu_rrna (>=4.0)
                        - taxonomic_analysis_lsu_rrna (>=4.0)
                        - non-coding_rnas (>=4.0)
                        - taxonomic_analysis_itsonedb (>= 5.0)
                        - taxonomic_analysis_unite (>= 5.0)
                        - taxonomic_analysis_motu  (>= 5.0)
                        - pathways_and_systems (>= 5.0)
                        DEFAULT: Downloads all result groups if not provided.
                        (default: None).
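
The version ranges in the help text above can be captured in a small lookup table, which is handy for checking a group and pipeline combination before starting a download. This is a sketch, not part of mg-toolkit: the names GROUP_RANGES, group_available and groups_for are hypothetical, the group names follow the descriptive list in the help text, and "statistics" (which has no stated range) is assumed to be available for every version.

```python
def _parse(version):
    """Turn a version string such as '4.1' into a comparable tuple (4, 1)."""
    return tuple(int(part) for part in version.split("."))


# (lowest, highest) pipeline version per result group; None means unbounded.
GROUP_RANGES = {
    "statistics": (None, None),  # assumption: no range is stated
    "sequence_data": (None, None),
    "functional_analysis": (None, None),
    "taxonomic_analysis": ("1.0", "3.0"),
    "taxonomic_analysis_ssu_rrna": ("4.0", None),
    "taxonomic_analysis_lsu_rrna": ("4.0", None),
    "non-coding_rnas": ("4.0", None),
    "taxonomic_analysis_itsonedb": ("5.0", None),
    "taxonomic_analysis_unite": ("5.0", None),
    "taxonomic_analysis_motu": ("5.0", None),
    "pathways_and_systems": ("5.0", None),
}


def group_available(group, pipeline):
    """Return True if the result group is listed for the pipeline version."""
    low, high = GROUP_RANGES[group]
    version = _parse(pipeline)
    if low is not None and version < _parse(low):
        return False
    if high is not None and version > _parse(high):
        return False
    return True


def groups_for(pipeline):
    """List the result groups that the help text associates with a version."""
    return [g for g in GROUP_RANGES if group_available(g, pipeline)]
```

For example, groups_for("2.0") yields the statistics, sequence_data, functional_analysis and taxonomic_analysis groups, while the rRNA-specific groups only appear from 4.0 onwards.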

How to download all files for a given study accession?

$ mg-toolkit -d bulk_download -a ERP009703

How to download results of a specific pipeline version for a given study accession?

$ mg-toolkit -d bulk_download -a ERP009703 -p 4.0

How to download specific result file groups (e.g. functional analysis only) for a given study accession?

$ mg-toolkit -d bulk_download -a ERP009703 -g functional_analysis

The bulk downloader will store a .tsv file with all the metadata for each downloaded file.
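
Because -g accepts a single result group, fetching several groups means running the command once per group. The sketch below builds those argument lists using only the options documented in the usage text (-a, -o, -p, -g). The function name bulk_download_commands is hypothetical.

```python
def bulk_download_commands(accession, groups=(), pipeline=None, output_path=None):
    """Build one `mg-toolkit bulk_download` argument list per result group.

    With no groups, a single command is returned that downloads everything.
    """
    base = ["mg-toolkit", "bulk_download", "-a", accession]
    if output_path is not None:
        base += ["-o", str(output_path)]
    if pipeline is not None:
        base += ["-p", pipeline]
    if not groups:
        return [base]
    # One invocation per group, since -g takes a single value.
    return [base + ["-g", group] for group in groups]
```

Each returned list can be run in turn with subprocess.run(cmd, check=True).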

Usage as a python package

⚠️ Liable to change ⚠️

Whilst mg_toolkit is designed as a command-line tool, it is also a set of Python modules with helper classes that could be useful in your own Python scripts. These internal APIs and call signatures may change over time. See main() for default arguments.

Example

from mg_toolkit.metadata import OriginalMetadata
erp001736 = OriginalMetadata('ERP001736')
erp001736.fetch_metadata()
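
To fetch metadata for several studies in one script, the call above can be wrapped in a loop that records failures instead of stopping at the first one. This is a sketch under assumptions: only OriginalMetadata(accession) and fetch_metadata() come from the example above, the shape of the returned value is not documented here and is stored as-is, and the helper name fetch_metadata_for and its make_client parameter are hypothetical.

```python
def fetch_metadata_for(accessions, make_client=None):
    """Fetch metadata for each accession, collecting results and errors.

    make_client is a hypothetical hook: any callable that takes an accession
    and returns an object with a fetch_metadata() method. It defaults to
    mg_toolkit's OriginalMetadata, imported lazily.
    """
    if make_client is None:
        from mg_toolkit.metadata import OriginalMetadata
        make_client = OriginalMetadata

    results, errors = {}, {}
    for accession in accessions:
        try:
            results[accession] = make_client(accession).fetch_metadata()
        except Exception as exc:  # keep going; report failures to the caller
            errors[accession] = exc
    return results, errors
```

Calling fetch_metadata_for(["ERP001736", "ERP009703"]) returns two dictionaries keyed by accession, one with the fetched values and one with any exceptions raised. Since the internal APIs may change, the lazy import keeps the rest of a script usable if the module layout moves.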

Development setup

Install the package in editable mode, along with the additional dev requirements (pre-commit hooks and version bumper).

pip install -e . -r requirements-dev.txt
pre-commit install

You can bump the version with e.g. bump2version patch.

Contributors

Thanks go to these wonderful people (emoji key):

- Ola Tarkowska 💻📖
- Maxim Scheremetjew 💻📖
- Martin Beracochea 💻
- Emil Hägglund 💻
- Sandy Rogers 💻

This project follows the all-contributors specification. Contributions of any kind welcome!

Contact

If the documentation does not answer your questions, please contact us.

Project details

Release history

- 0.10.4: Mar 7, 2024 (this version)
- 0.10.3: Feb 12, 2024
- 0.10.2: Feb 9, 2024
- 0.10.1: May 20, 2022
- 0.10.0: Mar 24, 2021
- 0.9.1: Nov 25, 2020
- 0.9.0: Nov 3, 2020
- 0.8.0: Nov 3, 2020
- 0.7.0: Sep 24, 2020
- 0.6.5: May 1, 2020
- 0.6.4: Nov 21, 2018
- 0.6.3: Sep 12, 2018
- 0.6.2: Sep 12, 2018
- 0.6.1: Aug 8, 2018
- 0.6.0: Aug 8, 2018
- 0.5.0: Jul 18, 2018
- 0.4.1: Jul 18, 2018
- 0.3.1: Jul 18, 2018
- 0.3.0: Jul 4, 2018
- 0.2.7: Jul 3, 2018
- 0.2.6: Jul 3, 2018
- 0.2.5: Jul 3, 2018
- 0.2.4: Jul 3, 2018
- 0.2.3: Jul 3, 2018
- 0.2.2: Jul 1, 2018
- 0.2.1: Jul 1, 2018
- 0.2.0: Jun 29, 2018
- 0.1.4: Apr 25, 2018
- 0.1.3: Apr 25, 2018
- 0.1.2: Apr 25, 2018

Download files

Download the file for your platform.

Source Distribution

- mg-toolkit-0.10.4.tar.gz (20.4 kB), uploaded Mar 7, 2024, Source

Built Distributions

- mg_toolkit-0.10.4-py3-none-any.whl (22.1 kB), uploaded Mar 7, 2024, Python 3
- mg_toolkit-0.10.4-py2.py3-none-any.whl (22.1 kB), uploaded Mar 7, 2024, Python 2, Python 3

File details

Details for the file mg-toolkit-0.10.4.tar.gz.

File metadata

Download URL: mg-toolkit-0.10.4.tar.gz
Upload date: Mar 7, 2024
Size: 20.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.0.0 CPython/3.11.7

File hashes

Hashes for mg-toolkit-0.10.4.tar.gz
Algorithm	Hash digest
SHA256	`087042ccaac9601fecfdc2e121330b650426ec373ce5353fc32cd5337ef823a9`
MD5	`5209811c00fe78b0571f10b8ab16a449`
BLAKE2b-256	`736626f13e013c5987bb8e0d29cc23e0f94f926911085e4800e84d298d2f58a8`
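
A downloaded archive can be checked against the SHA256 digest listed above with the standard library. This is a minimal sketch: the helper names sha256_of and matches_published_hash are hypothetical, and the default expected value is the digest given here for mg-toolkit-0.10.4.tar.gz.

```python
import hashlib

# SHA256 digest listed above for mg-toolkit-0.10.4.tar.gz.
SDIST_SHA256 = "087042ccaac9601fecfdc2e121330b650426ec373ce5353fc32cd5337ef823a9"


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=SDIST_SHA256):
    """Compare a file's SHA256 digest with the expected hex string."""
    return sha256_of(path) == expected.lower()
```

Reading in chunks keeps memory use flat regardless of file size. The same function works for the wheel files if the corresponding digest is passed as expected.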


File details

Details for the file mg_toolkit-0.10.4-py3-none-any.whl.

File metadata

Download URL: mg_toolkit-0.10.4-py3-none-any.whl
Upload date: Mar 7, 2024
Size: 22.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.0.0 CPython/3.11.7

File hashes

Hashes for mg_toolkit-0.10.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`379e3c5a46e36d9d0ebce8a2c2ae91a528453b078a6e2365b5b2c9f26cfdd6a0`
MD5	`92bf0d4629880bcf3364eaf2493140c0`
BLAKE2b-256	`02eaab1b3f65a5f960ced3254a2278a4f90eb03dce5ba6439287e72c30e024f2`


File details

Details for the file mg_toolkit-0.10.4-py2.py3-none-any.whl.

File metadata

Download URL: mg_toolkit-0.10.4-py2.py3-none-any.whl
Upload date: Mar 7, 2024
Size: 22.1 kB
Tags: Python 2, Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.0.0 CPython/3.11.7

File hashes

Hashes for mg_toolkit-0.10.4-py2.py3-none-any.whl
Algorithm	Hash digest
SHA256	`39abf5a7d59bccd5275717381c4d601e2a627e0e833ad7eb4fd8c4e3e2d4c80d`
MD5	`dc58a970782fd7ce02038786380d3dae`
BLAKE2b-256	`ae722f948f8f27030479d1573fdfe5c60a832a077a44e388d3c8b27629f07e26`
