Collection of scripts and tools for MGnify pipelines

mgnify-pipelines-toolkit

This Python package contains a collection of scripts and tools for use in MGnify pipelines. Scripts stored here are mainly:

  • One-off production scripts that perform specific tasks in pipelines
  • Scripts that have few dependencies
  • Scripts that don't have existing containers built to run them
  • Scripts for which building an entire container would be too heavyweight a solution to deploy in pipelines

This package can be built and uploaded to PyPI, then installed using pip. It bundles the scripts and makes them executable from the command line once it is installed.

Soon: this repository will be made available on Bioconda for even easier integration into Nextflow/nf-core pipelines.

How to install

Currently this package is only available on TestPyPI and is installed like this:

pip install -i https://test.pypi.org/simple/ --no-deps mgnify-pipelines-toolkit

You should then be able to run the scripts from the command line. For example, to run the get_subunits.py script:

get_subunits -i ${easel_coords} -n ${meta.id}
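If the command is not found after installation, one quick check is whether the alias landed on your PATH. The following is a minimal sketch using only the standard library; the helper name is_on_path is an assumption made for illustration and is not part of this package.

```python
import shutil


def is_on_path(command):
    """Return True if `command` resolves to an executable on the current PATH."""
    return shutil.which(command) is not None


# get_subunits is the alias used in the example above.
for alias in ["get_subunits"]:
    status = "found" if is_on_path(alias) else "not found"
    print(f"{alias}: {status}")
```

A "not found" result usually means the package was installed into a different environment from the one that is currently active.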

Building and uploading to PyPI

This command builds the package:

python3 -m build

Then this command will upload it to TestPyPI (you will need to generate an API token):

python3 -m twine upload --repository testpypi dist/mgnify_pipelines_toolkit-0.0.x*

To upload it to PyPI itself:

python3 -m twine upload dist/*

Adding a new script to the package

New script requirements

There are a few requirements for your script:

  • It needs to have a named main function of some kind. See mgnify_pipelines_toolkit/v6/get_subunits.py and the main() function for an example
  • Because this package is meant to be run from the command line, make sure your script accepts its arguments through a tool like argparse or click
  • It should have a small number of dependencies. This requirement is subjective, but for example if your script only requires a handful of basic packages like Biopython, numpy, pandas, etc., then it's fine. However, if the script has a more extensive list of dependencies, a container is probably a better fit.
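As a rough illustration of these requirements, a script might take the shape sketched below. The line-counting task, the -i/--input and -n/--name options and the helper names are all hypothetical, chosen only to show an argparse-based parser feeding a named main() function. This is not how get_subunits.py itself is written.

```python
import argparse


def parse_args(argv=None):
    """Parse command-line arguments (the option names here are hypothetical)."""
    parser = argparse.ArgumentParser(description="Count lines in a text file.")
    parser.add_argument("-i", "--input", required=True, help="Path to the input file")
    parser.add_argument("-n", "--name", default="sample", help="Label used in the output")
    return parser.parse_args(argv)


def count_lines(path):
    """Return the number of lines in the file at `path`."""
    with open(path, encoding="utf-8") as handle:
        return sum(1 for _ in handle)


def main(argv=None):
    """Entry point: parse arguments, do the work, print a tab-separated result."""
    # With argv=None, argparse reads sys.argv[1:], which is what happens
    # when the function is invoked through a command-line alias.
    args = parse_args(argv)
    print(f"{args.name}\t{count_lines(args.input)}")
    return 0
```

Letting main() take an optional argv list keeps the function callable with no arguments from the command line while still making it easy to call from a test with an explicit argument list.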

How to add a new script

To add a new Python script, first copy it over to the mgnify_pipelines_toolkit directory in this repository, specifically to the subdirectory that makes the most sense. If none of the subdirectories makes sense for your script, create a new one. If your script doesn't have a main()-style function yet, write one.

Then, open pyproject.toml, as you will need to add a few things. First, add any required dependencies (including the version) to the dependencies field.
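Since each dependency should include a version, a quick sanity check over the entries can catch ones that were added without it. The sketch below is a deliberately simple check, not a full requirement parser; the function name unpinned and the example entries (including their version numbers) are hypothetical.

```python
import re

# Matches any version comparison operator (===, ==, ~=, !=, <=, >=, <, >).
VERSION_OPERATOR = re.compile(r"(===|==|~=|!=|<=|>=|<|>)")


def unpinned(dependencies):
    """Return the dependency strings that carry no version specifier."""
    missing = []
    for requirement in dependencies:
        # Drop any environment marker (the text after ';') before checking,
        # so that an operator inside the marker is not mistaken for a version.
        spec = requirement.split(";", 1)[0]
        if not VERSION_OPERATOR.search(spec):
            missing.append(requirement)
    return missing


# Hypothetical entries, written as they would appear in the dependencies field.
example = ["biopython==1.82", "numpy>=1.26", "pandas"]
print(unpinned(example))  # prints ['pandas']
```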

Then, scroll down to the [project.scripts] line. Here, you will create an alias command for running your script from the command-line. In the example line:

get_subunits = "mgnify_pipelines_toolkit.v6.get_subunits:main"

  • get_subunits is the alias
  • mgnify_pipelines_toolkit.v6.get_subunits will link the alias to the script with the path mgnify_pipelines_toolkit/v6/get_subunits.py
  • :main will specifically call the function named main() when the alias is run.

When you have set up this command, executing get_subunits on the command line will be the equivalent of doing:

from mgnify_pipelines_toolkit.v6.get_subunits import main; main()
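To make the alias mechanism more concrete, the sketch below shows roughly how a "module:function" string can be turned into a callable. It approximates what an installer's generated wrapper does rather than reproducing it, the helper name resolve_entry_point is an assumption, and a standard-library target stands in for the toolkit so the example runs without it installed.

```python
import importlib


def resolve_entry_point(spec):
    """Turn a 'package.module:function' string into the callable it names."""
    # Split on the colon: the left side is the import path, the right the function.
    module_path, _, function_name = spec.partition(":")
    module = importlib.import_module(module_path)
    return getattr(module, function_name)


# Standard-library stand-in for "mgnify_pipelines_toolkit.v6.get_subunits:main".
dumps = resolve_entry_point("json:dumps")
print(dumps({"alias": "get_subunits"}))  # prints {"alias": "get_subunits"}
```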

Finally, you will need to bump the version in the version line. How and when we bump versions is still to be determined.

At the moment, these should be the only steps required to set up your script in this package, although this is subject to change.

Source Distribution

mgnify_pipelines_toolkit-0.0.5.tar.gz (26.5 kB)

Uploaded Source

Built Distribution

mgnify_pipelines_toolkit-0.0.5-py3-none-any.whl (31.4 kB)

Uploaded Python 3

File details

Details for the file mgnify_pipelines_toolkit-0.0.5.tar.gz.

File hashes

Hashes for mgnify_pipelines_toolkit-0.0.5.tar.gz
Algorithm Hash digest
SHA256 8c4593be90d8f4bf07d8b8998ba3fd9c0bfb159edda2bcf3fb741a81c6554c3f
MD5 29d9bfd960821b8ffa17e209c2f5d783
BLAKE2b-256 fcca0178078172621d0d6d09c5201975fd643bc1d3101a7deccbc1fc2e1c9db5

File details

Details for the file mgnify_pipelines_toolkit-0.0.5-py3-none-any.whl.

File hashes

Hashes for mgnify_pipelines_toolkit-0.0.5-py3-none-any.whl
Algorithm Hash digest
SHA256 d6274313f16ba07c2b98c41947ba88d67df9023bfd90ba984bf667814ddfea8c
MD5 a67206804ffdcfe16138d4bf93b43b4b
BLAKE2b-256 57408ebc6853592cfa6e3eb614e2c6ce2d86e617a13f5ade416975fabb528fe8
