Management utility for YAML database of Galaxy tools

Galaxy Tool Database

Small package for managing a YAML database of Galaxy tool runtime environment metadata.

This Python project can be installed from PyPI using pip.

$ python3 -m venv .venv
$ . .venv/bin/activate
$ pip install gx-tool-db

This will install the executable gx-tool-db.

$ gx-tool-db --help

This library and associated scripts are licensed under the MIT License.

Development

To run gx-tool-db from a clone of the repository with local changes applied, install it in editable mode using pip install -e.

$ python3 -m venv .venv
$ . .venv/bin/activate
$ pip install -e .

Example Project

This project automates importing various metadata about tools and exporting various artifacts, including reports. To see an example project set up around a database, and how to use gx-tool-db to import data into such a database and export reports, check out https://github.com/jmchilton/gx-tool-db-project.

In particular, check out bootstrap_db.sh and export.sh for example calls with real data. Storing the database as a structured YAML file allows the resulting data to be stored naturally in a GitHub project. See the latest generated database for that project in tools_metadata.yml.

Getting Started

Start by bootstrapping data from a few servers:

$ mkdir my_tool_db
$ cd my_tool_db
$ gx-tool-db import-server --server org
$ gx-tool-db import-server --server eu
$ gx-tool-db import-server --server test

Next, we can use the bootstrapped data to dump information about the latest version of all tools across all servers or at individual servers. This data can be exported as standard CSV files or as the tabular ("tsv") data more typical of Galaxy.

$ gx-tool-db export-tabular --all-coverage --output coverage_public_servers.tsv
$ gx-tool-db export-tabular --coverage org --coverage test --output coverage_public_servers.csv

Next, let's start applying tool labels. Let's read a list of deprecated tool IDs from a file or URL using the import-label command.

$ gx-tool-db import-label https://gist.githubusercontent.com/jmchilton/651dad1289cb897cfaa92a86a39a184e/raw/65da6b11353732b550f9b1e0f9dc218a6bcef916/gistfile1.txt deprecated

One can also apply a label to all tool IDs from a workflow or a directory of workflows using the label-workflow-tools command.

$ git clone https://github.com/galaxyproject/iwc.git
$ gx-tool-db label-workflow-tools iwc/ iwc_required
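
To make the idea behind label-workflow-tools concrete, here is a minimal sketch of how tool IDs might be collected from workflows. It assumes native Galaxy workflow files (.ga), which are JSON documents with a "steps" mapping whose tool steps carry a "tool_id". The function names workflow_tool_ids and directory_tool_ids are hypothetical and are not part of gx-tool-db; the real command may handle other workflow formats and edge cases differently.

```python
import json
from pathlib import Path


def workflow_tool_ids(workflow: dict) -> set:
    """Collect tool IDs from a Galaxy workflow dict, including subworkflows."""
    tool_ids = set()
    for step in workflow.get("steps", {}).values():
        tool_id = step.get("tool_id")
        if tool_id:  # input steps carry a null tool_id
            tool_ids.add(tool_id)
        subworkflow = step.get("subworkflow")
        if subworkflow:  # embedded workflows are walked recursively
            tool_ids |= workflow_tool_ids(subworkflow)
    return tool_ids


def directory_tool_ids(root: str) -> set:
    """Union of tool IDs over every *.ga file found under root."""
    tool_ids = set()
    for path in Path(root).rglob("*.ga"):
        with open(path) as handle:
            tool_ids |= workflow_tool_ids(json.load(handle))
    return tool_ids


# Tiny in-memory workflow standing in for a real .ga file (made-up data).
example = {
    "steps": {
        "0": {"type": "data_input", "tool_id": None},
        "1": {"type": "tool", "tool_id": "cat1"},
        "2": {
            "type": "subworkflow",
            "tool_id": None,
            "subworkflow": {"steps": {"0": {"type": "tool", "tool_id": "sort1"}}},
        },
    }
}
print(sorted(workflow_tool_ids(example)))
```

Every ID collected this way would then receive the given label (iwc_required in the command above).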

The deprecated and iwc_required labels can now be used to build toolbox-related artifacts. The following commands will create two Ephemeris/ansible-galaxy-tools install YAML files from main's (usegalaxy.org) toolset. The first will include only tools required by IWC workflows, and the second will contain main's whole toolbox excluding deprecated tools.

$ gx-tool-db export-install-yaml main --require-label iwc_required
$ gx-tool-db export-install-yaml main --exclude-label deprecated
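
The difference between --require-label and --exclude-label can be illustrated with a small filter. This is only a sketch under assumptions: the in-memory layout (a dict of tool IDs mapping to a "labels" list) and the function name select_tools are hypothetical and do not reflect the actual schema of the gx-tool-db YAML file.

```python
def select_tools(tools, require_label=None, exclude_label=None):
    """Return the sorted tool IDs that pass the label filters."""
    selected = []
    for tool_id, meta in tools.items():
        labels = set(meta.get("labels", []))
        # Requiring a label keeps only tools that carry it.
        if require_label is not None and require_label not in labels:
            continue
        # Excluding a label keeps everything except tools that carry it.
        if exclude_label is not None and exclude_label in labels:
            continue
        selected.append(tool_id)
    return sorted(selected)


# Made-up data in an assumed layout.
tools = {
    "cat1": {"labels": ["iwc_required"]},
    "sort1": {"labels": []},
    "old_tool": {"labels": ["deprecated"]},
}

print(select_tools(tools, require_label="iwc_required"))
print(select_tools(tools, exclude_label="deprecated"))
```

Note that an unlabelled tool such as sort1 is dropped by the require filter but kept by the exclude filter.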

Tool panel views (https://docs.google.com/presentation/d/1qKhWhJYe3LmDd0sKaY247s4DxjjZdi807YV_4TqYfGA) can also be constructed from these tool labels.

The following command will produce a file (best_practices.yml) that will be a frozen version of usegalaxy.org tool panel containing only tools with the label iwc_required.

$ gx-tool-db export-panel-view best_practices main --require-label iwc_required

Since Galaxy doesn't know about these external labels, the panel is frozen and the above command needs to be re-run as new tools are labelled. Alternatively, when using --exclude-label, tools newly added to main's sections are assumed to be non-deprecated and will appear in the tool panel.

$ gx-tool-db export-panel-view best_practices main --exclude-label deprecated

This application provides some utilities for automatically applying these tool labels but manual curation is still important when grouping tools. This can be done in the YAML directly or using spreadsheet software.

Use --label with the export-tabular command shown above to include columns for specified labels (these labels don't even need to exist ahead of time). The same spreadsheet can then be re-imported using the import-tabular command with the same labels to read the data back into the structured gx-tool-db database file.

$ gx-tool-db export-tabular --all-coverage --label really_cool --label meh --output to_curate.tsv
$ gx-tool-db import-tabular to_curate.tsv --label really_cool --label meh
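
As a rough illustration of what reading curated label columns back from a spreadsheet involves, the sketch below parses tab-separated text with Python's csv module. The column name "tool_id", the convention that any non-empty cell marks a label as applied, and the function name read_labels are all assumptions made for this example; the actual columns and cell values written by export-tabular may differ.

```python
import csv
import io


def read_labels(tsv_text, label_columns, id_column="tool_id"):
    """Map each tool ID to the label columns that have a non-empty cell."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    labels = {}
    for row in reader:
        marked = [name for name in label_columns if (row.get(name) or "").strip()]
        labels[row[id_column]] = marked
    return labels


# Made-up sheet content: a header row, then one row per tool.
sheet = (
    "tool_id\treally_cool\tmeh\n"
    "cat1\tx\t\n"
    "sort1\t\tx\n"
    "old_tool\t\t\n"
)

print(read_labels(sheet, ["really_cool", "meh"]))
```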

For these spreadsheet commands, the target spreadsheet can also be a Google Sheets ID for collaborative editing.

$ gx-tool-db export-tabular --all-coverage --label really_cool --label meh --output 'sheet:1N84CziEyW0Z109slrL33cuFt3Wpuu037zogkBMhk-C0'
$ gx-tool-db import-tabular 'sheet:1N84CziEyW0Z109slrL33cuFt3Wpuu037zogkBMhk-C0' --label really_cool --label meh

Finally, to assist in manual curation of the database, tool runtime results can be stored in the database as well.

$ gx-tool-db import-tests https://raw.githubusercontent.com/almahmoud/anvil-misc/master/reports/anvil-production/tool-tests/gxy-auto-06-27-16-32-39-1/results.json anvil

Test data summaries can then be included in the export-tabular output to help curate tool labels, either for all test data labels or only for specified ones.

$ gx-tool-db export-tabular --all-tests --label really_cool --label meh --output to_curate_all_the_tests.tsv
$ gx-tool-db export-tabular --tests anvil --label really_cool --label meh --output to_curate_only_anvil_tests.tsv

Metadata about how tools are used within Galaxy Training Network tutorials can be loaded as well.

$ git clone https://github.com/galaxyproject/training-material.git
$ gx-tool-db import-trainings training-material

Columns for the topics and tutorials referencing tools can then be included with export-tabular using the --training-topics and --training-tutorials flags respectively.

History

0.4.0 (2022-02-16)

  • Fix up more arguments to use hyphens instead of underscores.

  • Add new command to add labels for all tools from a server (for bootstrapping flavors).

  • Readme fixes and improvements.

0.3.0 (2021-09-15)

  • Import and allow export of more data from the initial server requests.

0.2.0 (2021-09-15)

  • Various enhancements: version database, recursive import of metadata, add training metadata to model, etc.

0.1.0 (2021-09-13)

  • Initial version.
