Ingestion of Galaxy tool wrappers (.xml) and workflows (.ga) into the Janis language.
Galaxy to Janis Translation
Galaxy2janis is a productivity tool which translates Galaxy tool wrappers and workflows into the Janis language.
It accepts either a Galaxy wrapper (.xml) or workflow (.ga) and will produce a Janis definition (.py).
This software is part of the Portable Pipelines Project, which produces technologies to make workflow execution and sharing easier.
Galaxy2janis is currently available in pre-release form.
Contributing: Please get in touch by raising an issue so we can communicate via email or zoom.
Bugs: Please submit any bugs by raising an issue to help improve the software!
This program may fail when parsing legacy Galaxy tools, or those written in unforeseen ways.
Contents
- Quickstart Guide
- Description
- Inputs
- Outputs
- Producing CWL WDL Nextflow
- Making Runnable
- Supported Features
Quickstart Guide
Galaxy2janis is available as a PyPI package. It requires Python ≥ 3.10.
# create & activate environment
python3.10 -m venv venv
source venv/bin/activate
# install package
pip install galaxy2janis
# translate galaxy tool
galaxy2janis tool [PATH]
galaxy2janis tool sample_data/abricate/abricate.xml
# translate galaxy workflow
galaxy2janis workflow [PATH]
galaxy2janis workflow sample_data/assembly.ga
The sample_data folder contains the files above and can be used to test your installation.
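Before running the sample translations, it can help to confirm the environment matches the requirements above. The following is a minimal sketch using only the Python standard library; the helper names `python_is_supported` and `cli_on_path` are hypothetical and are not part of galaxy2janis.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)  # minimum version stated in this guide


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON) -> bool:
    """Return True when the interpreter meets the minimum version."""
    return tuple(version_info[:2]) >= minimum


def cli_on_path(name: str = "galaxy2janis") -> bool:
    """Return True when an executable called `name` is found on PATH."""
    return shutil.which(name) is not None


if __name__ == "__main__":
    print("python ok:", python_is_supported())
    print("galaxy2janis on PATH:", cli_on_path())
```

If the second check reports False, the virtual environment is probably not activated in the current shell.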
Description
What does this program do?
This program was created to aid workflow migration. It was designed to be a productivity tool by helping the user port a workflow from one specification to another.
Given a galaxy tool wrapper or galaxy workflow, it will extract as much information as possible, and will create a similar definition in Janis. Once in Janis, janis translate can be used to output an equivalent CWL/WDL/Nextflow definition.
For tool translations, the main software requirement will be identified and translated to Janis. A container will also be identified which can run the output Janis tool.
For workflow translations, the workflow itself will be translated to a Janis definition, alongside each tool used in the workflow.
What does this program not do?
Galaxy2janis is a productivity tool. It does not provide runnable translations.
It aims to produce a human readable output, and to match the structure of the ingested workflow. Users are expected to make some manual edits to finalise the workflow. See the Making Runnable Section for details.
Inputs
Tool translation
usage: galaxy2janis tool [OPTIONS] infile.xml
positional arguments:
  infile                path to tool.xml file to parse.
options:
  -h, --help            show this help message and exit
  -o OUTDIR, --outdir OUTDIR
                        output folder to place translation
A local copy of the galaxy tool wrapper is needed. To download a tool wrapper:
- Select the tool in galaxy
- View its toolshed entry (top-left caret dropdown, 'See in Tool Shed')
- Download the wrapper as a zip file (repository actions -> 'Download as a zip file')
The unzipped file is the wrapper for that galaxy tool.
Once the wrapper has been obtained, the path to the specific tool to translate must be specified. For example, if you downloaded the abricate tool you may see something similar to this structure:
abricate/
├── abricate.xml
├── macros.xml
└── test-data
    ├── Acetobacter.fna
    ├── MRSA0252.fna
    └── output_db-card.txt
To translate abricate.xml:
galaxy2janis tool abricate/abricate.xml
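An unzipped wrapper folder can hold several XML files, and only some of them are tools (macros.xml, for instance, is a helper file). The sketch below is one way to list the candidates, assuming that a tool wrapper has `<tool>` as its root element while macro files use `<macros>`. The function name `find_tool_wrappers` is hypothetical and not part of galaxy2janis.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def find_tool_wrappers(wrapper_dir: str) -> list[Path]:
    """Return .xml files under wrapper_dir whose root element is <tool>."""
    tools = []
    for path in sorted(Path(wrapper_dir).rglob("*.xml")):
        try:
            root = ET.parse(path).getroot()
        except ET.ParseError:
            continue  # not well-formed XML, so skip it
        if root.tag == "tool":
            tools.append(path)
    return tools


if __name__ == "__main__":
    for tool_path in find_tool_wrappers("abricate"):
        print(tool_path)
```

Each path printed is a candidate to pass to galaxy2janis tool.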
Workflow Translation
usage: galaxy2janis workflow [OPTIONS] infile.ga
positional arguments:
  infile                path to workflow.ga file to parse.
options:
  -h, --help            show this help message and exit
  -o OUTDIR, --outdir OUTDIR
                        output folder to place translation
A local copy of the galaxy workflow is needed. There are two methods to download a workflow:
- Download from workflow editor
- Download from Galaxy Training Network (GTN)
These will download a galaxy workflow file in .ga format.
To translate the workflow:
galaxy2janis workflow downloaded_workflow.ga
Each tool used in the workflow will be downloaded and translated automatically during the process.
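To preview which tools a workflow will pull in before translating it, the .ga file can be inspected directly, since it is JSON. The sketch below assumes the usual .ga layout: a top-level "steps" object keyed by step number, where each tool step has "type" set to "tool" and a "tool_id". The function name `list_workflow_tools` is hypothetical and not part of galaxy2janis.

```python
import json


def list_workflow_tools(ga_path: str) -> list[str]:
    """Return the tool_id of every tool step in a Galaxy .ga workflow file."""
    with open(ga_path, encoding="utf-8") as handle:
        workflow = json.load(handle)
    steps = workflow.get("steps", {})
    tool_ids = []
    # Step keys are numeric strings ("0", "1", ...), so sort them as integers.
    for key in sorted(steps, key=int):
        step = steps[key]
        if step.get("type") == "tool" and step.get("tool_id"):
            tool_ids.append(step["tool_id"])
    return tool_ids


if __name__ == "__main__":
    for tool_id in list_workflow_tools("downloaded_workflow.ga"):
        print(tool_id)
```

Input steps and subworkflow steps are skipped because they carry no tool wrapper of their own.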
Outputs
Tool translations produce a single Janis tool definition for the input Galaxy wrapper.
Workflow translations produce an output folder containing multiple files. Workflows need multiple entities such as tool definitions, the main workflow file, scripts, and a place to provide input values to run the workflow. The current output structure is as follows:
[translated_workflow]/
├── inputs.yaml # input values
├── logs
├── subworkflows
├── tools # tool definitions
│ ├── scripts # tool scripts
│ ├── untranslated # untranslated tool logic (galaxy)
│ └── wrappers # translated tool wrappers (galaxy)
└── workflow.py # main workflow file
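As a quick sanity check after a workflow translation, the output folder can be compared against the structure listed above. This is a minimal sketch: the entry list is copied from that listing, which is described as the current layout and may change, and the function name `missing_entries` is hypothetical.

```python
from pathlib import Path

# Entries taken from the output structure listed above.
EXPECTED = [
    "inputs.yaml",
    "logs",
    "subworkflows",
    "tools",
    "tools/scripts",
    "tools/untranslated",
    "tools/wrappers",
    "workflow.py",
]


def missing_entries(outdir: str) -> list[str]:
    """Return the expected files and folders that are absent from outdir."""
    root = Path(outdir)
    return [entry for entry in EXPECTED if not (root / entry).exists()]


if __name__ == "__main__":
    # "translated_workflow" stands in for the actual output folder name.
    print(missing_entries("translated_workflow"))
```

An empty list means every listed entry is present; it says nothing about whether the translation itself is correct.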
Producing CWL WDL Nextflow
Janis Translate
Galaxy -> Janis -> CWL/WDL/Nextflow
This program ingests Galaxy definitions to produce Janis definitions.
Janis' inbuilt translate functionality can subsequently output to the languages seen above.
For example, translating the abricate tool from Galaxy to CWL:
# galaxy -> janis
galaxy2janis tool abricate/abricate.xml  # produces abricate.py
# janis -> CWL
janis translate abricate.py > abricate.cwl
Making Runnable
It is the responsibility of the user to make final edits & bring the workflow to a runnable state.
This tool is designed to increase productivity when migrating workflows; as such, the outputs it produces favour readability over completeness.
To aid users in this process, some hints are supplied and source files are retained.
Hints
Some basic hints are provided to help users check the output. This information helps the user confirm everything looks correct, and make edits when it isn't quite right.
quast step in translated workflow:
A galaxy workflow was translated to Janis using galaxy2janis. A step within the workflow uses quast, which we see reflected in our output workflow.py file. The actual quast tool being used in the step above is a Janis definition and will appear in the tools/ output directory.
#UNKNOWN1=w.unicycler.outAssembly, # (CONNECTION)
When a tool input value appears commented out, the program was unable to link it to a software input. This may happen because the galaxy wrapper modifies the galaxy input before it is wired to an actual software input. In these cases, the user would need to either identify which tool input it maps to, or if none exists, open the translated quast tool and create one.
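In a large workflow it can be easier to collect these commented-out inputs in one pass. The sketch below scans the text of a workflow file for lines shaped like the example above. It assumes every unlinked input follows the `#UNKNOWN<n>=<value>,` pattern, which is inferred from that single example; the function name `find_unlinked_inputs` is hypothetical.

```python
import re

# Matches lines such as: #UNKNOWN1=w.unicycler.outAssembly, # (CONNECTION)
UNLINKED = re.compile(r"^\s*#\s*(UNKNOWN\d+)\s*=\s*([^,#]+)")


def find_unlinked_inputs(source: str) -> list[tuple[int, str, str]]:
    """Return (line number, placeholder name, value) for each commented-out input."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = UNLINKED.match(line)
        if match:
            hits.append((lineno, match.group(1), match.group(2).strip()))
    return hits


if __name__ == "__main__":
    with open("workflow.py", encoding="utf-8") as handle:
        for lineno, name, value in find_unlinked_inputs(handle.read()):
            print(f"line {lineno}: {name} = {value}")
```

Each hit marks a value that still needs to be mapped to a tool input by hand.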
Source Files
These can be viewed to gain more context on what a source galaxy tool was intended to do. The main software command will have been translated into a Janis tool, but some details may have been left out in the process.
tools/untranslated
Contains untranslated galaxy tool logic. Galaxy tool wrappers may perform multiple tasks when executed. The main software tool being wrapped will execute, but some preprocessing or postprocessing steps may also be performed. A common structure is as follows:
- Preprocessing (symlinks / making directories / creating a genome index)
- Main software requirement (actual tool execution)
- Postprocessing (index or sort output / create additional output files / summaries)
Galaxy2janis translates the main software requirement into a Janis definition. Preprocessing and postprocessing logic are placed into the tools/untranslated folder as a reference, so the user can see what has been ignored.
tools/wrappers
Contains the Galaxy tools which were translated. They are the 'source files' used to create Janis tool definitions while the workflow was being parsed. They can be used as a reference when a tool translation is of poor quality.
Galaxy wrappers have a distinct style, so see the galaxy tool xml documentation for details.
Supported Features
This project is in active development. Many features are planned, and will be released over time.
Unsupported
Features
- <command> #def #set #if #for cheetah logic
- <command> Rscript -e (inline Rscripts)
- <param> type="color" params
- xml features seen in 1% of tools
Wrappers
- emboss suite of tool wrappers known to fail due to legacy features.