Easier way to run workflows, configurable across environments

Welcome to Janis-Assistant

Janis is a workflow assistant designed to make the process of building and running workflows easier.

More specifically:

Janis core is a framework for specifying workflows that can be transpiled to CWL and WDL.
Janis assistant manages an engine to run these workflows and collects the results.

Quick start

pip3 install janis-pipelines

CWLTool

You can run a workflow in CWLTool with the following command line:

janis run --engine cwltool hello

To use CWLTool, you must have CWLTool in your path with either Docker or Node. See Engine support for more information.

Cromwell

Cromwell is the default engine, and can be run with:

janis run --engine cromwell hello

To use Cromwell, you must have Java 1.8 available. See Engine support for more information.
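
Before launching either engine, it can help to confirm that the prerequisites named above are actually on your PATH. The following is a minimal sketch, not part of Janis: `missing_prerequisites` is a hypothetical helper, the executable names `docker`, `node` and `java` are assumptions about how those tools are installed, and it does not check the Java version.

```python
import shutil


def missing_prerequisites(engine, which=shutil.which):
    """Return the prerequisites for `engine` that were not found on PATH.

    Hypothetical helper for illustration; `which` is injectable for testing.
    """
    missing = []
    if engine == "cwltool":
        if which("cwltool") is None:
            missing.append("cwltool")
        # CWLTool needs either Docker or Node; one of the two is enough.
        if which("docker") is None and which("node") is None:
            missing.append("docker or node")
    elif engine == "cromwell":
        # Only checks that a java executable exists, not that it is Java 1.8.
        if which("java") is None:
            missing.append("java")
    else:
        raise ValueError(f"unknown engine: {engine!r}")
    return missing


if __name__ == "__main__":
    for engine in ("cwltool", "cromwell"):
        print(engine, "missing:", missing_prerequisites(engine) or "nothing")
```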

CLI options:

run - Run a janis workflow (see the run parameters below)
watch - Watch an existing execution (folder or workflow ID)
abort - Issue an abort request to an existing execution
inputs - Generate an inputs file for a workflow
translate - Translate a workflow into CWL / WDL
metadata - Get the available metadata on an execution
version - Print the version of janis submodules.
spider - Print documentation for a tool (helps to trace problems with the Janis toolbox)

`run`

You can run a workflow with the run command. For example, to run the hello world workflow:

janis run hello

View the help guide

# $ janis run -h

positional arguments:
  workflow              Run the workflow defined in this file or available
                        within the toolbox
  extra_inputs

optional arguments:
  -h, --help            show this help message and exit
  -i INPUTS, --inputs INPUTS
                        YAML or JSON inputs file to provide values for the
                        workflow (can specify multiple times)
  -o OUTPUT_DIR, --output-dir OUTPUT_DIR
                        This directory to copy outputs to. By default
                        intermediate results are within a janis/execution
                        subfolder (unless overriden by a template)
  -B, --background      Run the workflow engine in the background (or submit
                        to a cluster if your template supports it)
  --progress            Show the progress screen if running in the background
  --keep-intermediate-files
                        Do not remove execution directory on successful
                        complete
  --skip-file-check     Skip checking if files exist before the start of a
                        workflow.
  --allow-empty-container
                        Some tools you use may not include a container, this
                        would usually (and intentionally) cause an error.
                        Including this flag will disable this check, and empty
                        containers can be used.
  --development         Apply common settings (--keep-execution-dir + --mysql)
                        to support incremental development of a pipeline

input manipulation:
  -r RECIPE, --recipe RECIPE
                        Use a provided recipe from a provided template
  --max-cores MAX_CORES
                        maximum number of cores to use when generating
                        resource overrides
  --max-memory MAX_MEMORY
                        maximum GB of memory to use when generating resource
                        overrides

hints:
  --hint-captureType {targeted,exome,chromosome,30x,90x,300x}
  --hint-engine {cromwell}

workflow collection arguments:
  --toolbox            Skip looking through the search path, and only look in
                        the toolbox
  -n NAME, --name NAME  If you have multiple workflows in your file, you may
                        want to help Janis out to select the right workflow to
                        run
  --no-cache            Force re-download of workflow if remote

engine arguments:
  --engine {cwltool,cromwell}
                        Choose an engine to start
  --cromwell-url CROMWELL_URL
                        Location to Cromwell

filescheme arguments:
  -f {local,ssh}, --filescheme {local,ssh}
                        Choose the filescheme required to retrieve the output
                        files where your engine is located. By selecting SSH,
                        Janis will SCP the files using the --filescheme-ssh-
                        binding SSH shortcut.
  --filescheme-ssh-binding FILESCHEME_SSH_BINDING
                        Only valid if you've selected the ssh filescheme. (eg:
                        scp cluster:/path/to/output local/output/dir)

validation arguments:
  --validation-reference VALIDATION_REFERENCE
                        reference file for validation
  --validation-truth-vcf VALIDATION_TRUTH_VCF
                        truthVCF for validation
  --validation-intervals VALIDATION_INTERVALS
                        intervals to validate between
  --validation-fields VALIDATION_FIELDS [VALIDATION_FIELDS ...]
                        outputs from the workflow to validate

beta features:
  --mysql               BETA: Run MySQL for persistence with Cromwell
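
If you launch Janis from a script, a small wrapper can assemble the command line from the options listed in the help text. This is a sketch under assumptions: `build_run_command` is a hypothetical helper (not a Janis API), and it covers only `--engine`, `-i`, `-o` and `-B`.

```python
import shlex


def build_run_command(workflow, engine=None, inputs=(), output_dir=None, background=False):
    """Assemble an argv list for `janis run` (hypothetical helper)."""
    if engine is not None and engine not in ("cwltool", "cromwell"):
        raise ValueError(f"unsupported engine: {engine!r}")
    argv = ["janis", "run"]
    if engine is not None:
        argv += ["--engine", engine]
    for path in inputs:  # -i may be specified multiple times
        argv += ["-i", path]
    if output_dir is not None:
        argv += ["-o", output_dir]
    if background:
        argv.append("-B")
    argv.append(workflow)  # the positional workflow argument goes last
    return argv


if __name__ == "__main__":
    cmd = build_run_command("hello", engine="cwltool", output_dir="out")
    print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` to start the workflow.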

Configuration

It's possible to configure a number of attributes of janis_assistant. You can provide a YAML configuration file in two ways, and Janis additionally checks a default location:

CLI: --config /path/to/config.yml
Environment variable: JANIS_CONFIGPATH=/path/to/config.yml
Default: $HOME/.janis/janis.conf (Janis will additionally look for a config here)

Configurations aren't currently cascaded, but the intention is that they will be.
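
Because configurations are not cascaded, only one file is used. The sketch below shows one way such a lookup could work. The precedence (CLI first, then `JANIS_CONFIGPATH`, then the default file) is an assumption, since the text does not state the order, and `resolve_config_path` is a hypothetical name rather than a Janis function.

```python
import os
from pathlib import Path


def resolve_config_path(cli_config=None, environ=None, home=None):
    """Pick a single config file; the precedence used here is an assumption."""
    environ = os.environ if environ is None else environ
    home = Path.home() if home is None else Path(home)

    if cli_config:  # value passed via --config
        return Path(cli_config)

    env_path = environ.get("JANIS_CONFIGPATH")
    if env_path:
        return Path(env_path)

    # Fall back to the default location only if the file is really there.
    default = home / ".janis" / "janis.conf"
    return default if default.exists() else None


if __name__ == "__main__":
    print(resolve_config_path())
```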

Options

Defaults: janis_assistant/management/configuration.py

Config / DB directory: configDir: /path/to/configdir/
- Takes second priority to the environment variable: JANIS_CONFIGDIR
- Default: $HOME/.janis/
- Database: {configDir}/janis.db - Janis global database
Execution directory: executionDir
- Takes second priority to the environment variable: JANIS_EXCECUTIONDIR
- Default: $HOME/janis/execution/
Search paths: searchPaths
- Will additionally add from environment variable: JANIS_SEARCHPATH
- Default: $HOME/janis/
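
The priority rules above can be sketched as follows. This is an illustration only: `resolve_directories` is a hypothetical function, `config` stands for the parsed YAML file, and treating `JANIS_SEARCHPATH` as a single path that is appended to the configured (or default) search paths is an assumption.

```python
import os
from pathlib import Path


def resolve_directories(config=None, environ=None, home=None):
    """Apply 'environment variable first, then config, then default'."""
    config = config or {}
    environ = os.environ if environ is None else environ
    home = Path.home() if home is None else Path(home)

    config_dir = Path(
        environ.get("JANIS_CONFIGDIR") or config.get("configDir") or home / ".janis"
    )
    execution_dir = Path(
        environ.get("JANIS_EXCECUTIONDIR")  # spelling as given in the options list
        or config.get("executionDir")
        or home / "janis" / "execution"
    )

    # Search paths are additive: config (or default) plus the env variable.
    search_paths = [Path(p) for p in config.get("searchPaths") or [home / "janis"]]
    env_search = environ.get("JANIS_SEARCHPATH")
    if env_search:
        search_paths.append(Path(env_search))

    return {
        "configDir": config_dir,
        "database": config_dir / "janis.db",
        "executionDir": execution_dir,
        "searchPaths": search_paths,
    }


if __name__ == "__main__":
    for key, value in resolve_directories().items():
        print(key, "=", value)
```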

Engines

There are currently 2 engines that janis_assistant supports:

CWLTool
Cromwell

CWLTool (default)

Due to the way CWLTool provides metadata, support for CWLTool is very basic and limited to submitting workflows and linking the outputs. Janis can manage CWLTool in the background; however, if CWLTool is terminated (for example, through a transient cluster error), Janis is unable to restart it.

Cromwell

Cromwell can be run in two modes:

Connect to an existing instance (well supported): include the --cromwell-url argument with the port to allow the Janis assistant to correctly connect to this instance.
Run and manage its own instance: when the task is started, the process_id of the started Cromwell instance is stored in the taskdb, and when the task finishes execution, Janis stops this Cromwell instance. Janis can manage a MySQL (in fact MariaDB) instance with the --mysql flag for durability and to reduce memory overhead.

Both of these options provide reporting and progress tracking due to Cromwell's extensive metadata endpoint. The TaskID (6 hex characters) is included as a label on the workflow.

janis watch $tid
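
Since `janis watch` accepts either a folder or an ID, a script may want to tell the two apart. The sketch below checks for the "6 hex characters" shape described above. How Janis itself generates or validates IDs is not stated here, so `new_task_id`, `is_task_id` and the lowercase-only pattern are assumptions for illustration.

```python
import re
import secrets

# Six hexadecimal characters; lowercase only is an assumption.
TASK_ID_PATTERN = re.compile(r"[0-9a-f]{6}")


def new_task_id():
    """Generate an ID of the documented shape (3 random bytes -> 6 hex chars)."""
    return secrets.token_hex(3)


def is_task_id(value):
    """Return True if `value` looks like a task ID rather than a folder path."""
    return TASK_ID_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    tid = new_task_id()
    print(tid, is_task_id(tid))
```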

A screenshot of running the example whole genome germline pipeline (for a targeted sample) can be found below. All engines can support this through a generalised metadata semantic (TaskMetadata), but neither CWLTool nor Toil supports much polling of metadata.

Extra Cromwell comments:

The TaskID is bound as a label on GCP instances (as wid, allowing you to query this information).
Janis uses the development spec of WDL, requiring Cromwell-42 or higher.
If asking Janis to start its own Cromwell instance, it requires the jar to be exported as $cromwelljar.
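
A pre-flight check for the last two points might look like the sketch below. It reads the `cromwelljar` environment variable mentioned above. Inferring the version from a file name of the form `cromwell-<N>.jar` is an assumption about how the jar was saved, and all function names are hypothetical.

```python
import os
import re
from pathlib import Path

MINIMUM_CROMWELL_VERSION = 42  # "Cromwell-42 or higher"


def locate_cromwell_jar(environ=None):
    """Return the jar path exported as $cromwelljar, or raise if it is unset."""
    environ = os.environ if environ is None else environ
    value = environ.get("cromwelljar")
    if not value:
        raise RuntimeError("the cromwelljar environment variable is not set")
    return Path(value)


def version_from_jar_name(jar):
    """Guess the version from a name like cromwell-50.jar (assumed convention)."""
    match = re.fullmatch(r"cromwell-(\d+)\.jar", Path(jar).name)
    return int(match.group(1)) if match else None


def is_supported(jar):
    """True/False when the version can be guessed, None when it cannot."""
    version = version_from_jar_name(jar)
    return None if version is None else version >= MINIMUM_CROMWELL_VERSION


if __name__ == "__main__":
    try:
        jar = locate_cromwell_jar()
        print(jar, "supported:", is_supported(jar))
    except RuntimeError as error:
        print(error)
```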

Databases

This feature requires better documentation in the primary Janis documentation.

Some features of Cromwell require a database to use: call-caching, resumability for cluster failures and so on.

Previously, this was managed by automatically spinning up a MySQL instance with Docker / Singularity; however, this has been unstable. As Cromwell now supports a file-based database, this is the default.

No options -> file-based DB
--no-database -> No database is run
--mysql -> Automatically provision and manage a mysql server (unchanged)
Configure an existing

Call caching is enabled by default using the file-based method. We strongly recommend downloading Cromwell >50 and using fingerprint; see the call caching documentation for more information.

WARNING: fingerprint will become the default once Cromwell 50 has been released. This might break if you're using older versions of Cromwell.
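
The flag-to-database mapping listed above can be summarised in a few lines. This is a sketch, not Janis code: `DatabaseMode` and `database_mode` are hypothetical names, treating `--no-database` and `--mysql` as mutually exclusive is an assumption, and the incomplete "Configure an existing" option is left out.

```python
from enum import Enum


class DatabaseMode(Enum):
    FILE_BASED = "file-based"  # default when no option is given
    NONE = "none"              # --no-database
    MYSQL = "mysql"            # --mysql (managed MySQL / MariaDB server)


def database_mode(no_database=False, mysql=False):
    """Map the two command-line flags onto a database mode."""
    if no_database and mysql:
        # Assumption: the two flags cannot be combined.
        raise ValueError("--no-database and --mysql cannot be used together")
    if no_database:
        return DatabaseMode.NONE
    if mysql:
        return DatabaseMode.MYSQL
    return DatabaseMode.FILE_BASED


if __name__ == "__main__":
    print(database_mode())
    print(database_mode(mysql=True))
```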

Filesystem

There is a weak concept of a filesystem for where your workflow is executed. This tool is really only developed for using the LocalFileSystem.

Supported filesystems:

LocalFileScheme
SSHFileScheme (identifier, connectionstring) - I'd recommend creating an SSH shortcut to avoid persisting personal details in the database. Janis uses the connection string like so: scp connectionstring:/path/to/output /local/persist/path
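
To make the use of the connection string concrete, the sketch below builds the same `scp` invocation as an argument list. `build_scp_command` is a hypothetical helper, and `cluster` in the example stands for an SSH shortcut you have defined yourself.

```python
import shlex


def build_scp_command(connection_string, remote_path, local_path):
    """Build `scp connectionstring:/remote/path /local/path` as an argv list."""
    if not connection_string:
        raise ValueError("a connection string (for example an SSH shortcut) is required")
    return ["scp", f"{connection_string}:{remote_path}", local_path]


if __name__ == "__main__":
    cmd = build_scp_command("cluster", "/path/to/output", "/local/persist/path")
    print(shlex.join(cmd))
```

Passing the arguments as a list (for example to `subprocess.run`) avoids shell quoting problems when paths contain spaces.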

Databases

Janis stores a global SQLite database at {configDir}/janis.db of environments and task pointers (default: ~/.janis/janis.db). When a task is started, a database and workflow files are copied to your specified output directory.
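
Because the global database is an ordinary SQLite file, it can be inspected with Python's standard `sqlite3` module. The table layout is not documented here, so this sketch only lists whatever tables exist. It opens the file read-only so that nothing Janis relies on is modified, and `list_tables` is a hypothetical helper.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def list_tables(db_path):
    """Return the table names in a SQLite file, opened in read-only mode."""
    # mode=ro makes SQLite fail instead of creating a missing file.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]


if __name__ == "__main__":
    default_db = Path.home() / ".janis" / "janis.db"
    if default_db.exists():
        print(list_tables(default_db))
    else:
        print("no database found at", default_db)
```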

Release history

This version

0.13.0

Jul 12, 2023

0.12.1

Jun 14, 2023

0.11.9

Nov 19, 2021

0.11.8

Jul 21, 2021

0.11.7

Jun 10, 2021

0.11.6

May 3, 2021

0.11.5

Mar 31, 2021

0.11.4

Jan 22, 2021

0.11.3

Jan 22, 2021

0.11.2

Jan 20, 2021

0.11.1

Jan 8, 2021

0.11.0

Dec 21, 2020

0.10.11

Nov 10, 2020

0.10.10

Nov 10, 2020

0.10.9

Nov 6, 2020

0.10.5

Sep 9, 2020

0.10.4

Sep 8, 2020

0.10.3

Sep 2, 2020

0.10.2

Aug 31, 2020

0.10.1

Aug 6, 2020

0.10.0

Jul 16, 2020

0.9.19

Jul 15, 2020

0.9.18

Jun 19, 2020

0.9.17

May 22, 2020

0.9.16

Apr 24, 2020

0.9.15

Apr 23, 2020

0.9.14

Apr 22, 2020

0.9.13

Mar 30, 2020

0.9.12

Mar 24, 2020

0.9.11

Mar 20, 2020

0.9.10

Mar 18, 2020

0.9.9

Mar 16, 2020

0.9.8

Feb 26, 2020

0.9.7

Jan 31, 2020

0.9.6

Jan 30, 2020

0.9.5

Jan 24, 2020

0.9.4

Jan 21, 2020

0.9.3

Jan 20, 2020

0.9.2

Jan 19, 2020

0.9.1

Jan 17, 2020

0.9.0

Jan 17, 2020

0.8.1

Dec 11, 2019

0.8.0

Dec 9, 2019

0.7.16

Dec 9, 2019

0.7.15

Dec 6, 2019

0.7.13

Nov 21, 2019

0.7.12

Nov 18, 2019

0.7.11

Nov 15, 2019

0.7.10

Nov 14, 2019

0.7.9

Nov 14, 2019

0.7.8

Nov 13, 2019

0.7.7

Nov 11, 2019

0.7.6

Nov 10, 2019

0.7.5

Nov 7, 2019

0.7.4

Nov 7, 2019

0.7.3

Nov 6, 2019

0.7.1

Oct 25, 2019

0.7.0

Oct 25, 2019

0.6.2

Oct 2, 2019

0.6.1

Sep 26, 2019

0.6.0

Sep 26, 2019

0.5.7

Aug 22, 2019

0.5.6

Aug 15, 2019

0.5.5

Aug 12, 2019

0.5.4

Aug 7, 2019

0.5.3

Aug 6, 2019

0.5.2

Aug 1, 2019

0.5.1

Aug 1, 2019

0.5.0

Jul 30, 2019

0.4.0

Jul 26, 2019

0.1.0

Jul 23, 2019

Download files

Source Distribution

janis-pipelines.runner-0.13.0.tar.gz (2.4 MB)

Uploaded Jul 12, 2023 Source

Built Distribution

janis_pipelines.runner-0.13.0-py3-none-any.whl (262.4 kB)

Uploaded Jul 12, 2023 Python 3

File details

Details for the file janis-pipelines.runner-0.13.0.tar.gz.

File metadata

Download URL: janis-pipelines.runner-0.13.0.tar.gz
Upload date: Jul 12, 2023
Size: 2.4 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.10.5

File hashes

Hashes for janis-pipelines.runner-0.13.0.tar.gz
Algorithm	Hash digest
SHA256	`6f82c757f020b13fa2daf9911256770fb6b8779f6530cbcca8262648ef0f58c4`
MD5	`3b7e2f46105f4ed91f9affd912d50509`
BLAKE2b-256	`c728dfb7e6db4ddf68b8761345772d0b2455a339c075688ee3ee0a6d56aaf5d7`

File details

Details for the file janis_pipelines.runner-0.13.0-py3-none-any.whl.

File metadata

Download URL: janis_pipelines.runner-0.13.0-py3-none-any.whl
Upload date: Jul 12, 2023
Size: 262.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.10.5

File hashes

Hashes for janis_pipelines.runner-0.13.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`2ae3a08a08168b58c992e9357ca599af5b039bf53f24af92d1709f59b170f8df`
MD5	`982b9a7823a710f410eb008b6c284cb0`
BLAKE2b-256	`f82bdc99c4d092c192ce8bac01a6ef65e25abb782116c128dda3512f45a2dc3a`

janis-pipelines.runner 0.13.0
