A software to extract and analyze the structure and associated metadata from a Nextflow workflow.

BioFlow-Insight

License: GPL v3 Version 2.0.9

TODO: update this readme

Description

BioFlow-Insight is a Python-based open-source command-line tool that automatically analyses Nextflow workflow code and gathers useful information about it, particularly in the form of visual graphs that illustrate the workflow's structure and its various steps. It can also detect certain programming errors, and it generates a RO-Crate JSON-LD file that describes the workflow. Finally, it generates a JSON file describing the workflow's structure in a simplified metro-map form, which can be visualised with MetroFlow.

BioFlow-Insight can be installed as a CLI (see Installation). It is also available as a free web service. For more information and to start using BioFlow-Insight, visit https://bioflow-insight.pasteur.cloud/.

Installation

Installing via pip

BioFlow-Insight is easily installable as a CLI.

To install it using pip, use the following command:

pip install bioflow-insight

Using from source

To access its source code, simply clone its GitLab repository. BioFlow-Insight is developed using Python 3.

BioFlow-Insight's dependencies are given in the requirements.txt file.

Note: On Linux, you might need to install Graphviz by running this command: sudo apt install graphviz
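Before running from source, it can help to check whether Graphviz is reachable. The following is a minimal sketch, not part of BioFlow-Insight: the function name graphviz_available is hypothetical, and the check assumes that the Graphviz layout program dot being on the PATH is a good enough signal that Graphviz is installed.

```python
import shutil


def graphviz_available(executable="dot"):
    """Return True if the given Graphviz executable is found on the PATH.

    Hypothetical helper for illustration; BioFlow-Insight itself may
    locate Graphviz differently.
    """
    return shutil.which(executable) is not None


if not graphviz_available():
    print("Graphviz not found; on Linux, try: sudo apt install graphviz")
```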

Usage

For an explanation of the different elements composing a Nextflow workflow, see the Nextflow documentation.

The graphs generated by BioFlow-Insight are:

  • Specification graph: BioFlow-Insight reconstructs the workflow’s specification graph from its source code without having to execute it. The specification graph is defined as a directed graph where nodes are processes and operations, and edges are channels that are directed from one vertex to another (steps of the workflow are ordered). This graph represents all the possible interactions between processes and operations through channels that are defined in the workflow code. Within the specification graph, operations are categorised in two groups: following operations, defined as operations that have at least one input, and starting operations, defined as operations without any inputs.

  • Process dependency graph: BioFlow-Insight generates the process dependency graph, which represents only processes (nodes) and their dependencies (edges). This graph is constructed by removing all operations from the specification graph, leaving only processes, and linking them based on their dependencies in the original specification graph. In this representation, the edges no longer represent interactions between elements, but dependencies between them.

  • Metro-map JSON file: BioFlow-Insight also generates the metro-map JSON file, which describes the workflow in metro-map form (with process code, conditions represented by colour, etc.). This needs to be updated.
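The relationship between the specification graph and the process dependency graph described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not BioFlow-Insight's actual implementation: the graph is represented as a plain list of (source, target) edges, the function names are hypothetical, and the toy workflow is invented for the example.

```python
from collections import deque


def starting_operations(edges, processes):
    """Return the operations that have no inputs (no incoming edge)."""
    nodes = {node for edge in edges for node in edge}
    with_inputs = {target for _, target in edges}
    return {n for n in nodes if n not in processes and n not in with_inputs}


def process_dependency_graph(edges, processes):
    """Collapse a specification graph onto its processes.

    edges: iterable of (source, target) channel edges.
    processes: set of node names that are processes; every other node
    is treated as an operation.
    Returns a set of (process, process) dependency edges.
    """
    successors = {}
    for source, target in edges:
        successors.setdefault(source, set()).add(target)

    dependencies = set()
    for process in processes:
        # Breadth-first walk that passes through operations only and
        # stops expanding a path as soon as it reaches another process.
        queue = deque(successors.get(process, ()))
        seen = set()
        while queue:
            node = queue.popleft()
            if node in seen:
                continue
            seen.add(node)
            if node in processes:
                dependencies.add((process, node))
            else:
                queue.extend(successors.get(node, ()))
    return dependencies


# Hypothetical toy workflow: two processes joined through one operation.
toy_processes = {"FASTQC", "MULTIQC"}
spec_edges = [
    ("Channel.fromPath", "FASTQC"),  # starting operation (no inputs)
    ("FASTQC", "collect"),           # following operation (has an input)
    ("collect", "MULTIQC"),
]
deps = process_dependency_graph(spec_edges, toy_processes)
starts = starting_operations(spec_edges, toy_processes)
print(deps, starts)
```

In this toy example the operation collect disappears from the result, and the edge between FASTQC and MULTIQC expresses a dependency rather than a direct channel.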

To obtain all the outputs, run this command:

bioflow-insight --analysis bioflow my_workflow/main.nf

To obtain only the JSON file that describes the metro map, run this command:

bioflow-insight --analysis metroflow my_workflow/main.nf
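The commands above can also be launched from a Python script. The following is a minimal sketch that only wraps the command line shown in this section; the helper names build_command and run_bioflow_insight are hypothetical, and it assumes that the bioflow-insight executable is on the PATH and that bioflow and metroflow are the only --analysis values of interest.

```python
import shutil
import subprocess


def build_command(workflow_file, analysis="bioflow"):
    """Build the argument list for one of the two commands shown above."""
    # Only the two --analysis values documented in this README are accepted.
    if analysis not in ("bioflow", "metroflow"):
        raise ValueError(f"unsupported analysis: {analysis!r}")
    return ["bioflow-insight", "--analysis", analysis, str(workflow_file)]


def run_bioflow_insight(workflow_file, analysis="bioflow"):
    """Run the CLI and raise if it is missing or exits with an error."""
    if shutil.which("bioflow-insight") is None:
        raise FileNotFoundError(
            "bioflow-insight is not on the PATH; try: pip install bioflow-insight"
        )
    return subprocess.run(build_command(workflow_file, analysis), check=True)


# Example (not executed here): run_bioflow_insight("my_workflow/main.nf", "metroflow")
```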

For a more in-depth explanation of BioFlow-Insight's functionalities, visit its webpage: https://bioflow-insight.pasteur.cloud/specification/

Citing BioFlow-Insight

Please cite BioFlow-Insight in any research that uses or extends BioFlow-Insight.

To cite BioFlow-Insight, please use the following publication:

George Marchment, Bryan Brancotte, Marie Schmit, Frédéric Lemoine, Sarah Cohen-Boulakia, BioFlow-Insight: facilitating reuse of Nextflow workflows with structure reconstruction and visualization, NAR Genomics and Bioinformatics, Volume 6, Issue 3, September 2024, lqae092, https://doi.org/10.1093/nargab/lqae092

License

This project is licensed under the GNU Affero General Public License.

Funding

This work received support from the National Research Agency under the France 2030 program, with reference to ANR-22-PESN-0007.






