FAIRWorkflows python library

The goal of the fairworkflows Python library is to support the construction, manipulation and publishing of FAIR scientific workflows using semantic technologies. It is developed as a component of the FAIR Workbench, as part of the FAIR is FAIR project. The focus is on describing workflows that consist of manual and computational steps, using semantic technology such as the ontology described in the publication:

Celebi, R., Moreira, J. R., Hassan, A. A., Ayyar, S., Ridder, L., Kuhn, T., & Dumontier, M. (2019). Towards FAIR protocols and workflows: The OpenPREDICT case study. arXiv:1911.09531.

The goals of the project are:

  1. To facilitate the construction of RDF descriptions of a variety of scientific 'workflows', in the most general sense. This includes experimental procedures, IPython notebooks, computational analysis of results, etc.
  2. To allow validation and publication of the resultant RDF (for example, by means of nanopublications).
  3. To enable re-use of previously published steps in new workflows.
  4. To support FAIR data flow from end to end.

We seek to provide an easy-to-use Python interface for achieving the above.

Installation

The most recent release can be installed from the Python Package Index (PyPI) using pip:

pip install fairworkflows

Description

The fairworkflows library has a number of modules to help with FAIRifying workflows:

  • from fairworkflows import Nanopub: This module provides Python classes and methods for searching, fetching and publishing to the nanopublication servers.
  • from fairworkflows import FairStep: This class is used to create, validate and publish RDF descriptions of an individual step (that can then be used in one or more workflows). Steps may be created from an rdflib graph, from a function, or by passing the URI of a nanopublication that describes a workflow step.
  • from fairworkflows import FairWorkflow: This class is used to create, validate and publish RDF descriptions of a general workflow. The workflow can be constructed from FairStep objects, or loaded from a nanopublication that describes a FAIR workflow. FairWorkflow objects are iterable, returning their constituent FairSteps in an order determined by the step dependencies.
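
The dependency-based iteration order described for FairWorkflow can be pictured as a topological sort of the steps. The following is a minimal sketch of that idea using only the standard library's graphlib module (Python 3.9 or later). It does not use fairworkflows and is not the library's implementation; the function name order_steps and the step names are illustrative assumptions.

```python
from graphlib import TopologicalSorter


def order_steps(follows):
    """Return step names so that each step comes after the steps it follows.

    `follows` maps a step name to the set of step names it depends on.
    A cyclic dependency raises graphlib.CycleError.
    """
    return list(TopologicalSorter(follows).static_order())


# Hypothetical dependencies: each key must come after the steps in its set.
dependencies = {
    "preheat_oven": set(),
    "melt_butter": {"preheat_oven"},
    "arrange_chicken": {"melt_butter"},
}

ordered = order_steps(dependencies)
for name in ordered:
    print(name)
```

For a linear chain like this one the order is fully determined; when several steps have no dependency on each other, any order that respects the dependencies is valid.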

Quick Start

from fairworkflows import FairWorkflow, FairStep

# Create a workflow
workflow = FairWorkflow(description='This is a test workflow.')

# Load some steps from nanopublications
preheat_oven = FairStep.from_nanopub(uri='http://purl.org/np/RACLlhNijmCk4AX_2PuoBPHKfY1T6jieGaUPVFv-fWCAg#step')
melt_butter = FairStep.from_nanopub(uri='http://purl.org/np/RANBLu3UN2ngnjY5Hzrn7S5GpqFdz8_BBy92bDlt991X4#step')
arrange_chicken = FairStep.from_nanopub(uri='http://purl.org/np/RA5D8NzM2OXPZAWNlADQ8hZdVu1k0HnmVmgl20apjhU8M#step')

# Specify ordering of steps
workflow.first_step = preheat_oven
workflow.add(melt_butter, follows=preheat_oven)
workflow.add(arrange_chicken, follows=melt_butter)

# Validate the workflow
workflow.validate()

# Iterate through all steps in the workflow
for step in workflow:
    print(step)

# Visualize the workflow directly in a Jupyter notebook
workflow.display()
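
To give a rough idea of what an RDF description of such an ordering might contain, here is a minimal sketch that builds subject-predicate-object triples in plain Python and prints them as N-Triples. It is an assumption that the p-plan and DUL vocabularies are the relevant ones; the http://example.org/ URIs and the helper names are hypothetical, and the RDF that fairworkflows actually produces may differ.

```python
# Vocabulary prefixes (assumed relevant here) and a hypothetical example namespace.
PPLAN = "http://purl.org/net/p-plan#"
DUL = "http://www.ontologydesignpatterns.org/ont/dul/DUL.owl#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/workflow#"  # hypothetical, for illustration only


def ordering_triples(plan, steps):
    """Describe a linear sequence of steps as (subject, predicate, object) URIs."""
    triples = []
    for step in steps:
        triples.append((EX + step, RDF_TYPE, PPLAN + "Step"))
        triples.append((EX + step, PPLAN + "isStepOfPlan", EX + plan))
    # Each step precedes the one listed directly after it.
    for earlier, later in zip(steps, steps[1:]):
        triples.append((EX + earlier, DUL + "precedes", EX + later))
    return triples


def to_ntriples(triples):
    """Serialise URI-only triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


triples = ordering_triples(
    "test_workflow", ["preheat_oven", "melt_butter", "arrange_chicken"]
)
print(to_ntriples(triples))
```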

Example

  • See test_plex_builder.ipynb for a current example of using the fairworkflows library to build a workflow using plex RDF.

Notes

The np script must be run once to generate RSA keys in the ~/.nanopub directory:

fairworkflows/np mkkeys -a RSA
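
A quick way to check whether keys are already in place is sketched below. The file names id_rsa and id_rsa.pub are an assumption about what the np script writes, not something stated here, and missing_key_files is a hypothetical helper; adjust the names to whatever appears in your ~/.nanopub directory.

```python
from pathlib import Path


def missing_key_files(directory, names=("id_rsa", "id_rsa.pub")):
    """Return the expected key file names that are absent from `directory`.

    The default names are an assumption, not taken from the library.
    """
    directory = Path(directory).expanduser()
    return [name for name in names if not (directory / name).is_file()]


missing = missing_key_files("~/.nanopub")
if missing:
    print("Missing key files:", ", ".join(missing))
else:
    print("All expected key files are present.")
```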

How is the fairworkflows library expected to be used?

While this library could be used as a standalone tool to build/publish RDF workflows, it is intended more as a component to be used in a variety of other tools that seek to add FAIR elements to workflows. At present the library is used in the following tools:

  • NanopubJL: A Jupyter Lab extension that adds a widget for searching the nanopublication servers, and helps the user fetch desired nanopubs through injection of the necessary python code into a notebook cell.
  • FAIRWorkflowsExtension: A Jupyter Lab extension that adds a widget for searching for previously published FairSteps or FairWorkflows. These can then be loaded into the notebook for modification or combination into new workflows.

It is expected that the library will soon interact with FAIR Data Points as well (e.g. fairdatapoint).

Relation to existing workflow formats/engines (e.g. CWL, WDL, Snakemake, etc.)

This library is not intended to replace or compete with the hundreds of existing computational workflow formats, but rather to aid in RDF description and comparison of workflows in the most general sense of the term (including manual experimental steps, notebooks, and so on). Steps in a FAIRWorkflow may very well be 'run this CWL workflow' or 'run this script'. Such workflows are therefore expected to sit more on a meta-level, also describing the before-and-after of running one of these fully automated computational workflows.
