
A workflow (job) engine/pipeline for bioinformatics and scientific computing.


pypipegraph

Documentation: https://pypipegraph.readthedocs.io/en/latest/
Code style: black (https://github.com/ambv/black)

Introduction

pypipegraph is an MIT-licensed library for constructing a workflow piece by piece and executing just the parts of it that need to be (re-)done. It supports multiple cores (SMP) and, eventually, multiple machines (cluster; alpha code right now), and is a hybrid between a dependency tracker (think 'make') and a cluster engine.

More specifically, you construct Jobs, which encapsulate output (i.e. stuff that needs to be done), invariants (which force re-evaluation of output jobs if they change), and stuff in between (e.g. loading data from disk).
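To make the idea of an invariant concrete, here is a minimal sketch in plain Python. It is not pypipegraph's implementation: the function name `invariant_changed`, the `store` dictionary, and the choice of hashing a JSON dump are all assumptions made for illustration. It only shows the general principle of remembering a fingerprint of a value and flagging a change on the next run.

```python
import hashlib
import json


def invariant_changed(name, value, store):
    """Record a fingerprint of `value` under `name` in `store`.

    Returns True if no fingerprint was stored before or if it differs
    from the stored one, i.e. dependent output would need redoing.
    (Hypothetical helper, not part of pypipegraph.)
    """
    payload = json.dumps(value, sort_keys=True).encode("utf-8")
    digest = hashlib.sha256(payload).hexdigest()
    changed = store.get(name) != digest
    store[name] = digest
    return changed


store = {}  # stands in for state that would persist between runs
first = invariant_changed("params", {"threshold": 0.05}, store)
second = invariant_changed("params", {"threshold": 0.05}, store)
third = invariant_changed("params", {"threshold": 0.01}, store)
```

In this sketch an unseen invariant counts as changed, so the first call reports a change, the second does not, and the third does again because the parameter value differs.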

From your point of view, you create a pypipegraph, create jobs, chain them together, then ask the pypipegraph to run. It examines all jobs for their need to run (either because they have not been finished, or because they have been invalidated), distributes them across multiple Python instances, and gets them executed in a sensible order.
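The "which jobs need to run, and in what order" decision can be sketched with the standard library alone. This is a toy model, not pypipegraph's scheduler: the function `plan`, the job names, and the rule that a rerun upstream also reruns everything downstream are assumptions for illustration. It requires Python 3.9+ for `graphlib`.

```python
from graphlib import TopologicalSorter


def plan(dependencies, finished, invalidated):
    """Return the jobs that must run, upstream jobs first.

    dependencies: dict mapping job name -> set of jobs it depends on
    finished:     set of jobs whose output already exists
    invalidated:  set of jobs whose inputs have changed
    (Hypothetical helper, not part of pypipegraph.)
    """
    order = list(TopologicalSorter(dependencies).static_order())
    to_run = set()
    for job in order:  # dependencies are always visited before dependents
        upstream_rerun = any(dep in to_run for dep in dependencies.get(job, ()))
        if job not in finished or job in invalidated or upstream_rerun:
            to_run.add(job)
    return [job for job in order if job in to_run]


deps = {"A": set(), "B": {"A"}, "C": set()}

# B has never been produced, so only B runs.
only_missing = plan(deps, finished={"A", "C"}, invalidated=set())

# A was invalidated, so A and its dependent B run; C is left alone.
after_change = plan(deps, finished={"A", "B", "C"}, invalidated={"A"})
```

Here `only_missing` is `['B']` and `after_change` is `['A', 'B']`; distributing the selected jobs across processes is left out of the sketch.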

It is robust against jobs dying for whatever reason (only the failed job and everything 'downstream' of it will be affected; independent jobs will continue running), allows you to resume at any point 'in between' jobs, and isolates jobs from each other.
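The failure behaviour described above (a failed job blocks only what depends on it) can be illustrated with a small single-process sketch. The function `run_all`, the status strings, and the example job names are hypothetical and chosen for this illustration; pypipegraph itself runs jobs in separate Python instances, which this sketch does not attempt.

```python
from graphlib import TopologicalSorter


def run_all(jobs, dependencies):
    """Run every job; a failure only blocks jobs downstream of it.

    jobs:         dict mapping job name -> zero-argument callable
    dependencies: dict mapping job name -> set of upstream job names
    Returns a dict mapping job name -> 'ok', 'failed' or 'skipped'.
    (Hypothetical helper, not part of pypipegraph.)
    """
    graph = {name: dependencies.get(name, set()) for name in jobs}
    status = {}
    for name in TopologicalSorter(graph).static_order():
        # Skip a job if anything upstream failed or was itself skipped.
        if any(status[dep] != "ok" for dep in graph[name]):
            status[name] = "skipped"
            continue
        try:
            jobs[name]()
            status[name] = "ok"
        except Exception:
            status[name] = "failed"
    return status


def load():
    raise RuntimeError("input file missing")


def summarize():
    pass


def plot():
    pass


status = run_all(
    {"load": load, "summarize": summarize, "plot": plot},
    {"summarize": {"load"}},
)
```

With these inputs `load` ends up `'failed'`, `summarize` is `'skipped'` because it is downstream of `load`, and the independent `plot` job is `'ok'`.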

pypipegraph supports Python 3 only.

30 second summary

    import pypipegraph

    pypipegraph.new_pipeline()

    output_filenameA = 'sampleA.txt'
    def write_a():
        # Job A: create the first output file.
        with open(output_filenameA, 'w') as op:
            op.write("hello world")
    jobA = pypipegraph.FileGeneratingJob(output_filenameA, write_a)

    output_filenameB = 'sampleB.txt'
    def write_b():
        # Job B: read job A's output and append to it.
        with open(output_filenameA) as src, open(output_filenameB, 'w') as op:
            op.write(src.read() + ", once again")
    jobB = pypipegraph.FileGeneratingJob(output_filenameB, write_b)

    # jobB reads sampleA.txt, so it must run after jobA.
    jobB.depends_on(jobA)
    pypipegraph.run()
    print('the pipegraph is done and has returned control to you.')
    print('sampleA.txt contains "hello world"')
    print('sampleB.txt contains "hello world, once again"')

Release: pypipegraph version 0.189, source distribution pypipegraph-0.189.tar.gz (108.3 kB)
