
Run bash commands with Python multiprocessing. Includes a Tkinter GUI for workflow editing.

Project description

[Figures: small pipeline example; complex pipeline editor view]

Workforce is an application that creates and runs bash commands in the order defined by a graph. It serves as a desktop for terminals, allowing you to build and run pipelines of bash commands with Python multiprocessing according to a GraphML file.

It is similar to other workflow management systems such as Galaxy workflows, QIIME plugin workflows, AnADAMA2, Snakemake, Nextflow, and Make, but is designed with multiuser support and a graphical interface for workflow editing.

Features

  • Graph-based workflow execution: Define bash commands as nodes in a directed graph

  • Multiuser support: Multiple clients can interact with the same workflow simultaneously

  • Server-based architecture: Workflows are served via Flask API with unique URLs

  • Event-driven execution: Dependency-aware scheduling with real-time status updates

  • Flexible edge types: Use blocking edges for strict dependencies or non-blocking edges for flexible triggering and re-execution

  • Subset execution: Run specific subgraphs or the entire workflow

  • Resume capability: Restart failed nodes and continue pipeline execution

  • Interactive GUI: Edit workflows visually with a Tkinter-based interface

  • Flexible command wrapping: Add prefixes/suffixes to commands (Docker, SSH, tmux, etc.)

Architecture Overview

Server

The server component provides a single machine-wide instance that manages multiple workspace contexts:

Server Startup: When starting a server using the CLI (python -m workforce server start):

  1. Checks if a server is already running via health-check discovery on ports 5000-5100 (see the sketch after this list)

  2. If found, informs user and exits (enforces singleton per machine)

  3. If not found, discovers a free port and starts the Flask + Socket.IO server

  4. Waits for clients to connect and creates workspace contexts on-demand
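
A minimal sketch of the discovery step, assuming the server exposes a health endpoint (the route name here is illustrative, not necessarily the one workforce uses):

    import requests

    def find_running_server(ports=range(5000, 5101)):
        """Scan the discovery port range for an already-running server."""
        for port in ports:
            try:
                # "/health" is an assumed endpoint name for illustration
                r = requests.get(f"http://127.0.0.1:{port}/health", timeout=0.2)
                if r.ok:
                    return port  # a server is already running (singleton)
            except requests.RequestException:
                continue         # nothing answering on this port
        return None              # no server found: safe to start a new one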

Workspace Management:

  • Each workfile gets a deterministic workspace ID (a SHA256 hash of its absolute path; see the sketch after this list)

  • Server maintains isolated ServerContext objects per workspace with:
      - Dedicated modification queue for serialized graph operations
      - Per-workspace event bus for domain events
      - Worker thread for processing queued mutations
      - Socket.IO room for event isolation

  • Contexts created on first client connect, destroyed on last disconnect
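
Because the workspace ID is just a SHA256 digest of the workfile's absolute path, any client can compute it locally. A minimal sketch:

    import hashlib
    import os

    def workspace_id(workfile_path):
        """Deterministic workspace ID: SHA256 of the absolute path."""
        absolute = os.path.abspath(workfile_path)
        return hashlib.sha256(absolute.encode("utf-8")).hexdigest()

    # The same workfile always maps to the same workspace, so every client
    # opening it shares one ServerContext.
    print(workspace_id("Workfile"))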

Server Operations:

  • Accepts workspace-scoped requests at /workspace/{workspace_id}/... endpoints

  • Edit API: Modify workflow structure (add/remove nodes, edges, statuses)

  • Run API: Initiate workflow execution with arguments (see the request sketch after this list):
      - nodes: Specific nodes to execute as a subset
      - wrapper: Command prefix/suffix wrapper

  • Status updates propagate via Socket.IO room-based events
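
As a hedged illustration of the Run API (the payload keys and node names below are assumptions inferred from the arguments listed above, not a documented contract):

    import requests

    SERVER = "http://127.0.0.1:5000"    # port found via discovery
    ws_id = workspace_id("Workfile")    # helper from the sketch above

    requests.post(
        f"{SERVER}/workspace/{ws_id}/run",
        json={
            "nodes": ["align", "sort"],   # hypothetical subset of node IDs
            "wrapper": "ssh ADDRESS {}",  # wrap each command for remote execution
        },
    )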

Server Shutdown: On idle (no active clients or runs):

  • Automatically shuts down after brief idle period

  • Contexts destroyed on last client disconnect

  • Next client connection auto-starts new server instance

Unified Execution Model

The system employs a unified execution model where every run is treated as a subset run:

Node Selection:

  • If specific nodes are selected (via CLI or GUI), those nodes form an induced subgraph for execution

  • If no nodes are explicitly selected:
      - The system first checks for failed nodes and selects them for re-execution
      - If there are no failed nodes, nodes with zero in-degree in the full workflow are selected
      - This means that, by default, the entire workflow is treated as the active subset
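
Since workflows are stored as GraphML, the default selection can be expressed naturally with networkx (a sketch for illustration only; the status value "fail" is an assumption):

    import networkx as nx

    def default_selection(graph: nx.DiGraph):
        """Nodes a run starts from when none are explicitly selected."""
        failed = [n for n, d in graph.nodes(data=True)
                  if d.get("status") == "fail"]    # assumed status value
        if failed:
            return failed                          # re-execute failures first
        # otherwise start from every root of the full workflow
        return [n for n in graph.nodes if graph.in_degree(n) == 0]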

Execution Initialization: Upon initialization, the scheduler:

  1. Identifies all nodes within the target subset that have an in-degree of zero relative only to that subset

  2. Transitions these nodes to a “run” state

  3. Ensures nodes start immediately if their dependencies in the master workfile are omitted from the current run scope
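
The subset-relative root computation from step 1 amounts to taking the induced subgraph and reading in-degrees there, e.g.:

    def subset_roots(graph, selected):
        """Nodes with zero in-degree relative to the selection only."""
        sub = graph.subgraph(selected)   # induced subgraph of the run
        # dependencies that live outside the run scope are ignored,
        # so these nodes start immediately
        return [n for n in sub.nodes if sub.in_degree(n) == 0]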

Subgraph Boundaries: To prevent execution from bleeding into the rest of the workfile:

  • The scheduler strictly enforces subnetwork boundaries

  • Propagation is confined entirely to the active selection

  • When a node completes, only outgoing edges within the filtered subnetwork are evaluated

  • Edges leading to nodes outside the original subset are ignored, effectively “capping” the execution
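
In code terms, the capping rule is a simple membership filter on outgoing edges (sketch):

    def edges_to_evaluate(graph, selected, completed_node):
        """Outgoing edges to follow on completion, capped to the subset."""
        scope = set(selected)
        return [(completed_node, t)
                for t in graph.successors(completed_node) if t in scope]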

Execution Loop and Dependency Management

Node Execution:

  1. When a node runs, its stdout and stderr are captured as node attributes (see the sketch after this list)

  2. These outputs are viewable from the GUI (with the ‘l’ shortcut key)

  3. Upon successful completion, an event is emitted to the run request

  4. Each event is tagged with a client ID, allowing multiple concurrent runs and GUI clients to operate without interference
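
A sketch of the capture step, assuming the command text lives in a node attribute (the attribute name "label" is a guess for illustration):

    import subprocess

    def execute_node(graph, node):
        """Run a node's bash command and store its output on the node."""
        cmd = graph.nodes[node].get("label", "")   # assumed attribute name
        result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
        graph.nodes[node]["stdout"] = result.stdout  # viewable via 'l' in the GUI
        graph.nodes[node]["stderr"] = result.stderr
        return result.returncode == 0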

Scheduler Operations:

  1. The emission triggers the scheduler to retrieve the filtered subnetwork map

  2. All valid outgoing edges (within the subnetwork) are updated to a to_run status

  3. An edge-status change event is broadcast

Dependency Checking:

  1. The status change prompts the target node to perform a dependency check

  2. The node transitions to the run state only if ALL incoming edges (within the subnetwork context) are marked as to_run

  3. Once this condition is satisfied, the node:
      - Clears the statuses from those incoming edges
      - Begins execution
      - Loops back to the capture and emission phase

This mechanism ensures the engine only advances when subset-specific dependencies are fully met.
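
Putting the two steps together, the dependency check reduces to an all-edges-ready test followed by a status reset (a sketch; the status strings mirror those used above):

    def try_start(graph, selected, node):
        """Start a node only when every in-scope incoming edge is 'to_run'."""
        scope = set(selected)
        incoming = [(s, node) for s in graph.predecessors(node) if s in scope]
        if incoming and all(graph.edges[e].get("status") == "to_run"
                            for e in incoming):
            for e in incoming:
                graph.edges[e].pop("status", None)  # clear consumed edge statuses
            graph.nodes[node]["status"] = "run"     # back to capture/emission
            return True
        return False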

Resume Logic

The resume functionality (Shift+R in GUI) handles failures or cancellations:

  • Replaces a node's failed status with run (see the sketch below)

  • Re-triggers the event loop, which causes the scheduler to re-check dependencies and queue the node for execution

  • Allows the remainder of the pipeline to proceed through the normal dependency checking process

  • Strictly bounded by the subset; resume never propagates to nodes outside the original selection

  • Ensures nodes do not remain in a running state indefinitely

By managing statuses cleanly and ignoring edges outside the active scope, the system guarantees clean termination once the selected subgraph is exhausted.
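
A sketch of the resume step (the status values "fail" and "run" are assumptions mirroring the description above):

    def resume(graph, selected):
        """Shift+R: requeue failed nodes, bounded by the original selection."""
        for n in selected:
            if graph.nodes[n].get("status") == "fail":  # assumed status value
                graph.nodes[n]["status"] = "run"  # scheduler re-checks dependencies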

Installation

Installation can be done with:

pip install workforce

Building a workforce workflow

To launch the pipeline editor, run:

wf

or:

python -m workforce

To open a previously constructed pipeline, run:

wf <PIPELINE.graphml>

If a file named Workfile is in the current directory, it is opened automatically:

wf

Running workforce plan

To run a plan from the GUI, click the 'Run' button or press 'r'. If nodes are selected, execution starts from those nodes; otherwise, the full pipeline is executed. Run from the CLI with:

wf run Workfile

Prefix and Suffix

Adding one of the following wrappers to the wf run command (or within the GUI) applies that prefix/suffix to each command run by the pipeline.

  • --wrapper 'bash -c "{}"' : Standard bash execution
  • --wrapper 'bash -c ". env.sh; {}"' : Bash execution with definition of config or other environment settings
  • --wrapper 'tmux send-keys {} C-m' : Sends each command to a tmux session and executes it
  • --wrapper 'ssh ADDRESS {}' : Executes each command remotely on the specified server
  • --wrapper 'parallel {} ::: FILENAMES' : Runs the pipeline on each specified filename
  • --wrapper 'docker run -it IMAGE {}' : Executes each command inside a Docker container with an interactive TTY
  • --wrapper 'echo {} >> commands.sh' : Exports pipeline commands to a bash script named commands.sh
  • --wrapper 'bash -lc "conda activate ENV && {}"' : Activates a Conda environment before executing the command
  • --wrapper 'nohup {} &' : Runs commands in the background
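
Conceptually, a wrapper is just a template whose {} placeholder receives each node's command, e.g. (sketch):

    def wrap(command, wrapper=None):
        """Apply a --wrapper template to a single node's command."""
        if not wrapper:
            return command
        return wrapper.replace("{}", command)

    print(wrap("echo hello", "ssh ADDRESS {}"))  # -> ssh ADDRESS echo hello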

To run specific process(es) from the editor, select them and click the 'Run' button (or press the 'r' key). If no processes are selected, the entire pipeline runs. Open the terminal with the 't' shortcut (or from the toolbar) to see the output of the commands.

Workforce is tested on macOS, Linux, and Windows (PowerShell and WSL2).
