
Orchestrate CWL with Prefect


Prefect CWL

A lightweight adapter that bridges the Common Workflow Language (CWL) world with Prefect. It not only executes CWL but also lets you orchestrate it with Prefect’s scheduling, retries, observability, and deployments (this part is still a work in progress). Execution is pluggable via backends: Docker is available today, and Kubernetes support is in progress.

In this library, the atomic unit is a single CWL step (a CommandLineTool or workflow step), not an entire workflow/flow. Prefect orchestrates those steps according to the CWL-defined dependencies.

What this achieves

  • Bridge CWL and Prefect: Parse CWL, build a dependency graph, and run steps under Prefect orchestration.
  • Orchestrate, not just execute: Use Prefect’s UI, scheduling, retries, mapping, and deployments to operate CWL workloads.
  • Pluggable execution backends: Run each CWL step via Docker today; the Kubernetes backend is a work in progress.

Key concepts

  • Atomic unit = CWL step: Each CWL step is executed as a Prefect task invocation via a backend. Prefect orchestrates the order and parallelism.
  • Dependency “waves”: Steps run in parallel when their dependencies are satisfied; no artificial serialization.
  • Typed IR: CWL is parsed into a typed internal representation that drives orchestration and I/O wiring.
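
The idea of dependency "waves" can be illustrated with a minimal sketch that uses only the Python standard library. It is not taken from the library's code: the function name compute_waves and the example step names are hypothetical, and the adapter may compute its waves differently.

```python
from graphlib import TopologicalSorter


def compute_waves(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Group steps into waves; each step's dependencies sit in earlier waves."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        # Steps whose dependencies are all done can run in parallel.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        # Mark the whole wave as finished to unlock the next one.
        sorter.done(*ready)
    return waves


# Hypothetical workflow: two independent steps feeding a final merge step.
example = {
    "fetch_a": set(),
    "fetch_b": set(),
    "merge": {"fetch_a", "fetch_b"},
}
waves = compute_waves(example)
print(waves)
```

Here fetch_a and fetch_b land in the first wave because neither depends on anything, while merge waits for both.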

Features

  • Parse a practical subset of CWL v1.2 (tools, workflows, requirements, inputs/outputs).
  • Build a dependency graph and infer parallel “waves”.
  • Generate a Prefect flow whose signature mirrors CWL workflow inputs.
  • Execute steps via a backend that handles containers, arguments, volumes, and exit codes.
  • Initial backends: Docker, plus a work-in-progress Kubernetes backend.
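
Prefect derives a flow's parameters from the signature of the wrapped function, so a generated flow needs a function whose parameters match the CWL workflow inputs. The sketch below shows one way to build such a function in plain Python. It is an illustration under assumptions, not the library's implementation: make_flow_function and the input names input_dir and threshold are hypothetical.

```python
import inspect


def make_flow_function(input_names: list[str]):
    """Return a function whose keyword-only parameters match the given names."""

    def run(**kwargs):
        # Bind against the generated signature so that missing or unknown
        # inputs raise TypeError, as they would for a hand-written function.
        bound = signature.bind(**kwargs)
        return dict(bound.arguments)

    signature = inspect.Signature(
        [inspect.Parameter(name, inspect.Parameter.KEYWORD_ONLY) for name in input_names]
    )
    # inspect.signature(run) reports this attribute instead of (**kwargs).
    run.__signature__ = signature
    return run


# Hypothetical CWL workflow inputs.
flow_fn = make_flow_function(["input_dir", "threshold"])
print(inspect.signature(flow_fn))
```

In this sketch the function simply echoes the bound inputs; a real adapter would use them to drive the step executions.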

Current limitations

  • The adapter needs the WRK_DIR environment variable to be set explicitly; it sets up the working directory used when running the Docker container or K8s Job.
  • Globs are not supported, aside from simple folder names.
  • Data must be passed between steps through directories. Each step then reads the output that the previous step saved to files in those directories.
  • Step names and input/output references must use the same names.

A more in-depth list is available in the DESIGN file. See the sample_cwl folder for these limits in practice.

Backends

  • Docker backend: Uses Prefect’s Docker primitives to pull images, mount volumes, and execute commands.
  • Kubernetes backend (WIP): Same interface; schedule Pods/Jobs to run each CWL step.
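
A shared backend interface could look like the sketch below. All names in it (StepSpec, StepBackend, run_step, DryRunBackend, run_all) are hypothetical and chosen for illustration; the actual interface is the one described in DESIGN.md and may differ.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class StepSpec:
    """Hypothetical description of one CWL step to execute."""
    name: str
    image: str
    command: list[str]


class StepBackend(Protocol):
    """Hypothetical shared interface: run one step and return its exit code."""
    def run_step(self, step: StepSpec) -> int: ...


class DryRunBackend:
    """Stand-in backend that records steps instead of starting containers."""
    def __init__(self) -> None:
        self.executed: list[str] = []

    def run_step(self, step: StepSpec) -> int:
        self.executed.append(step.name)
        return 0


def run_all(backend: StepBackend, steps: list[StepSpec]) -> None:
    """Run steps in order, stopping at the first non-zero exit code."""
    for step in steps:
        code = backend.run_step(step)
        if code != 0:
            raise RuntimeError(f"step {step.name!r} failed with exit code {code}")


backend = DryRunBackend()
run_all(backend, [StepSpec("hello", "alpine:3.20", ["echo", "hi"])])
print(backend.executed)
```

Because run_all depends only on the protocol, a Docker or Kubernetes implementation could be swapped in without changing the orchestration code.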

Quick start

After installing all the requirements, start Prefect Server first:

prefect server start

Then, create a new project:

mkdir this-is-just-the-client-calling
cd this-is-just-the-client-calling
uv init

Then install the library with the uv CLI, choosing the Docker backend, the K8s backend, or both:

uv add "prefect-cwl[docker]"
uv add "prefect-cwl[k8s]"

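Then, from a Python script:

import asyncio
from pathlib import Path

from prefect_cwl import create_flow_with_docker_backend

# Build a Prefect flow from the CWL document
with open("myflow.cwl") as inp:
    runnable_flow = create_flow_with_docker_backend(
        inp.read(), Path("/tmp"), workflow_id="#flow_id"
    )

# `inputs` is a dict mapping CWL workflow input names to their values
asyncio.run(runnable_flow(**inputs))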

The runnable_flow is a Prefect flow that can be scheduled, deployed, and run as any other Prefect flow.

Should you want to use the K8s backend, additional requirements apply:

  • a running K8s cluster
  • a PersistentVolumeClaim (PVC) deployed and usable by Prefect
  • the following environment variables set, if needed:
    • KUBECONFIG, for custom configuration
    • PREFECT_CWL_K8S_NAMESPACE, for custom namespace (default: prefect)
    • PREFECT_CWL_K8S_PVC_NAME, for custom PVC name (default: prefect-shared-pvc)
    • PREFECT_CWL_K8S_PVC_MOUNT_PATH, for custom PVC mount path (default: /data)
    • PREFECT_CWL_K8S_SERVICE_ACCOUNT_NAME, for custom service account name (default: prefect-flow-runner)
    • PREFECT_CWL_K8S_PULL_SECRETS, for custom pull secrets (default: [])
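
The variables and defaults listed above can be summarized in a small sketch. The variable names and default values come from the list; the K8sSettings class and load_k8s_settings function are hypothetical, and parsing PREFECT_CWL_K8S_PULL_SECRETS as a JSON list is an assumption, since the expected encoding is not stated here.

```python
import json
import os
from dataclasses import dataclass, field


@dataclass
class K8sSettings:
    """Hypothetical container for the documented defaults."""
    namespace: str = "prefect"
    pvc_name: str = "prefect-shared-pvc"
    pvc_mount_path: str = "/data"
    service_account_name: str = "prefect-flow-runner"
    pull_secrets: list = field(default_factory=list)


def load_k8s_settings(env=None) -> K8sSettings:
    """Read the settings from a mapping, falling back to the documented defaults."""
    env = os.environ if env is None else env
    defaults = K8sSettings()
    return K8sSettings(
        namespace=env.get("PREFECT_CWL_K8S_NAMESPACE", defaults.namespace),
        pvc_name=env.get("PREFECT_CWL_K8S_PVC_NAME", defaults.pvc_name),
        pvc_mount_path=env.get("PREFECT_CWL_K8S_PVC_MOUNT_PATH", defaults.pvc_mount_path),
        service_account_name=env.get(
            "PREFECT_CWL_K8S_SERVICE_ACCOUNT_NAME", defaults.service_account_name
        ),
        # Assumption: pull secrets are encoded as a JSON list of secret names.
        pull_secrets=json.loads(env.get("PREFECT_CWL_K8S_PULL_SECRETS", "[]")),
    )


print(load_k8s_settings({"PREFECT_CWL_K8S_NAMESPACE": "cwl-demo"}))
```

Passing a plain dict instead of os.environ, as in the last line, makes it easy to check which values a given environment would produce.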

To run a local K8s cluster configured with Prefect and all the requirements above, see the prefect-k8s-demo folder.

Install the library locally

Prerequisite: install uv (https://github.com/astral-sh/uv). Once uv has been installed successfully, move into the project folder and run:

uv sync --all-extras --group dev

Be sure to set the PYTHONPATH variable to the prefect_cwl directory. For example, run export PYTHONPATH=$PWD from the project folder to point it at the current folder. Alternatively, install the package in editable mode. If you want to run the tests, install the dev dependencies.

Start the Prefect server using the command:

uv run prefect server start

Now run the Python script with:

uv run <file_path>

Sample CWL (WIP)

See sample_cwl/ for ready-to-run examples you can use to test the library. These are work-in-progress and may evolve as the adapter expands CWL coverage and features.

Project status

Early-stage and evolving. Expect changes in models, supported CWL features, and backend interfaces as we harden the adapter.

Design

The package design is detailed in DESIGN.md and reflects the latest codebase, including planning vs execution for Docker and Kubernetes backends.
