DFLOW

Dflow is a Python framework for constructing scientific computing workflows (e.g. concurrent learning workflows) employing Argo Workflows as the workflow engine.

For dflow's users (e.g. ML application developers), dflow offers user-friendly functional programming interfaces for building their own workflows. Users need not be concerned with process control, task scheduling, observability or disaster tolerance; they can track workflow status and handle exceptions via APIs as well as from the frontend UI. This enables users to concentrate on implementing operations (OPs) and orchestrating workflows.

For dflow's developers, dflow wraps the Argo SDK, hides the details of computing and storage resources from users, and provides extension abilities. Since Argo is a cloud-native workflow engine, dflow uses containers to decouple computing logic from scheduling logic, and uses Kubernetes to make workflows observable, reproducible and robust. Dflow is designed to run on distributed, heterogeneous infrastructure. The most common computing resources in scientific computing are probably HPC clusters. Users can either manage HPC jobs within dflow via an executor (dflow-extender or the DPDispatcher plugin), or use the virtual node technique to manage HPC resources uniformly within the Kubernetes framework (wlm-operator).

Dflow's OPs can be reused among workflows and shared among users. Dflow provides a cookiecutter recipe, dflow-op-cutter, for templating a new OP package. Start developing an OP package at once with

pip install cookiecutter
cookiecutter https://github.com/deepmodeling/dflow-op-cutter.git

Dflow provides a debug mode for running workflows bare-metal; its backend is implemented in pure Python within dflow, independent of Argo/Kubernetes. The debug mode uses the local environment to execute OPs instead of containers. It implements most APIs of the default mode in order to provide an identical user experience. The debug mode offers convenience for debugging or testing without containers. For clusters where deploying Docker and Kubernetes is problematic and outside access is difficult, the debug mode may also be used for production, despite less robustness and observability.

1. Overview

1.1. Architecture

Dflow consists of a common layer and an interface layer. The interface layer takes various OP templates from users, usually in the form of Python classes, and transforms them into base OP templates that the common layer can handle.

[Figure: dflow architecture]

1.2. Common layer

The common layer is an extension over the Argo client which provides functionalities such as file processing, computing resource management, workflow submission and management, etc.

1.2.1. Parameters and artifacts

Parameters and artifacts are data stored by the workflow and passed within the workflow. Parameters are saved as strings which can be displayed in the UI, while artifacts are saved as files.

1.2.2. OP template

OP template (shown as base OP in the figure above) is the fundamental building block of a workflow. It defines an operation to be executed given the input and output. Both the input and output can be parameters and/or artifacts. The most common OP template is the container OP template, for which the necessary arguments are the container image and the script to be executed. Currently, two types of container OP templates are supported: ShellOPTemplate and PythonScriptOPTemplate. A Shell OP template (ShellOPTemplate) defines an operation by a shell script, while a Python script OP template (PythonScriptOPTemplate) defines an operation by a Python script.

To use the ShellOPTemplate:

from dflow import ShellOPTemplate

simple_example_templ = ShellOPTemplate(
    name="Hello",
    image="alpine:latest",
    script="cp /tmp/foo.txt /tmp/bar.txt && echo {{inputs.parameters.msg}} > /tmp/msg.txt",
)

The above example defines a ShellOPTemplate with name "Hello" and container image alpine:latest. The operation copies /tmp/foo.txt (the input artifact) to /tmp/bar.txt (the output artifact), prints the value of the input parameter msg and redirects it to /tmp/msg.txt (the content of this file becomes the value of the output parameter).

To define the parameters and artifacts of this OPTemplate:

from dflow import InputParameter, InputArtifact, OutputParameter, OutputArtifact

# define input
simple_example_templ.inputs.parameters = {"msg": InputParameter()}
simple_example_templ.inputs.artifacts = {"inp_art": InputArtifact(path="/tmp/foo.txt")}
# define output
simple_example_templ.outputs.parameters = {
    "msg": OutputParameter(value_from_path="/tmp/msg.txt")
}
simple_example_templ.outputs.artifacts = {
    "out_art": OutputArtifact(path="/tmp/bar.txt")
}

In the above example, there are three things to clarify.

  1. The value of an input parameter is optional for the OP template; if provided, it is regarded as the default value, which can be overridden at run time.
  2. For an output parameter, the source of its value must be specified. For a container OP template, the value may come from a certain file generated in the container (value_from_path).
  3. The paths to the input and output artifacts in the container must be specified.

Similarly, one can define a PythonScriptOPTemplate to achieve the same operation, as sketched below.
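For comparison, here is a minimal sketch of the equivalent PythonScriptOPTemplate; it assumes the constructor mirrors ShellOPTemplate (name, image, script), and the script body is an illustrative Python translation of the shell command above.

from dflow import PythonScriptOPTemplate

simple_example_templ = PythonScriptOPTemplate(
    name="Hello",
    image="python:3.8",
    script="import shutil\n"
           "shutil.copy('/tmp/foo.txt', '/tmp/bar.txt')\n"
           "with open('/tmp/msg.txt', 'w') as f:\n"
           "    f.write('{{inputs.parameters.msg}}')\n",
)

The parameters and artifacts would then be defined in the same way as for the ShellOPTemplate above.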

1.2.3. Step

Step is the central block for building a workflow. A step is created by instantiating an OP template. When a step is initialized, values of all input parameters and sources of all input artifacts declared in the OP template must be specified.

from dflow import Step

simple_example_step = Step(
    name="step0",
    template=simple_example_templ,
    parameters={"msg": "HelloWorld!"},
    artifacts={"inp_art": foo},
)

This step instantiates the OP template created in 1.2.2. Note that foo is an artifact either uploaded from local files or produced as the output of another step.
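For example, foo could be prepared by uploading a local file (upload_artifact is described in section 3.1.2):

from dflow import upload_artifact

# upload a local file and use the returned artifact as the step's input
foo = upload_artifact("foo.txt")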

1.2.4. Workflow

Workflow is the connecting block for building a workflow. A workflow is created by adding steps together.

from dflow import Workflow

wf = Workflow(name="hello-world")
wf.add(simple_example_step)

Submit a workflow by

wf.submit()

One can also add a list of steps to a workflow to make them run in parallel

wf.add([hello2, hello3])

An example using all the elements discussed in 1.2 is shown below.
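The following is a minimal sketch assembling the snippets of this section into one script; it assumes a local file foo.txt to upload (upload_artifact is described in section 3.1.2).

from dflow import (InputArtifact, InputParameter, OutputArtifact,
                   OutputParameter, ShellOPTemplate, Step, Workflow,
                   upload_artifact)

# container OP template: copy the input artifact and write the input parameter to a file
simple_example_templ = ShellOPTemplate(
    name="Hello",
    image="alpine:latest",
    script="cp /tmp/foo.txt /tmp/bar.txt && echo {{inputs.parameters.msg}} > /tmp/msg.txt",
)
simple_example_templ.inputs.parameters = {"msg": InputParameter()}
simple_example_templ.inputs.artifacts = {"inp_art": InputArtifact(path="/tmp/foo.txt")}
simple_example_templ.outputs.parameters = {
    "msg": OutputParameter(value_from_path="/tmp/msg.txt")
}
simple_example_templ.outputs.artifacts = {"out_art": OutputArtifact(path="/tmp/bar.txt")}

# instantiate the template as a step and submit a one-step workflow
foo = upload_artifact("foo.txt")
simple_example_step = Step(
    name="step0",
    template=simple_example_templ,
    parameters={"msg": "HelloWorld!"},
    artifacts={"inp_art": foo},
)
wf = Workflow(name="hello-world")
wf.add(simple_example_step)
wf.submit()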

1.3. Interface layer

The interface layer handles more Python-native OPs defined in the form of classes.

1.3.1. Python OP

PythonOPTemplate is another kind of OP template. It inherits from PythonScriptOPTemplate but allows users to define the operation (OP) in the form of a Python class. As Python is a weakly typed language, we impose strict type checking on Python OPs to alleviate ambiguity and unexpected behaviors.

The structures of the inputs and outputs of a PythonOP are defined in the class methods get_input_sign and get_output_sign. Each of them returns an OPIOSign object, a dictionary mapping the name of each parameter/artifact to its sign.

The execution of the PythonOP is defined in the execute method. The execute method receives an OPIO object as input and returns an OPIO object as output. OPIO is a dictionary mapping the name of each parameter/artifact to its value/path. The type of a parameter value or artifact path should be in accord with that declared in the sign. Type checking is performed before and after the execute method.

from dflow.python import OP, OPIO, OPIOSign, Artifact
from pathlib import Path
import shutil


class SimpleExample(OP):
    def __init__(self):
        pass

    @classmethod
    def get_input_sign(cls):
        return OPIOSign(
            {
                "msg": str,
                "inp_art": Artifact(Path),
            }
        )

    @classmethod
    def get_output_sign(cls):
        return OPIOSign(
            {
                "msg": str,
                "out_art": Artifact(Path),
            }
        )

    @OP.exec_sign_check
    def execute(
        self,
        op_in: OPIO,
    ) -> OPIO:
        shutil.copy(op_in["inp_art"], "bar.txt")
        out_msg = op_in["msg"]
        op_out = OPIO(
            {
                "msg": out_msg,
                "out_art": Path("bar.txt"),
            }
        )
        return op_out

The above example defines an OP SimpleExample. The operation copies the input artifact inp_art to bar.txt and passes the input parameter msg through as the output parameter msg.

One may also define an OP using the decorator @OP.function and Python annotations, as:

from dflow.python import OP, Artifact
from pathlib import Path
import shutil

@OP.function
def SimpleExample(
    msg: str,
    inp_art: Artifact(Path),
) -> {"msg": str, "out_art": Artifact(Path)}:
    shutil.copy(inp_art, "bar.txt")
    out_msg = msg
    return {"msg": out_msg, "out_art": Path("bar.txt")}

We recommend Python >= 3.9 for this syntactic sugar. See more about Python annotations in the official Python HOWTOs.

To use the above class as a PythonOPTemplate, we need to pass it to PythonOPTemplate and specify the container image. Note that pydflow must be installed in this image.

from dflow.python import PythonOPTemplate

simple_example_templ = PythonOPTemplate(SimpleExample, image="python:3.8")

An example using all the elements discussed in 1.3 is shown below.
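The following is a minimal sketch combining the SimpleExample OP defined above with a step and a workflow; it assumes a local file foo.txt to upload.

from dflow import Step, Workflow, upload_artifact
from dflow.python import PythonOPTemplate

simple_example_templ = PythonOPTemplate(SimpleExample, image="python:3.8")

# msg is passed as the input parameter and foo.txt as the input artifact of the OP
foo = upload_artifact("foo.txt")
simple_example_step = Step(
    name="step0",
    template=simple_example_templ,
    parameters={"msg": "HelloWorld!"},
    artifacts={"inp_art": foo},
)

wf = Workflow(name="python-hello")
wf.add(simple_example_step)
wf.submit()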

2. Quick Start

2.1. Prepare Kubernetes cluster

Firstly, you will need a Kubernetes cluster. To set up a Kubernetes cluster on your laptop, you can install Minikube and make sure Docker is up and running on your PC.

After downloading, you can initiate the Kubernetes cluster using:

minikube start 

2.2. Setup Argo Workflows

To get started quickly, you can use the quick start manifest. It will install Argo Workflows as well as some commonly used components:

kubectl create ns argo
kubectl apply -n argo -f https://raw.githubusercontent.com/deepmodeling/dflow/master/manifests/quick-start-postgres.yaml

If you are running Argo Workflows locally (e.g. using Minikube or Docker for Desktop), open a port-forward so you can access the namespace:

kubectl -n argo port-forward deployment/argo-server 2746:2746 --address 0.0.0.0

This will serve the user interface on https://localhost:2746

For access to the MinIO object storage, open a port-forward for MinIO.

kubectl -n argo port-forward deployment/minio 9000:9000 --address 0.0.0.0

2.3. Install dflow

Make sure your Python version is 3.6 or above and install dflow

pip install pydflow

2.4. Run an example

Submit a simple workflow

python examples/test_steps.py

Then you can check the submitted workflow through Argo's UI.

3. User Guide (dflow-doc)

3.1. Common layer

3.1.1. Workflow management

After submitting a workflow by wf.submit(), or retrieving a history workflow by wf = Workflow(id="xxx"), one can track its real-time status with the following APIs (a usage sketch is given after the list):

  • wf.id: workflow ID in argo
  • wf.query_status(): query workflow status, return "Pending", "Running", "Succeeded", etc.
  • wf.query_step(name=None): query step by name (support for regex), return a list of argo step objects
    • step.phase: phase of a step, "Pending", "Running", "Succeeded", etc.
    • step.outputs.parameters: a dictionary of output parameters
    • step.outputs.artifacts: a dictionary of output artifacts
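A usage sketch combining these APIs, assuming a previously submitted workflow that contains a step named "hello":

import time
from dflow import Workflow

wf = Workflow(id="xxx")                      # ID of a previously submitted workflow
while wf.query_status() in ["Pending", "Running"]:
    time.sleep(10)                           # poll until the workflow finishes
steps = wf.query_step(name="hello")          # name supports regex; returns a list
if steps:
    print(steps[0].phase)                    # e.g. "Succeeded"
    print(steps[0].outputs.parameters)       # dictionary of output parameters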

3.1.2. Upload/download artifact

Dflow offers tools for uploading files to MinIO and downloading files from MinIO (the default object storage in the quick start). Users can upload a list of files or directories and get an artifact object, which can be used as an argument of a step

from dflow import Step, upload_artifact

artifact = upload_artifact([path1, path2])
step = Step(
    ...
    artifacts={"foo": artifact}
)

Users can also download the output artifact of a step queried from a workflow (to the current directory by default)

from dflow import download_artifact

step = wf.query_step(name="hello")[0]
download_artifact(step.outputs.artifacts["bar"])

Modify dflow.s3_config to configure S3 globally.

Note: dflow retains the relative path of the uploaded file/directory with respect to the current directory during uploading. If a file/directory outside the current directory is uploaded, its absolute path is used as the relative path in the artifact. If you want a directory structure in the artifact different from the local one, you can make soft links and then upload.

3.1.3. Steps

Steps is another kind of OP template which is defined by its constituent steps instead of a container. It can be seen as a sub-workflow, or a super OP template consisting of some smaller OPs. Steps is a sequential array of concurrent Steps; a simple example looks like [[s00,s01],[s10,s11,s12]], where the inner arrays represent concurrent steps while the outer array is sequential. Add a step to a Steps just like to a workflow

steps.add(step)

Steps can be used as the template of a bigger step, so one can construct complex workflows with nested structure. One is also allowed to recursively use a Steps as the template of a building block inside itself to achieve a dynamic loop. A sketch of the nested usage follows.
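As an illustration, a minimal sketch of building a Steps template and using it as the template of a bigger step; hello_templ is a hypothetical OP template with a single input parameter msg.

from dflow import Step, Steps, Workflow

# a Steps template made of two sequential steps sharing one OP template
inner = Steps(name="inner-steps")
inner.add(Step(name="s0", template=hello_templ, parameters={"msg": "first"}))
inner.add(Step(name="s1", template=hello_templ, parameters={"msg": "second"}))

# the Steps template itself serves as the template of a bigger step
wf = Workflow(name="nested")
wf.add(Step(name="outer", template=inner))
wf.submit()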

3.1.4. DAG

DAG is another kind of OP template which is defined by its constituent tasks and their dependencies. The usage of DAG is similar to that of steps. To add a task to a DAG, use

dag.add(task)

The usage of task is also similar to that of step. Dflow will automatically detect dependencies among tasks of a DAG (from input/output relations). Additional dependencies can be declared by

task_3 = Task(..., dependencies=[task_1, task_2])
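As an illustration, a minimal sketch of a DAG with three tasks, assuming the same hypothetical hello_templ as above; dependencies through input/output relations would be detected automatically, and here they are declared explicitly instead.

from dflow import DAG, Step, Task, Workflow

dag = DAG(name="diamond")
task_1 = Task(name="t1", template=hello_templ, parameters={"msg": "1"})
task_2 = Task(name="t2", template=hello_templ, parameters={"msg": "2"})
# task_3 waits for task_1 and task_2 because of the explicit dependencies
task_3 = Task(name="t3", template=hello_templ, parameters={"msg": "3"},
              dependencies=[task_1, task_2])
dag.add(task_1)
dag.add(task_2)
dag.add(task_3)

# like Steps, a DAG is an OP template and can serve as the template of a step
wf = Workflow(name="dag-example")
wf.add(Step(name="run-dag", template=dag))
wf.submit()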

3.1.5. Output parameters and artifacts of Steps

The output parameter of a Steps can be set to come from a step of it by steps.outputs.parameters["msg"].value_from_parameter = step.outputs.parameters["msg"]. Here, step must be contained in steps. To assign an output artifact of a Steps, use steps.outputs.artifacts["foo"]._from = step.outputs.artifacts["foo"].
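A sketch, assuming step is contained in steps and its template declares an output parameter msg and an output artifact foo:

from dflow import OutputArtifact, OutputParameter

# declare the outputs on the Steps template first
steps.outputs.parameters["msg"] = OutputParameter()
steps.outputs.artifacts["foo"] = OutputArtifact()
# then point them at the outputs of an inner step
steps.outputs.parameters["msg"].value_from_parameter = step.outputs.parameters["msg"]
steps.outputs.artifacts["foo"]._from = step.outputs.artifacts["foo"]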

3.1.6. Conditional step, parameters and artifacts

Set a step to be conditional by Step(..., when=expr), where expr is a boolean expression in string format, such as "%s < %s" % (par1, par2). The when argument is often used as the breaking condition of recursive steps. The output parameter of a Steps can be assigned conditionally by

steps.outputs.parameters["msg"].value_from_expression = if_expression(
    _if=par1 < par2,
    _then=par3,
    _else=par4
)

Similarly, the output artifact of a Steps can be assigned conditionally by

steps.outputs.artifacts["foo"].from_expression = if_expression(
    _if=par1 < par2,
    _then=art1,
    _else=art2
)

3.1.7. Produce parallel steps using loop

with_param and with_sequence are two arguments of Step for automatically generating a list of parallel steps. These steps share a common OP template and only differ in their input parameters.

A step using the with_param option generates parallel steps over a list (either a constant list or a reference to another parameter, e.g. an output parameter of another step or an input parameter of the enclosing Steps or DAG context); the parallelism equals the length of the list. Each parallel step picks an item from the list by "{{item}}", such as

step = Step(
    ...
    parameters={"msg": "{{item}}"},
    with_param=steps.inputs.parameters["msg_list"]
)

A step using the with_sequence option generates parallel steps over a numeric sequence. with_sequence is usually used in coordination with argo_sequence, which returns an Argo sequence. For argo_sequence, the number at which to start the sequence is specified by start (default: 0). One can specify either the number of elements in the sequence by count, or the number at which to end the sequence by end. A printf format string can be specified by format to format the values in the sequence. Each argument can be passed as a parameter; argo_len, which returns the length of a list, may be useful here. Each parallel step picks an element from the sequence by "{{item}}", such as

step = Step(
    ...
    parameters={"i": "{{item}}"},
    with_sequence=argo_sequence(argo_len(steps.inputs.parameters["msg_list"]))
)

3.1.8. Timeout

Set the timeout of a step by Step(..., timeout=t). The unit is seconds.

3.1.9. Continue on failed

Set the workflow to continue when a step fails by Step(..., continue_on_failed=True).

3.1.10. Continue on success number/ratio of parallel steps

Set the workflow to continue when certain number/ratio of parallel steps succeed by Step(..., continue_on_num_success=n) or Step(..., continue_on_success_ratio=r).

3.1.11. Optional input artifacts

Set an input artifact to be optional by op_template.inputs.artifacts["foo"].optional = True.

3.1.12. Default value for output parameters

Set default value for an output parameter by op_template.outputs.parameters["msg"].default = default_value. The default value will be used when the expression in value_from_expression fails or the step is skipped.

3.1.13. Key of a step

You can set a key for a step by Step(..., key="some-key") for the convenience of locating the step. The key can be regarded as an input parameter, which may contain references to other parameters. For instance, the key of a step can change with the iterations of a dynamic loop. Once a key is assigned to a step, the step can be queried by wf.query_step(key="some-key"). If the key is unique within the workflow, the query_step method returns a list consisting of only one element.
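A sketch, assuming a hypothetical OP template hello_templ and a loop counter parameter iter of the enclosing Steps; formatting the parameter into the key string makes the key change with iterations:

step = Step(
    name="hello",
    template=hello_templ,
    key="iter-%s" % steps.inputs.parameters["iter"],  # key references a parameter
)
...
# a unique key lets you locate the step directly after submission
assert len(wf.query_step(key="iter-3")) == 1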

3.1.14. Resubmit a workflow

Workflows often have some steps that are expensive to compute. The outputs of previously run steps can be reused when submitting a new workflow; e.g. a failed workflow can be restarted from a certain point after some modification of the workflow template or even of the outputs of completed steps. For example, submit a workflow with reused steps by wf.submit(reuse_step=[step0, step1]). Here, step0 and step1 are previously run steps returned by the query_step method. Before the new workflow runs a step, it detects whether there is a reused step whose key matches that of the step about to run. If so, the workflow skips the step and sets its outputs to those of the reused step. To modify the outputs of a step before reusing it, use step0.modify_output_parameter(par_name, value) for parameters and step0.modify_output_artifact(art_name, artifact) for artifacts.
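A sketch of restarting a workflow while reusing two completed steps; the keys step0-key and step1-key are hypothetical and must match the keys assigned when the workflow was built:

old_wf = Workflow(id="xxx")                     # the previously run workflow
step0 = old_wf.query_step(key="step0-key")[0]   # hypothetical keys assigned at build time
step1 = old_wf.query_step(key="step1-key")[0]
# optionally adjust an output before reuse
step1.modify_output_parameter("msg", "new value")

wf = Workflow(name="restarted")
# ... rebuild the workflow with the same keys on its steps ...
wf.submit(reuse_step=[step0, step1])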

3.1.15. Executor

By default, for a "script step" (a step whose template is a script OP template), the shell script or Python script runs directly in the container. Alternatively, one can use an executor to run the script. Dflow offers an extension point for script steps: Step(..., executor=my_executor). Here, my_executor should be an instance of a class derived from Executor. An Executor-derived class should implement a render method which converts the original template to a new template.

class Executor(object):
    def render(self, template):
        pass

A context is similar to an executor, but it is assigned to a workflow by Workflow(context=...) and affects every step.

3.1.16. Submit Slurm job via slurm executor

SlurmRemoteExecutor is provided as an example of an executor. The executor submits a Slurm job to a remote host and synchronizes its status and logs to the dflow step. The central logic of the executor is implemented in the Golang project dflow-extender. If you want to run a step on a Slurm cluster remotely, do something like

Step(
    ...,
    executor=SlurmRemoteExecutor(host="1.2.3.4",
        username="myuser",
        header="""#!/bin/bash
                  #SBATCH -N 1
                  #SBATCH -n 1
                  #SBATCH -p cpu""")
)

There are three options for SSH authentication: using a password, specifying the path of a private key file locally, or uploading an authorized private key to each node (or equivalently adding each node to the authorized host list).

3.1.17. Submit HPC job via dispatcher plugin

DPDispatcher is a Python package that generates input scripts for HPC scheduler systems (Slurm/PBS/LSF), submits them to HPC systems and polls until they finish. Dflow provides a simple interface to invoke the dispatcher as an executor to complete script steps, e.g.

from dflow.plugins.dispatcher import DispatcherExecutor
Step(
    ...,
    executor=DispatcherExecutor(host="1.2.3.4",
        username="myuser",
        queue_name="V100")
)

For SSH authentication, one can either specify the path of a private key file locally, or upload an authorized private key to each node (or equivalently add each node to the authorized host list). To configure extra machine, resources or task parameters for the dispatcher, use DispatcherExecutor(..., machine_dict=m, resources_dict=r, task_dict=t).

3.1.18. Submit Slurm job via virtual node

Follow the installation steps in the wlm-operator project to add Slurm partitions as virtual nodes to Kubernetes (use the manifests configurator.yaml, operator-rbac.yaml and operator.yaml in this project, which modify some RBAC configurations):

$ kubectl get nodes
NAME                            STATUS   ROLES                  AGE    VERSION
minikube                        Ready    control-plane,master   49d    v1.22.3
slurm-minikube-cpu              Ready    agent                  131m   v1.13.1-vk-N/A
slurm-minikube-dplc-ai-v100x8   Ready    agent                  131m   v1.13.1-vk-N/A
slurm-minikube-v100             Ready    agent                  131m   v1.13.1-vk-N/A

Then you can assign a step to be executed on a virtual node (i.e. submit a Slurm job to the corresponding partition to complete the step)

step = Step(
    ...
    executor=SlurmJobTemplate(
        header="#!/bin/sh\n#SBATCH --nodes=1",
        node_selector={"kubernetes.io/hostname": "slurm-minikube-v100"}
    )
)

3.1.19. Use resources in Kubernetes

A step can also be completed by a Kubernetes resource (e.g. a Job or a custom resource). At the beginning, a manifest is applied to Kubernetes. Then the status of the resource is monitored until the success condition or the failure condition is satisfied.

class Resource(object):
    action = None
    success_condition = None
    failure_condition = None
    def get_manifest(self, template):
        pass

3.1.20. Important note: variable names

Dflow has following restrictions on variable names.

| Variable name | Static/Dynamic | Restrictions | Example |
| --- | --- | --- | --- |
| Workflow/OP template name | Static | Lowercase RFC 1123 subdomain (must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character) | my-name |
| Step/Task name | Static | Must consist of alpha-numeric characters or '-', and must start with an alpha-numeric character | My-name1-2, 123-NAME |
| Parameter/Artifact name | Static | Must consist of alpha-numeric characters, '_' or '-' | my_param_1, MY-PARAM-1 |
| Key name | Dynamic | Lowercase RFC 1123 subdomain (must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character) | my-name |

3.1.21. Debug mode: dflow independent of Kubernetes

The debug mode is enabled by setting

from dflow import config
config["mode"] = "debug"

Before running a workflow locally, make sure that the dependencies of all OPs in the workflow are well-configured in the local environment, unless the dispatcher executor is employed to submit jobs to some remote environment. The debug mode uses the current directory as the working directory by default. Each workflow creates a new directory there, whose structure is like

python-lsev6
├── status
└── step-penf5
    ├── inputs
    │   ├── artifacts
    │   │   ├── dflow_python_packages
    │   │   ├── foo
    │   │   └── idir
    │   └── parameters
    │       ├── msg
    │       └── num
    ├── log.txt
    ├── outputs
    │   ├── artifacts
    │   │   ├── bar
    │   │   └── odir
    │   └── parameters
    │       └── msg
    ├── phase
    ├── script
    ├── type
    └── workdir
        ├── ...

The top level contains the status and all steps of the workflow. The directory name for each step will be its key if provided, or generated from its name otherwise. The step directory contains the input/output parameters/artifacts, the type and the phase of the step. For a step of type "Pod", its directory also includes the script, the log file and the working directory for the step.

3.2. Interface layer

3.2.1. Slices

Slices helps users slice input parameters/artifacts (which must be lists) to feed parallel steps, and stack their output parameters/artifacts into lists following the same pattern. For example,

step = Step(name="parallel-tasks",
    template=PythonOPTemplate(
        ...,
        slices=Slices("{{item}}",
            input_parameter=["msg"],
            input_artifact=["data"],
            output_artifact=["log"])
    ),
    parameters = {
        "msg": msg_list
    },
    artifacts={
        "data": data_list
    },
    with_param=argo_range(5)
)

In this example, each item in msg_list is passed to a parallel step as the input parameter msg, and each part in data_list is passed to a parallel step as the input artifact data. Finally, the output artifacts log of all parallel steps are collected into one artifact, step.outputs.artifacts["log"].

It should be noted that this feature by default passes the full input artifacts to each parallel step, which may use only some slices of these artifacts. In comparison, the subpath mode of slices passes only a single slice of the input artifacts to each parallel step. To use the subpath mode of slices,

step = Step(name="parallel-tasks",
    template=PythonOPTemplate(
        ...,
        slices=Slices(sub_path=True,
            input_parameter=["msg"],
            input_artifact=["data"],
            output_artifact=["log"])
    ),
    parameters = {
        "msg": msg_list
    },
    artifacts={
        "data": data_list
    })

Here, the slice pattern ({{item}}) of PythonOPTemplate and the with_param argument of the Step need not be set, because they are fixed in this mode. Every input parameter and artifact to be sliced must be of the same length, and the parallelism equals this length. Another noticeable point is that, in order to use the subpath of the artifacts, these artifacts must be saved without compression when they are generated; e.g. declare Artifact(..., archive=None) in the output signs of the Python OP, or specify upload_artifact(..., archive=None) while uploading artifacts. Besides, one can use dflow.config["archive_mode"] = None to set the default archive mode to no compression globally.

3.2.2. Retry and error handling

Dflow catches TransientError and FatalError raised from an OP. Users can set the maximum number of retries on TransientError by PythonOPTemplate(..., retry_on_transient_error=n). A timeout error is regarded as a fatal error by default. To treat timeout errors as transient errors, set PythonOPTemplate(..., timeout_as_transient_error=True).
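As an illustration, a minimal sketch of an OP that raises TransientError and a template that retries it; this assumes TransientError can be imported from dflow.python.

from dflow.python import OP, OPIO, OPIOSign, PythonOPTemplate, TransientError
import random


class FlakyOP(OP):
    @classmethod
    def get_input_sign(cls):
        return OPIOSign({})

    @classmethod
    def get_output_sign(cls):
        return OPIOSign({})

    @OP.exec_sign_check
    def execute(self, op_in: OPIO) -> OPIO:
        if random.random() < 0.5:
            # a transient failure: the step will be retried up to 3 times
            raise TransientError("temporary failure, please retry")
        return OPIO({})


templ = PythonOPTemplate(FlakyOP, image="python:3.8",
                         retry_on_transient_error=3)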

3.2.3. Progress

An OP can update its progress at runtime so that users can track its real-time progress

class Progress(OP):
    progress_total = 100
    ...
    def execute(self, op_in):
        for i in range(10):
            self.progress_current = 10 * (i + 1)
            ...

3.2.4. Upload python packages for development

To avoid frequently rebuilding images during development, dflow offers an interface to upload local packages into the container and add them to $PYTHONPATH, such as PythonOPTemplate(..., python_packages=["/opt/anaconda3/lib/python3.9/site-packages/numpy"]). One can also globally specify packages to be uploaded, which will affect all OPs

from dflow.python import upload_packages
upload_packages.append("/opt/anaconda3/lib/python3.9/site-packages/numpy")
