Yet another pythonic workflow engine

flowfish

🐍 Installation

  • Operating system: macOS / OS X · Linux · Windows
  • Python version: Python 3.7+
  • Package managers: pip
pip install flowfish

✨ Getting started

This getting-started tutorial demonstrates how things work. It is not a real-world example; we just want to sum up some numbers.

First, we define a function add() that takes two arguments, a and b, and adds them together. Then we assign it to a node named sum using a JSON config. Finally, we set the default values for a and b to 3 and 4. Remember, it is just an example.

from flowfish import flow

def add(a, b):
    return a + b

f = flow({
    "math": {
        "sum@add": {
            "a": 3,
            "b": 4
        }
    }
})

Now we call the sum() function and the default values for a and b are applied implicitly.

f.math.sum()

👉 7

Now we call sum() with a and b set explicitly.

f.math.sum(5, 6)

👉 11
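
One way to picture how configured defaults and explicit arguments interact is the sketch below. It is plain Python and does not use flowfish; make_node is a hypothetical helper, and the library itself may bind arguments differently.

```python
import inspect

def make_node(func, defaults):
    """Return a callable that fills in missing arguments from `defaults`."""
    signature = inspect.signature(func)

    def node(*args, **kwargs):
        # Bind whatever the caller passed explicitly ...
        bound = signature.bind_partial(*args, **kwargs)
        # ... then fill the remaining parameters from the configured defaults.
        for name, value in defaults.items():
            bound.arguments.setdefault(name, value)
        return func(*bound.args, **bound.kwargs)

    return node

def add(a, b):
    return a + b

# Mirrors the "sum@add" node with defaults a=3 and b=4.
sum_node = make_node(add, {"a": 3, "b": 4})
print(sum_node())      # 7
print(sum_node(5, 6))  # 11
```

Explicit arguments win over the configured defaults, which matches the two calls shown above.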

Now we replace our custom add() function with Python's built-in sum() function, which has a slightly different signature: sum(iterable, start=0).

from flowfish import flow

f = flow({
    "math": {
        "sum": {
            "iterable": [3, 4]
        }
    }
})
f.math.sum()

👉 7

Now we connect some nodes together and build our first flow. As already mentioned, a node is actually a Python function. So when we connect nodes together, we connect functions together. If we want to connect a node with a value, we can just assign the value to a node parameter or we can use the built-in flow function map() that takes the value as input and simply returns it.

from flowfish import flow

f = flow({
    "math": {
        "number_one@map": {
            "input": 3
        },
        "number_two@map": {
            "input": 4
        },
        "sum": {
            "iterable": ["@number_one", "@number_two"]
        }
    }
})
f.math.sum()

👉 7
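
For comparison, here is what this flow computes, written as ordinary function calls. It is a plain-Python sketch that does not use flowfish; identity is a hypothetical stand-in for the built-in flow function map(), which the text above describes as returning its input unchanged.

```python
def identity(value):
    # Stand-in for the flow function map(): return the input unchanged.
    return value

number_one = identity(3)
number_two = identity(4)

# "iterable": ["@number_one", "@number_two"] feeds both results into sum().
total = sum([number_one, number_two])
print(total)  # 7
```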

Now we visualize the flow graph.

-f.math.sum

[Image: SVG rendering of the flow graph]

📚 Usage

Scopes and Nodes

The flow is configured in JSON format and consists of scopes and nodes. A scope is a group of nodes and a node is just an alias for a pure Python function.

A basic flow configuration looks like this:

{
    "scope": {
        "node": {
        }
    }
}
  • scope and node names may only contain ASCII letters, digits or underscores
  • config keys starting with # are treated as comments and ignored, which is useful for temporarily disabling nodes or scopes
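
The two rules above can be sketched in a few lines of plain Python. This is only an illustration of the behaviour described here, not the library's code; strip_comments and is_valid_name are hypothetical names.

```python
import re

# ASCII letters, digits or underscores only.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_]+")

def strip_comments(config):
    """Recursively drop every key that starts with '#'."""
    if not isinstance(config, dict):
        return config
    return {
        key: strip_comments(value)
        for key, value in config.items()
        if not key.startswith("#")
    }

def is_valid_name(name):
    """True if `name` consists only of ASCII letters, digits or underscores."""
    return NAME_PATTERN.fullmatch(name) is not None

config = {
    "math": {
        "sum": {"iterable": [3, 4]},
        "#product": {"iterable": [3, 4]},  # temporarily disabled node
    },
    "#experimental": {},                   # temporarily disabled scope
}

print(strip_comments(config))       # {'math': {'sum': {'iterable': [3, 4]}}}
print(is_valid_name("number_one"))  # True
print(is_valid_name("number-one"))  # False
```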

Scope and node inheritance

Scopes and nodes can inherit their properties from other scopes and nodes by using the @ notation.

{
    "example": {
        "foo": {
        },
        "bar@foo": {
        }
    }
}

A scope can inherit from:

  • another scope from the current flow
  • another scope from an external config file, e.g. "../foo.json#foo"

A node can inherit from:

  • another node from the current scope
  • a function from some Python module, e.g. "sklearn.model_selection.train_test_split"
  • a function from the __main__ module
  • a built-in Python function, e.g. "open"
  • a class (here the constructor is considered as function), e.g. "foo.bar.FooBar"
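
To make the lookup more concrete, here is a minimal sketch of how a name after the @ could be turned into a Python callable. It covers only built-ins and dotted module paths; lookups in the current scope or in __main__ are left out. split_key and resolve_callable are hypothetical helpers, and treating a key without @ as its own function name is an assumption based on the "sum" example in the tutorial.

```python
import builtins
import importlib

def split_key(key):
    """Split a config key such as 'bar@foo' into ('bar', 'foo')."""
    name, _, parent = key.partition("@")
    # Assumption: without '@', the node name doubles as the function name.
    return name, parent or name

def resolve_callable(dotted_name):
    """Look up a callable by name: a built-in, or 'package.module.attr'."""
    if "." not in dotted_name:
        return getattr(builtins, dotted_name)
    module_name, _, attr = dotted_name.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, attr)

print(split_key("sum@add"))                    # ('sum', 'add')
print(resolve_callable("open") is open)        # True
print(resolve_callable("math.sqrt")(9.0))      # 3.0
# Classes resolve the same way; calling the result runs the constructor.
print(resolve_callable("collections.OrderedDict")(a=1))
```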

Property assignment

Nodes can get their property values from the return values of other nodes.

Function results

A leading @ assigns the return value of a node function to a node property.

{
    "example": {
        "foo": {
        },
        "bar": {
            "foo": "@foo"
        }
    }
}
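
The idea can be sketched as a tiny evaluator in plain Python. This is not how flowfish is implemented; run_node and the two node functions are hypothetical, and the sketch has no caching or cycle detection.

```python
def run_node(name, nodes, functions):
    """Call the function behind `name`, resolving '@other' properties first."""
    kwargs = {}
    for prop, value in nodes[name].items():
        if isinstance(value, str) and value.startswith("@"):
            # '@foo' means: call node 'foo' and pass its return value.
            value = run_node(value[1:], nodes, functions)
        kwargs[prop] = value
    return functions[name](**kwargs)

# Hypothetical node functions for the config shown above.
def foo():
    return 21

def bar(foo):
    return foo * 2

nodes = {"foo": {}, "bar": {"foo": "@foo"}}
functions = {"foo": foo, "bar": bar}

print(run_node("bar", nodes, functions))  # 42
```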

Paths

A / after the node name assigns the node's path. Optionally, a further path can be appended after the slash.

{
    "example": {
        "foo": {
        },
        "bar": {
            "path": "@foo/test.text"
        }
    }
}

References

A leading & assigns a reference to the node function itself as opposed to the result value.

{
    "example": {
        "foo": {
        },
        "bar": {
            "path": "&foo"
        }
    }
}

Quoting

String literals starting with one of the reserved characters @ or & must be escaped by doubling that character (e.g. @@ or &&).
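
Taken together, the @, /, & and doubling rules amount to a small parsing step for property values. The sketch below shows one possible way to tell the cases apart; classify and its return labels are hypothetical, and it only parses the strings without resolving anything.

```python
def classify(value):
    """Return a (kind, payload) pair for a property value."""
    if not isinstance(value, str):
        return ("literal", value)
    if value.startswith(("@@", "&&")):
        # Doubled marker: a plain string that starts with '@' or '&'.
        return ("literal", value[1:])
    if value.startswith("@"):
        # '@node' or '@node/sub/path'
        node, slash, subpath = value[1:].partition("/")
        if slash:
            return ("path", (node, subpath))
        return ("result", node)
    if value.startswith("&"):
        return ("reference", value[1:])
    return ("literal", value)

print(classify("@foo"))            # ('result', 'foo')
print(classify("@foo/test.text"))  # ('path', ('foo', 'test.text'))
print(classify("&foo"))            # ('reference', 'foo')
print(classify("@@home"))          # ('literal', '@home')
```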

Result caching

Node results are cached by default. Nodes with the _dump property set are pickled to a .dump file and do not need to be reprocessed when called later.
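
The general technique behind such a dump file looks roughly like this. It is a sketch using the standard pickle module, not the library's code; cached_call and the file name are hypothetical.

```python
import pickle
import tempfile
from pathlib import Path

def cached_call(func, dump_file):
    """Return func()'s result, reusing a pickled copy if one exists."""
    dump_file = Path(dump_file)
    if dump_file.exists():
        with dump_file.open("rb") as handle:
            return pickle.load(handle)
    result = func()
    with dump_file.open("wb") as handle:
        pickle.dump(result, handle)
    return result

calls = []

def expensive():
    calls.append(1)  # record every real invocation
    return sum(range(1000))

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "expensive.dump"    # hypothetical file name
    first = cached_call(expensive, path)
    second = cached_call(expensive, path)  # loaded from disk, not recomputed

print(first, second, len(calls))  # 499500 499500 1
```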

Progress bars

Nodes with the _tqdm property set are wrapped with a tqdm progress bar if their functions are Python generators.
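
A minimal sketch of that kind of wrapping is shown below. with_progress is a hypothetical helper, not part of flowfish; the fallback only keeps the example runnable when tqdm is not installed.

```python
import functools
import inspect

try:
    from tqdm import tqdm
except ImportError:
    # Fallback so the sketch runs without tqdm: no progress bar at all.
    def tqdm(iterable, **kwargs):
        return iterable

def with_progress(func):
    """Wrap generator functions in a progress bar; leave others untouched."""
    if not inspect.isgeneratorfunction(func):
        return func

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return tqdm(func(*args, **kwargs), desc=func.__name__)

    return wrapper

def squares(n):
    for i in range(n):
        yield i * i

def double(x):
    return 2 * x

print(list(with_progress(squares)(4)))  # [0, 1, 4, 9]
print(with_progress(double)(21))        # 42
```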

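Property overriding

A scope-level _props entry overrides properties of the nodes in that scope. Each key has the form node.property; in the following example, language is set to "klingon" for both the tokenizer and the analyzer node.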

{
    "example": {
        "_props": {
            "tokenizer.language": "klingon",
            "analyzer.language": "klingon"
        },
        "tokenizer": {
        },
        "analyzer": {
        }
    }
}
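
The effect of such a _props block can be pictured with the following plain-Python sketch. apply_props is a hypothetical helper, and the sketch assumes that an override replaces a value the node already defines.

```python
import copy

def apply_props(scope):
    """Return a copy of `scope` with '_props' entries written into the nodes."""
    result = copy.deepcopy(scope)
    overrides = result.pop("_props", {})
    for dotted_key, value in overrides.items():
        # 'tokenizer.language' -> node 'tokenizer', property 'language'
        node, _, prop = dotted_key.partition(".")
        result.setdefault(node, {})[prop] = value
    return result

scope = {
    "_props": {
        "tokenizer.language": "klingon",
        "analyzer.language": "klingon",
    },
    "tokenizer": {"language": "english"},
    "analyzer": {},
}

print(apply_props(scope))
# {'tokenizer': {'language': 'klingon'}, 'analyzer': {'language': 'klingon'}}
```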

Flow command line tool

% flow
usage: flow [-h] {run,agent,push,pull,prune} ...

optional arguments:
  -h, --help            show this help message and exit

command:
  {run,agent,push,pull,prune}
    run                 run flow
    agent               start agent
    push                push data to sync_dir
    pull                pull data from sync_dir
    prune               prune files in data_dir

License

See LICENSE.
