Workfloz

A simple library for building complex workflows.

Workfloz is meant to be very easy to use, abstracting away most of the complexity involved in building workflows. The complexity lives in extensions, while the user works with a clean, easy-to-learn syntax.

Installing

pip install workfloz

Vision

Although Workfloz is designed as a general-purpose tool, the first set of extensions will target machine learning. Once stable, the library should be able to run code like the following:

# 1. Instantiate tools provided by extension
loader = CSVLoader("loader", file="data.csv") # Set as concrete directly.
processor = DataProcessor("processor")
processors = Pipeline("processors", processor.remove_duplicates())
builder = Abstract("builder") # Set as abstract and set concrete later.
trainer = ModelTrainer("trainer", auto_from=builder) # Automatically choose right trainer based on builder.
mlf_logger = MLFlowLogger("mlflogger", url="http://...")
file_logger = FileLogger("filelogger", dir="logs/")

# 2. Build workflow template
with Job("Machine Learning") as ML:

    with Task("prepare inputs", mode="async"): # 'async' applies on a line basis
        loader.load() | processors.run() > trainer.data
        builder.build() > trainer.model
    
    with Task("train", mode="async"):
        trainer.train()
        when("training_started", trainer) >> [mlf_logger.log_parameters(), file_logger.log_parameters()]
        when("epoch_ended", trainer) >> [mlf_logger.log_metrics(), file_logger.log_metrics()]
        when("training_ended", trainer) >> [mlf_logger.log_model(), file_logger.log_model()]
              
# 3. Define different Workflows from base template above.
forest10 = Job("forest-10", blueprint=ML)
# Set missing concrete strategies
forest10["builder"] = SKLForestBuilder(num_estimators=10)

forest50 = Job("forest-50", blueprint=ML)
forest50["builder"] = SKLForestBuilder(num_estimators=50)

forest50s = Job("forest-50s", blueprint=forest50)
# Add processor to Pipeline
processors.then(processor.Scale())

# 4. Start workflows	
forest10.start()
forest50.start()
forest50s.start()

In practice, steps 1 and 2 could be provided by an extension; the end user would only need to write steps 3 and 4. Extensions for scikit-learn, Hugging Face and MLflow are planned.
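The `when("event", source) >> [handlers]` syntax above is not implemented yet. As a rough illustration of the underlying idea, here is a minimal observer-pattern sketch in plain Python, independent of workfloz; all names here (`Emitter`, `When`, `on`, `emit`) are hypothetical and not part of the library's API:

```python
# Minimal observer-pattern sketch: an emitter publishes named events,
# and every handler subscribed to an event runs when it fires.
# Illustration only, not workfloz's actual implementation.

from collections import defaultdict


class Emitter:
    """Anything that publishes named events (e.g. a trainer)."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, event, handlers):
        self._handlers[event].extend(handlers)

    def emit(self, event, **payload):
        for handler in self._handlers[event]:
            handler(**payload)


class When:
    """Mimics `when("event", source) >> [handler, ...]`."""

    def __init__(self, event, source):
        self.event, self.source = event, source

    def __rshift__(self, handlers):
        self.source.on(self.event, handlers)
        return handlers


def when(event, source):
    return When(event, source)


# Usage: wire two loggers to a trainer's "epoch_ended" event.
trainer = Emitter()
log = []
when("epoch_ended", trainer) >> [
    lambda **kw: log.append(("mlflow", kw["epoch"])),
    lambda **kw: log.append(("file", kw["epoch"])),
]
trainer.emit("epoch_ended", epoch=1)
print(log)  # [('mlflow', 1), ('file', 1)]
```

Overloading `__rshift__` is what lets a list of handlers be attached with the `>>` operator, which is presumably the mechanism behind the planned syntax.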

Status of current version

The library is under active development, and it will take some time before the example above can run. The API should not be considered stable before v1.0.0 is released.
The following example already works, though (available in '/examples'):

import pandas as pd

from workfloz import ActionContainer
from workfloz import Job
from workfloz import Parameter
from workfloz import result
from workfloz import StringValidator


# Define tool
class CSVLoader(ActionContainer):  # Every method becomes an 'Action'
    """Return a pandas DataFrame from a CSV file."""

    # Attributes can be validated and documented
    file: str = Parameter(
        doc="The relative or absolute path.", validators=[StringValidator(max_len=50)]
    )
    separator: str = Parameter(default=",")

    def load(
        self, file, separator
    ):  # Arguments are filled in from the Parameters above if not passed in the call.
        return pd.read_csv(file, sep=separator)


# Instantiate tool
loader = CSVLoader("loader", file="iris.csv")
assert loader.file == "iris.csv"  # Attribute file is set on loader

# Define workflow
with Job("load data") as job:
    # A call to an 'Action' is recorded and will be executed on 'start'
    data = loader.load()
    # data = loader.load(separator=";")  # Attributes can be overridden, for this call only

# start Job and check result
job.start()
print(result(data))
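The record-then-run behaviour shown above (calls inside a `with Job(...)` block are recorded and only executed on `start()`) can be sketched in plain Python. This is a simplified illustration under assumed semantics, not workfloz's real internals; names and signatures here (`record`, the two-argument `result`) are hypothetical:

```python
# Simplified sketch of deferred execution: calls made inside the Job
# context are recorded as (function, args) pairs and only run on start().
# Illustration only; workfloz's actual machinery is more involved.

class Job:
    _current = None  # Job currently open as a context manager

    def __init__(self, name):
        self.name = name
        self._recorded = []  # pending (func, args, kwargs, slot) tuples
        self._results = {}   # slot -> value, filled in by start()

    def __enter__(self):
        Job._current = self
        return self

    def __exit__(self, *exc):
        Job._current = None

    def record(self, func, *args, **kwargs):
        slot = len(self._recorded)  # handle used to fetch the result later
        self._recorded.append((func, args, kwargs, slot))
        return slot

    def start(self):
        for func, args, kwargs, slot in self._recorded:
            self._results[slot] = func(*args, **kwargs)


def result(job, slot):
    return job._results[slot]


# Usage: nothing runs inside the block; start() replays the recording.
with Job("load data") as job:
    data = job.record(lambda: list(range(3)))

job.start()
print(result(job, data))  # [0, 1, 2]
```

The point of the pattern is that `data` is only a handle while the workflow is being defined; the actual value exists after `start()`, which is why the real example retrieves it with `result(data)`.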
