Run Metaflow flows as Windmill workflows

metaflow-windmill

Run any Metaflow flow on Windmill without rewriting your pipeline.

The problem

Windmill is a powerful open-source orchestrator, but its native scripting model has no concept of Metaflow steps, data artifacts, or retry semantics. You either maintain two separate codebases — one for Metaflow development and one for Windmill production — or you lose Metaflow's versioning, lineage, and @retry guarantees entirely. There is no off-the-shelf bridge that compiles a Metaflow flow into a Windmill workflow while preserving all of its runtime behavior.

Quick start

pip install metaflow-windmill
python flow.py windmill create --windmill-host http://localhost:8000 --windmill-token $TOKEN
python flow.py windmill trigger --windmill-host http://localhost:8000 --windmill-token $TOKEN
# Deploying HelloFlow...
# Flow deployed successfully to Windmill.
# Job started: http://localhost:8000/run/01927f3a-...?workspace=admins

Install

pip install metaflow-windmill

From source:

git clone https://github.com/npow/metaflow-windmill
cd metaflow-windmill
pip install -e .

Usage

Deploy and trigger in one step:

python flow.py windmill run \
  --windmill-host http://localhost:8000 \
  --windmill-token $TOKEN \
  --windmill-workspace admins
# Compiling HelloFlow...
# Flow deployed successfully to Windmill.
# Triggering execution...
# Job started: http://localhost:8000/run/01927f3a-...?workspace=admins
# Job is running...
# Job 01927f3a-... completed successfully.

Deploy once, trigger many times with parameters:

python flow.py windmill create --windmill-token $TOKEN
python flow.py windmill trigger --windmill-token $TOKEN \
  --run-param message=hello --run-param iterations=5

Programmatic API:

from metaflow import Deployer

with Deployer("flow.py") as d:
    df = d.windmill().create(
        windmill_host="http://localhost:8000",
        windmill_token="my-token",
    )
    run = df.trigger(message="hello")
    print(run.status)          # RUNNING / SUCCEEDED / FAILED
    print(run.windmill_ui)     # http://localhost:8000/run/<job-id>?workspace=admins
    print(run.run.successful)  # True once the run has finished successfully
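
Because a freshly triggered run normally starts out as RUNNING, a caller usually has to wait for a terminal status before reading the result. The helper below is a minimal sketch of such a wait loop. `wait_for_terminal` is a hypothetical name, not part of metaflow-windmill, and the sketch assumes that reading `run.status` returns a fresh value on each access.

```python
import time

# Status strings taken from the comment in the example above.
TERMINAL_STATES = {"SUCCEEDED", "FAILED"}


def wait_for_terminal(get_status, interval=2.0, timeout=600.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until it returns a terminal state or the timeout expires.

    get_status is any zero-argument callable returning a status string,
    for example `lambda: run.status` (assumed to refresh on each access).
    """
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"run still {status!r} after {timeout} seconds")
        sleep(interval)
```

With the Deployer example above this could be called as `wait_for_terminal(lambda: run.status)`. The `sleep` and `clock` parameters exist only so the loop can be exercised without real delays.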

How it works

Each Metaflow step becomes a Windmill flow module that runs a bash script. The bash script calls python flow.py step <step_name> with --run-id, --task-id, and --retry-count derived from Windmill's native WM_FLOW_RETRY_COUNT environment variable. Branch/join steps compile to Windmill branchall modules; foreach steps compile to forloopflow modules with a configurable max_workers concurrency limit. The Metaflow run ID is pre-computed before triggering and threaded through every step so Metaflow's local datastore and the Windmill job ID stay in sync.
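
To make the per-step command concrete, here is a rough sketch of how such a command line could be assembled. `build_step_command` is a hypothetical helper, not the package's actual code. The real command very likely carries more options than the three named above. For simplicity the sketch reads the retry count in Python, whereas a generated bash script would read WM_FLOW_RETRY_COUNT when the step runs.

```python
import os
import shlex


def build_step_command(flow_file, step_name, run_id, task_id, env=None):
    """Return a shell command that runs one Metaflow step (illustrative only)."""
    env = os.environ if env is None else env
    # Assumption: the variable is unset on the first attempt, so default to 0.
    retry_count = int(env.get("WM_FLOW_RETRY_COUNT", "0"))
    args = [
        "python", flow_file, "step", step_name,
        "--run-id", str(run_id),
        "--task-id", str(task_id),
        "--retry-count", str(retry_count),
    ]
    # shlex.join quotes each argument so the result is safe to paste into bash.
    return shlex.join(args)


if __name__ == "__main__":
    # Every step receives the same pre-computed run ID.
    for task_id, step in enumerate(["start", "end"], start=1):
        print(build_step_command("flow.py", step, "windmill-demo", task_id))
```

Passing the same run ID to every call mirrors the point made above: the ID is fixed before the flow is triggered, so all steps write to the same Metaflow run.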

Supported graph patterns:

| Metaflow pattern | Windmill module |
| --- | --- |
| Linear steps | rawscript sequence |
| Split / join (static branches) | branchall |
| ForEach | forloopflow |
| Nested ForEach | nested forloopflow |
| @condition (conditional split) | branchone |
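
As an illustration of the first three rows, the sketch below builds the corresponding module dictionaries in Python. The field names follow Windmill's OpenFlow format as I understand it and should be checked against the Windmill specification. The helper names, the `results.start.items` iterator expression, and the shortened step commands are assumptions made for this example, not output from metaflow-windmill.

```python
def rawscript(step_id, command):
    """A single bash module, the shape used for an ordinary step."""
    return {
        "id": step_id,
        "value": {"type": "rawscript", "language": "bash", "content": command},
    }


def branchall(step_id, branches):
    """Run every branch; each branch is a list of modules (static split / join)."""
    return {
        "id": step_id,
        "value": {
            "type": "branchall",
            "parallel": True,
            "branches": [{"modules": modules} for modules in branches],
        },
    }


def forloopflow(step_id, iterator_expr, body, max_workers=10):
    """Run the body once per item, with at most max_workers items in flight."""
    return {
        "id": step_id,
        "value": {
            "type": "forloopflow",
            "iterator": {"type": "javascript", "expr": iterator_expr},
            "parallel": True,
            "parallelism": max_workers,
            "skip_failures": False,
            "modules": body,
        },
    }


# Hypothetical flow: start -> foreach(process) -> join -> end.
# Commands are shortened; real ones would also carry run and task IDs.
modules = [
    rawscript("start", "python flow.py step start"),
    forloopflow(
        "process_loop",
        "results.start.items",
        [rawscript("process", "python flow.py step process")],
    ),
    rawscript("join", "python flow.py step join"),
    rawscript("end", "python flow.py step end"),
]
```

A nested ForEach would simply place another `forloopflow` module inside the `body` list of the outer one.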

For @condition splits, the split step emits {"branch": "<step_name>"} to stdout after running. The subsequent branchone module uses results.<split_step>.branch === '<step_name>' as the predicate so Windmill routes to the correct branch at runtime.
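
The two halves of that handshake can be sketched as follows: one function produces the JSON line the split step prints, the other builds the predicate string for each branch. The function names and the `branchone` dictionary layout are assumptions for illustration (the layout follows Windmill's OpenFlow format as I understand it). They are not taken from the package's source.

```python
import json


def branch_result(step_name):
    """JSON line the split step would print so Windmill records it as the result."""
    return json.dumps({"branch": step_name})


def branch_predicate(split_step, step_name):
    """JavaScript expression Windmill evaluates to decide whether to take a branch."""
    return f"results.{split_step}.branch === '{step_name}'"


def branchone(split_step, branches):
    """Build a branchone module; `branches` maps each next step to its modules."""
    return {
        "id": f"{split_step}_choice",
        "value": {
            "type": "branchone",
            "branches": [
                {"expr": branch_predicate(split_step, name), "modules": modules}
                for name, modules in branches.items()
            ],
            # No fallback path in this sketch: exactly one predicate should match.
            "default": [],
        },
    }


if __name__ == "__main__":
    print(branch_result("train"))
    print(branch_predicate("choose", "train"))
```

Because the predicate compares against the step name verbatim, the name printed by the split step and the name baked into the predicate have to match exactly.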

@parallel and @batch are not supported, because Windmill runs each step as a local subprocess on the worker.

Configuration

| CLI option | Environment variable | Default | Description |
| --- | --- | --- | --- |
| --windmill-host | WINDMILL_HOST | http://localhost:8000 | Windmill server base URL |
| --windmill-token | WINDMILL_TOKEN | | Windmill API token |
| --windmill-workspace | WINDMILL_WORKSPACE | admins | Workspace name |
| --max-workers | | 10 | Max concurrent ForEach body tasks |
| --branch | | | @project branch name |
| --production | | false | Deploy to the production project branch |
| --name | | derived from class name | Override the Windmill flow path |
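
The table does not spell out which source wins when an option is set in more than one place. The sketch below assumes the common convention of CLI value first, then environment variable, then default. That precedence, and the `resolve_option` helper itself, are assumptions for illustration, not documented behavior of metaflow-windmill.

```python
import os

# Environment variable names and defaults copied from the table above.
ENV_VARS = {
    "windmill_host": "WINDMILL_HOST",
    "windmill_token": "WINDMILL_TOKEN",
    "windmill_workspace": "WINDMILL_WORKSPACE",
}
DEFAULTS = {
    "windmill_host": "http://localhost:8000",
    "windmill_workspace": "admins",
}


def resolve_option(name, cli_value=None, env=None):
    """Resolve one option: CLI value, then environment variable, then default."""
    env = os.environ if env is None else env
    if cli_value is not None:
        return cli_value
    env_name = ENV_VARS.get(name)
    if env_name and env.get(env_name):
        return env[env_name]
    # Options without a default (such as the token) resolve to None.
    return DEFAULTS.get(name)
```

Under this assumption, exporting WINDMILL_HOST and WINDMILL_TOKEN once would let the `create` and `trigger` commands run without repeating the flags.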

Development

git clone https://github.com/npow/metaflow-windmill
cd metaflow-windmill
pip install -e ".[dev]"
pytest tests/

Integration tests require a running Windmill instance:

docker compose up -d
pytest tests/ -m integration

License

Apache 2.0. See LICENSE.
