Run Metaflow flows as Windmill workflows

metaflow-windmill

Run any Metaflow flow on Windmill without rewriting your pipeline.

The problem

Windmill is a powerful open-source orchestrator, but its native scripting model has no concept of Metaflow steps, data artifacts, or retry semantics. You either maintain two separate codebases (one for Metaflow development and one for Windmill production), or you lose Metaflow's versioning, lineage, and @retry guarantees entirely. There is no off-the-shelf bridge that compiles a Metaflow flow into a Windmill workflow while preserving its runtime behavior.

Quick start

pip install metaflow-windmill
python flow.py windmill create --windmill-host http://localhost:8000 --windmill-token $TOKEN
python flow.py windmill trigger --windmill-host http://localhost:8000 --windmill-token $TOKEN
# Deploying HelloFlow...
# Flow deployed successfully to Windmill.
# Job started: http://localhost:8000/run/01927f3a-...?workspace=admins

Install

pip install metaflow-windmill

From source:

git clone https://github.com/npow/metaflow-windmill
cd metaflow-windmill
pip install -e .

Usage

Deploy and trigger in one step:

python flow.py windmill run \
  --windmill-host http://localhost:8000 \
  --windmill-token $TOKEN \
  --windmill-workspace admins
# Compiling HelloFlow...
# Flow deployed successfully to Windmill.
# Triggering execution...
# Job started: http://localhost:8000/run/01927f3a-...?workspace=admins
# Job is running...
# Job 01927f3a-... completed successfully.

Deploy once, trigger many times with parameters:

python flow.py windmill create --windmill-token $TOKEN
python flow.py windmill trigger --windmill-token $TOKEN \
  --run-param message=hello --run-param iterations=5
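
Each --run-param flag carries one key=value pair. As a rough illustration of how such pairs could be turned into a parameter dictionary, here is a minimal sketch. The function name parse_run_params is hypothetical and is not part of the package; the real CLI may validate or convert values differently.

```python
from typing import Dict, Iterable


def parse_run_params(pairs: Iterable[str]) -> Dict[str, str]:
    """Turn ["message=hello", "iterations=5"] into a dict of strings.

    Hypothetical helper for illustration; values are kept as strings
    and any type conversion is left to the flow's Parameter definitions.
    """
    params: Dict[str, str] = {}
    for pair in pairs:
        # Split on the first "=" only, so values may themselves contain "=".
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {pair!r}")
        params[key] = value
    return params


print(parse_run_params(["message=hello", "iterations=5"]))
```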

Programmatic API:

from metaflow import Deployer

with Deployer("flow.py") as d:
    df = d.windmill().create(
        windmill_host="http://localhost:8000",
        windmill_token="my-token",
    )
    run = df.trigger(message="hello")
    print(run.status)          # RUNNING / SUCCEEDED / FAILED
    print(run.windmill_ui)     # http://localhost:8000/run/<job-id>?workspace=admins
    print(run.run.successful)  # True
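
If you need to block until a triggered run finishes, a small polling loop around run.status is one option. The sketch below assumes that reading run.status returns a fresh value on each access and that the only terminal values are SUCCEEDED and FAILED, as listed in the comment above. The wait_for_run helper and its parameters are hypothetical, not part of the package.

```python
import time

# Terminal states, taken from the status values shown in the example above.
TERMINAL_STATES = {"SUCCEEDED", "FAILED"}


def wait_for_run(run, poll_seconds: float = 5.0, timeout_seconds: float = 600.0) -> str:
    """Poll run.status until it reaches a terminal state or the timeout expires.

    Hypothetical helper: `run` is any object exposing a `status` attribute.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = run.status
        if status in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run still {status!r} after {timeout_seconds} seconds")
        time.sleep(poll_seconds)
```

Called as wait_for_run(run) after df.trigger(...), this returns the final status string or raises TimeoutError.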

How it works

Each Metaflow step becomes a Windmill flow module that runs a bash script. The bash script calls python flow.py step <step_name> with --run-id and --task-id, plus a --retry-count derived from Windmill's native WM_FLOW_RETRY_COUNT environment variable. Branch/join steps compile to Windmill branchall modules; foreach steps compile to forloopflow modules with a configurable max_workers concurrency limit. The Metaflow run ID is pre-computed before triggering and threaded through every step, so Metaflow's local datastore and the Windmill job ID stay in sync.
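
To make the compilation step concrete, here is a minimal sketch of how one step and one foreach body might be rendered as Windmill-style module dictionaries. It is an illustration under assumptions, not the package's actual compiler: the function names, the RUN_ID and TASK_ID shell variables, the fallback of 0 for the retry count, and the exact dictionary keys are all hypothetical.

```python
import shlex
from typing import Any, Dict, List


def step_module(flow_file: str, step_name: str) -> Dict[str, Any]:
    """Build a rawscript-style module that runs one Metaflow step via bash."""
    command = " ".join([
        "python", shlex.quote(flow_file), "step", shlex.quote(step_name),
        # RUN_ID and TASK_ID are assumed shell variables set earlier in the script.
        '--run-id "$RUN_ID"',
        '--task-id "$TASK_ID"',
        # Falling back to 0 when the variable is unset is an assumption.
        '--retry-count "${WM_FLOW_RETRY_COUNT:-0}"',
    ])
    return {
        "id": step_name,
        "value": {"type": "rawscript", "language": "bash", "content": command},
    }


def foreach_module(step_id: str, body: List[Dict[str, Any]], max_workers: int = 10) -> Dict[str, Any]:
    """Wrap body modules in a forloopflow-style module with a concurrency cap."""
    return {
        "id": step_id,
        "value": {
            "type": "forloopflow",
            "modules": body,
            "parallel": True,
            "parallelism": max_workers,  # assumed key for the max_workers limit
        },
    }


module = step_module("flow.py", "start")
print(module["value"]["content"])
```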

Supported graph patterns:

| Metaflow pattern | Windmill module |
| --- | --- |
| Linear steps | rawscript sequence |
| Split / join (static branches) | branchall |
| ForEach | forloopflow |
| Nested ForEach | nested forloopflow |
| @condition (conditional split) | branchone |

For @condition splits, the split step emits {"branch": "<step_name>"} to stdout after running. The subsequent branchone module uses results.<split_step>.branch === '<step_name>' as the predicate so Windmill routes to the correct branch at runtime.
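
The two halves of that handshake can be sketched as follows: one function produces the JSON line the split step prints, and the other builds the predicate string for a given branch. The function names and the example step names are hypothetical; only the JSON shape and the predicate format come from the description above.

```python
import json


def emit_branch(step_name: str) -> str:
    """Return the JSON line a split step would write to stdout."""
    return json.dumps({"branch": step_name})


def branch_predicate(split_step: str, target_step: str) -> str:
    """Build the branchone predicate that selects `target_step`."""
    return f"results.{split_step}.branch === '{target_step}'"


# Example with made-up step names: a split step "choose" routing to "train".
print(emit_branch("train"))
print(branch_predicate("choose", "train"))
```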

@parallel and @batch are not supported, because Windmill runs each step as a local subprocess on the worker.

Configuration

| CLI option | Environment variable | Default | Description |
| --- | --- | --- | --- |
| --windmill-host | WINDMILL_HOST | http://localhost:8000 | Windmill server base URL |
| --windmill-token | WINDMILL_TOKEN | | Windmill API token |
| --windmill-workspace | WINDMILL_WORKSPACE | admins | Workspace name |
| --max-workers | | 10 | Max concurrent ForEach body tasks |
| --branch | | | @project branch name |
| --production | | false | Deploy to the production project branch |
| --name | | derived from class name | Override the Windmill flow path |
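
The connection settings can come from a CLI option, an environment variable, or a default. The sketch below shows one plausible resolution order (CLI value first, then environment, then default). That precedence is an assumption based on common CLI conventions and is not confirmed by the package; the resolve_option helper is hypothetical.

```python
import os
from typing import Mapping, Optional

# Environment variable names and defaults copied from the configuration table.
ENV_VARS = {
    "windmill_host": "WINDMILL_HOST",
    "windmill_token": "WINDMILL_TOKEN",
    "windmill_workspace": "WINDMILL_WORKSPACE",
}
DEFAULTS = {
    "windmill_host": "http://localhost:8000",
    "windmill_workspace": "admins",
}


def resolve_option(
    name: str,
    cli_value: Optional[str] = None,
    environ: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Resolve one setting: CLI value, then environment, then default (assumed order)."""
    env = os.environ if environ is None else environ
    if cli_value is not None:
        return cli_value
    env_name = ENV_VARS.get(name)
    # An empty environment variable is treated as unset.
    if env_name and env.get(env_name):
        return env[env_name]
    return DEFAULTS.get(name)


print(resolve_option("windmill_workspace", environ={}))
```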

Development

git clone https://github.com/npow/metaflow-windmill
cd metaflow-windmill
pip install -e ".[dev]"
pytest tests/

Integration tests require a running Windmill instance:

docker compose up -d
pytest tests/ -m integration

License

Apache 2.0. See LICENSE.
