Run Metaflow flows as Flyte workflows
metaflow-flyte
Schedule and monitor your Metaflow pipelines through Flyte without rewriting them.
The problem
You've built pipelines in Metaflow and now need Flyte's scheduling, UI, and observability — but rewriting your flows in Flytekit means losing Metaflow's versioning, artifact store, and local execution model. Running both side-by-side means maintaining two copies of every pipeline.
Quick start
pip install metaflow-flyte
# Generate the Flyte workflow file
python my_flow.py --datastore=s3 flyte create my_flow_remote.py \
--image my-registry/my-image:latest
# Run locally (no cluster required)
pyflyte run my_flow_remote.py my_flow
# Register and run on a Flyte cluster
pyflyte register --project flytesnacks --domain development my_flow_remote.py
pyflyte run --remote --project flytesnacks --domain development \
my_flow_remote.py my_flow
Install
pip install metaflow-flyte
Or from source:
git clone https://github.com/npow/metaflow-flyte.git
cd metaflow-flyte
pip install -e ".[dev]"
Usage
Generate and register a Flyte workflow
python my_flow.py --datastore=s3 flyte create my_flow_remote.py \
--image my-registry/my-image:latest
pyflyte register --project myproject --domain development my_flow_remote.py
All graph shapes are supported
# Linear
from metaflow import FlowSpec, step

class SimpleFlow(FlowSpec):
    @step
    def start(self):
        self.value = 42
        self.next(self.end)

    @step
    def end(self):
        pass

# Split/join (branch)
class BranchFlow(FlowSpec):
    @step
    def start(self):
        self.next(self.branch_a, self.branch_b)
    ...

# Foreach fan-out
class ForeachFlow(FlowSpec):
    @step
    def start(self):
        self.items = [1, 2, 3]
        self.next(self.process, foreach="items")
    ...
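The execution semantics of the split/join shape can be sketched in plain Python (function names mirror the steps above; this only illustrates the dataflow, not the generated code): branch_a and branch_b run concurrently as separate Flyte tasks, and the join step receives both results.

```python
# Illustrative sketch of split/join semantics: two branches run in
# parallel, then a join step combines their outputs.
def branch_a(x):
    return x + 1

def branch_b(x):
    return x * 2

def join(inputs):
    # in Metaflow, the join step reads both branch artifacts via `inputs`
    return sum(inputs)

x = 10
print(join([branch_a(x), branch_b(x)]))  # 31
```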
Parametrised flows
Parameters defined with metaflow.Parameter are forwarded automatically as Flyte workflow
arguments:
python param_flow.py --datastore=s3 flyte create param_flow_remote.py \
--image my-registry/my-image:latest
Pass parameters at runtime:
pyflyte run --remote ... param_flow_remote.py param_flow --greeting "Hello"
Step decorators (--with)
Inject Metaflow step decorators at deploy time without modifying the flow source:
python my_flow.py --datastore=s3 flyte create my_flow_remote.py \
--image my-registry/my-image:latest \
--with=kubernetes:cpu=4,memory=8000
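The `--with` value follows Metaflow's `decorator:key=value,...` syntax. A hypothetical parser for that format (the real CLI's implementation may differ; this only illustrates how the spec decomposes):

```python
# Hypothetical parser for the --with=decorator:key=value,... syntax.
def parse_with(spec: str):
    name, _, argstr = spec.partition(":")
    kwargs = dict(pair.split("=", 1) for pair in argstr.split(",")) if argstr else {}
    return name, kwargs

print(parse_with("kubernetes:cpu=4,memory=8000"))
# ('kubernetes', {'cpu': '4', 'memory': '8000'})
```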
Retries
@retry on any step is picked up automatically. The generated Flyte task gets the corresponding
retries parameter:
class MyFlow(FlowSpec):
    @retry(times=3)
    @step
    def train(self):
        ...
Scheduled flows
If your flow has a @schedule decorator, the generated file includes a Flyte LaunchPlan with
the corresponding cron schedule automatically.
Project namespace
If the flow uses @project(name=...), the Flyte project is automatically used:
@project(name="recommendations")
class TrainFlow(FlowSpec): ...
How it works
metaflow-flyte compiles your Metaflow flow's DAG into a self-contained Flyte workflow file.
Each Metaflow step becomes a @task. The generated file:
- runs each step as a subprocess via the standard metaflow step CLI
- derives a stable run_id from the Flyte execution ID so all steps share one Metaflow run
- passes --input-paths correctly for joins and foreach fan-outs
- emits Metaflow artifact retrieval snippets to the Flyte UI Deck after each step
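The run-id derivation can be sketched as follows (the exact scheme in the generated file may differ; the prefix matches the run ids visible in the artifact examples below):

```python
# Hypothetical sketch: prefix the Flyte execution name so every step
# subprocess in the execution targets the same Metaflow run.
def derive_run_id(flyte_execution_name: str) -> str:
    return f"flyte-{flyte_execution_name}"

print(derive_run_id("atlw559q7zhbg2mw92sq"))  # flyte-atlw559q7zhbg2mw92sq
```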
Executions list
All workflow executions are visible in the Flyte console with status, duration, and launch plan:
Linear flow
A simple 3-step linear flow (start → process → end) runs as 4 Flyte tasks — one to generate
the shared run ID, then one per Metaflow step:
Branch flow
Branch flows with split/join (start → branch_a + branch_b → join → end) run the parallel
steps concurrently as separate Flyte tasks:
Foreach flow
Foreach fan-outs use a Flyte @dynamic task to spawn one task per item at runtime. The
_foreach_*_dynamic Sub-Workflow node fans out and collects results:
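What the dynamic node does at runtime can be sketched in plain Python: the item list is only known after start executes, so the fan-out is built then, one sub-task per item, with results collected afterwards (plain function calls stand in for Flyte tasks here).

```python
# Illustrative sketch of the @dynamic fan-out: one task invocation per
# item, results gathered for the downstream join.
def process(item):
    return item * 10

def dynamic_foreach(items):
    # at runtime, spawn one task per item (here: sequential calls)
    return [process(i) for i in items]

print(dynamic_foreach([1, 2, 3]))  # [10, 20, 30]
```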
Task detail and Flyte Deck
Click any task in the execution view to open the detail panel. Each Metaflow step produces a Flyte Deck accessible via the "Flyte Deck" button:
Metaflow artifact retrieval
The metaflow tab in the Flyte Deck shows the exact Python code to retrieve each artifact
from this specific task — using the full FlowName/run_id/step_name/task_id pathspec:
For tasks that produce multiple artifacts, each one is listed with its access expression:
Parametrised flows show the parameter values alongside the artifacts:
# Retrieve artifacts from any completed Metaflow step
from metaflow import Task
task = Task('LinearFlow/flyte-atlw559q7zhbg2mw92sq/process/62e7a9511b5646b6')
task.data.message # access the 'message' artifact
task.data.result # access the 'result' artifact
Configuration
The generated file bakes in datastore and image settings at creation time so every task subprocess uses the same configuration.
# Use S3 datastore with a custom image
python my_flow.py \
--datastore=s3 \
flyte create my_flow_remote.py \
--image my-registry/my-image:latest \
--project flytesnacks \
--domain development
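The baked-in settings surface in the per-step subprocess each generated task runs. A minimal sketch of that command's shape, assuming Metaflow's step CLI with --run-id (the exact argument list the generated file builds is an assumption):

```python
# Hypothetical shape of the step subprocess, with the datastore baked in
# at `flyte create` time. Flag names follow Metaflow's step CLI.
def step_command(flow_file, step_name, run_id, datastore="s3"):
    return ["python", flow_file, f"--datastore={datastore}",
            "step", step_name, "--run-id", run_id]

print(" ".join(step_command("/path/to/my_flow.py", "process", "flyte-abc123")))
# python /path/to/my_flow.py --datastore=s3 step process --run-id flyte-abc123
```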
Docker image requirements
Your Docker image must contain:
- The flow Python file at the same absolute path as on your local machine
- All Python dependencies (metaflow, flytekit, boto3, etc.)
- USERNAME environment variable set (e.g. ENV USERNAME=metaflow)
- S3/datastore credentials and endpoint configuration
Example Dockerfile:
FROM python:3.11-slim
RUN pip install "metaflow>=2.9" "flytekit>=1.10" "boto3>=1.26"
COPY my_flow.py /path/to/my_flow.py
ENV USERNAME=metaflow
Development
git clone https://github.com/npow/metaflow-flyte.git
cd metaflow-flyte
pip install -e ".[dev]"
# Compilation tests only (fast, ~25s)
pytest tests/ -m "not integration and not e2e"
# Compilation + local pyflyte run (~3 min)
pytest tests/ -m "not e2e"
# E2e (requires a running Flyte cluster)
pytest tests/ -m "e2e"
The test suite covers three tiers:
- Tier 1: compile all graph shapes → assert generated file content
- Tier 2: pyflyte run locally → verify Metaflow artifacts written to disk
- Tier 3: register + run on a live Flyte cluster → verify S3 artifacts and deck output
License
File details
Details for the file metaflow_flyte-0.3.2.tar.gz.
File metadata
- Download URL: metaflow_flyte-0.3.2.tar.gz
- Upload date:
- Size: 36.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ccbb7052cf0ca842de5b4a87ed8504ef71dcb6f71f360748b95d759a584e77ea |
| MD5 | f0ed9d4ee44b8d60beefa2297b1f4e42 |
| BLAKE2b-256 | 058184bede35480bdea5cb2a66e5c8c149921becd7bb6be200005a921a91ddbb |
Provenance
The following attestation bundles were made for metaflow_flyte-0.3.2.tar.gz:
Publisher: publish.yml on npow/metaflow-flyte

- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: metaflow_flyte-0.3.2.tar.gz
- Subject digest: ccbb7052cf0ca842de5b4a87ed8504ef71dcb6f71f360748b95d759a584e77ea
- Sigstore transparency entry: 1052584556
- Sigstore integration time:
- Permalink: npow/metaflow-flyte@745d5b4cd0ec994ffe0552066dadf41925059329
- Branch / Tag: refs/tags/v0.3.2
- Owner: https://github.com/npow
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@745d5b4cd0ec994ffe0552066dadf41925059329
- Trigger Event: push
File details
Details for the file metaflow_flyte-0.3.2-py3-none-any.whl.
File metadata
- Download URL: metaflow_flyte-0.3.2-py3-none-any.whl
- Upload date:
- Size: 32.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 74152c4197c83aa1ca1264b3bf9e99922d7378dc69522e748692770bb4f34019 |
| MD5 | ba599025914e216c49ad3328e5289001 |
| BLAKE2b-256 | 64ae77129b5d81bded0ac8baa817e216215459cd740d7916e89d6a0287661adb |
Provenance
The following attestation bundles were made for metaflow_flyte-0.3.2-py3-none-any.whl:
Publisher: publish.yml on npow/metaflow-flyte

- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: metaflow_flyte-0.3.2-py3-none-any.whl
- Subject digest: 74152c4197c83aa1ca1264b3bf9e99922d7378dc69522e748692770bb4f34019
- Sigstore transparency entry: 1052584608
- Sigstore integration time:
- Permalink: npow/metaflow-flyte@745d5b4cd0ec994ffe0552066dadf41925059329
- Branch / Tag: refs/tags/v0.3.2
- Owner: https://github.com/npow
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@745d5b4cd0ec994ffe0552066dadf41925059329
- Trigger Event: push