
GitHub Actions compute backend for Metaflow


metaflow-gha


Run your Metaflow steps on free GitHub Actions VMs without changing your flow code.

The problem

Running Metaflow on AWS Batch or Kubernetes requires cloud accounts, IAM roles, container registries, and billing. For personal projects, side work, or early prototyping, that setup cost is out of proportion to the compute you actually need. GitHub Actions gives you 20 parallel Ubuntu VMs for free on public repos, but Metaflow has no built-in way to use them as a compute backend.

Quick start

pip install metaflow-gha

One-time setup in your flow's repo:

python flow.py gha inject   # writes .github/workflows/metaflow-gha.yml
git add .github/workflows/metaflow-gha.yml
git commit -m "chore: add metaflow-gha caller workflow"
git push

Then run any flow on GHA:

python flow.py run --with=gha

Prerequisites

  • Python 3.10+
  • metaflow with --datastore=s3
  • An S3-compatible bucket (AWS S3, Cloudflare R2, MinIO, etc.) reachable from GitHub Actions
  • gh CLI authenticated with gh auth login (or GH_TOKEN set)
  • AWS credentials (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY) for your S3 bucket, configured as GitHub repo secrets

Install

pip install metaflow-gha

Usage

Apply to the whole flow via CLI:

python flow.py --datastore=s3 run --with=gha

Apply to specific steps with the decorator:

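from metaflow import FlowSpec, step, gha


def expensive_computation():
    # placeholder workload; replace with your own function
    return sum(i * i for i in range(10_000_000))


class MyFlow(FlowSpec):
    @step
    def start(self):
        # Metaflow requires every flow to begin with a step named "start"
        self.next(self.heavy_step)

    @gha
    @step
    def heavy_step(self):
        # runs on a GitHub Actions VM
        self.result = expensive_computation()
        self.next(self.end)

    @step
    def end(self):
        print(self.result)


if __name__ == "__main__":
    MyFlow()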

Control parallelism, timeout, and retries:

@gha(workers=10, timeout=3600, max_retries=1)
@step
def train(self):
    ...

| Parameter | Default | Description |
|---|---|---|
| workers | 20 | Number of parallel GHA runner VMs to spin up. |
| timeout | 21600 | Max wall-clock seconds per task (GHA limit: 6 hours). |
| max_retries | 2 | Times to retry a failed task before marking it permanently failed. |
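
To make the max_retries semantics concrete, here is a minimal sketch of a retry loop. It assumes that max_retries counts retries after the first attempt, so a task gets at most 1 + max_retries attempts in total. The helper name run_with_retries is hypothetical and is not part of metaflow-gha; the plugin's real retry logic may differ.

```python
from typing import Callable, Tuple


def run_with_retries(task: Callable[[], object], max_retries: int = 2) -> Tuple[object, int]:
    """Hypothetical helper: run `task`, retrying up to `max_retries` times.

    Returns (result, attempts_used). Raises RuntimeError once the first
    attempt and every retry have failed.
    """
    attempts = 0
    last_error = None
    while attempts <= max_retries:
        attempts += 1
        try:
            return task(), attempts
        except Exception as exc:  # any failure counts against the retry budget
            last_error = exc
    raise RuntimeError(
        f"task permanently failed after {attempts} attempts"
    ) from last_error
```

Under this assumption, the default max_retries=2 allows up to three attempts per task.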

How it works

When @gha is active:

  1. Each step task is pushed to an S3-backed queue instead of running locally.
  2. Up to workers GitHub Actions jobs are dispatched once per run via gh workflow run.
  3. Each GHA worker pulls a task, downloads the code package from S3, installs requirements.txt, and runs python flow.py step ... — standard Metaflow execution.
  4. Workers use S3 conditional writes (IfNoneMatch: *) for atomic task claiming — no races with 20 concurrent workers.
  5. Task logs appear in the Metaflow UI via mflog (same mechanism as Batch and Kubernetes backends).
  6. The gha step orchestrator polls for task completion and streams logs back in real time.

The queue lives entirely in your Metaflow S3 datastore — no extra infrastructure beyond an S3 bucket.
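
The atomic claim in step 4 can be illustrated without touching S3. The sketch below uses an in-memory stand-in for a bucket whose put_if_absent method mimics a conditional write with If-None-Match: * (only the first writer of a key succeeds; on real S3 the losing writers receive a 412 Precondition Failed response). The InMemoryBucket class, the claims/<task> key layout, and the worker function are assumptions made for illustration, not the plugin's actual code.

```python
import threading


class InMemoryBucket:
    """Stand-in for an S3 bucket that supports create-only-if-absent writes."""

    def __init__(self):
        self._objects = {}
        self._lock = threading.Lock()

    def put_if_absent(self, key, body):
        # Mimics PUT with If-None-Match: * -- succeeds only for the first writer.
        with self._lock:
            if key in self._objects:
                return False
            self._objects[key] = body
            return True


def worker(bucket, worker_id, task_ids, claimed):
    # Every worker tries every task; only one claim per task can succeed.
    for task_id in task_ids:
        if bucket.put_if_absent(f"claims/{task_id}", worker_id):
            claimed.append((task_id, worker_id))


def simulate(num_workers=20, num_tasks=50):
    bucket = InMemoryBucket()
    claimed = []
    task_ids = [f"task-{i}" for i in range(num_tasks)]
    threads = [
        threading.Thread(target=worker, args=(bucket, f"w{n}", task_ids, claimed))
        for n in range(num_workers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return claimed
```

Because the existence check and the write happen as one atomic operation, each task ends up with exactly one owner no matter how many workers race for it. That is the property a conditional write provides on the S3 side.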

Configuration

Environment variables

| Variable | Default | Description |
|---|---|---|
| METAFLOW_DATASTORE_SYSROOT_S3 | (none) | Required. Your Metaflow S3 root (e.g. s3://my-bucket/metaflow). |
| METAFLOW_GHA_USER_REPO | inferred from git remote | GitHub repo to dispatch workers on (e.g. myorg/myrepo). |
| METAFLOW_GHA_WORKER_REPO | npow/metaflow-gha | Repo hosting the reusable worker workflow. |
| METAFLOW_GHA_WORKER_WORKFLOW | worker.yml | Reusable workflow filename in the worker repo. |
| METAFLOW_GHA_WORKER_REF | main | Git ref of the worker repo to check out (pin to a tag/SHA for stability). |
| METAFLOW_GHA_CALLER_WORKFLOW | metaflow-gha.yml | Workflow filename in your repo (written by gha inject). |
| METAFLOW_GHA_DISPATCH_REF | "" | Branch/tag in your repo to dispatch the caller workflow on. |
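
As a rough illustration of how these variables and their defaults fit together, the sketch below resolves them from an environment mapping. The variable names and defaults come from the table above; the functions load_config and repo_from_remote are hypothetical, and the way the plugin actually infers the repo from the git remote is an assumption here.

```python
import os
import re

# Defaults copied from the table above.
DEFAULTS = {
    "METAFLOW_GHA_WORKER_REPO": "npow/metaflow-gha",
    "METAFLOW_GHA_WORKER_WORKFLOW": "worker.yml",
    "METAFLOW_GHA_WORKER_REF": "main",
    "METAFLOW_GHA_CALLER_WORKFLOW": "metaflow-gha.yml",
    "METAFLOW_GHA_DISPATCH_REF": "",
}


def repo_from_remote(url):
    """Hypothetical helper: extract owner/repo from an HTTPS or SSH GitHub remote."""
    match = re.search(r"github\.com[:/]([^/]+)/([^/]+?)(?:\.git)?/?$", url.strip())
    return f"{match.group(1)}/{match.group(2)}" if match else None


def load_config(env=None, remote_url=None):
    """Hypothetical helper: resolve settings, falling back to the documented defaults."""
    env = os.environ if env is None else env
    sysroot = env.get("METAFLOW_DATASTORE_SYSROOT_S3")
    if not sysroot:
        raise ValueError("METAFLOW_DATASTORE_SYSROOT_S3 is required")
    config = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    config["METAFLOW_DATASTORE_SYSROOT_S3"] = sysroot
    # An explicit setting wins; otherwise fall back to the git remote, if given.
    config["METAFLOW_GHA_USER_REPO"] = env.get("METAFLOW_GHA_USER_REPO") or (
        repo_from_remote(remote_url) if remote_url else None
    )
    return config
```

For example, load_config({"METAFLOW_DATASTORE_SYSROOT_S3": "s3://my-bucket/metaflow"}, remote_url="git@github.com:myorg/myrepo.git") would resolve METAFLOW_GHA_USER_REPO to myorg/myrepo and leave every other setting at its default.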

GitHub repo secrets

AWS credentials for the worker VMs are passed via repository secrets. Set them once with the gh CLI:

gh secret set AWS_ACCESS_KEY_ID     --body "$AWS_ACCESS_KEY_ID"
gh secret set AWS_SECRET_ACCESS_KEY --body "$AWS_SECRET_ACCESS_KEY"
# Metaflow S3 root for the workers, stored as a repo variable
gh variable set METAFLOW_DATASTORE_SYSROOT_S3 --body "s3://your-bucket/metaflow"
# Optional: for S3-compatible endpoints (Cloudflare R2, MinIO, etc.)
gh variable set AWS_ENDPOINT_URL_S3 --body "https://your-endpoint"

Or let gha step sync them automatically the first time you run (when gh is authenticated locally with secrets:write permission).

S3-compatible storage

Any S3-compatible store works — Cloudflare R2, MinIO, Backblaze B2, etc. Set AWS_ENDPOINT_URL_S3 (or METAFLOW_S3_ENDPOINT_URL) on both your local machine and as a GitHub repo variable. GHA runners have full internet egress and can reach any public S3-compatible endpoint.
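
If you want to check locally which endpoint your own scripts would use, a small sketch like the following resolves it from the two variables named above. The function name s3_client_kwargs is hypothetical, and the precedence (AWS_ENDPOINT_URL_S3 first, then METAFLOW_S3_ENDPOINT_URL) is an assumption; the plugin itself may resolve them differently.

```python
import os


def s3_client_kwargs(env=None):
    """Hypothetical helper: build keyword arguments for an S3 client.

    Returns {"endpoint_url": ...} when a custom endpoint is configured,
    or an empty dict so the client falls back to AWS S3.
    """
    env = os.environ if env is None else env
    # Assumed precedence: the AWS-style variable wins over the Metaflow one.
    endpoint = env.get("AWS_ENDPOINT_URL_S3") or env.get("METAFLOW_S3_ENDPOINT_URL")
    return {"endpoint_url": endpoint} if endpoint else {}
```

The resulting dict can be passed straight to an S3 client constructor, for example boto3.client("s3", **s3_client_kwargs()).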

Development

git clone https://github.com/npow/metaflow-gha
cd metaflow-gha
pip install -e ".[dev]"
pytest -v

CI runs on Python 3.10, 3.11, and 3.12. E2E runs on GitHub Actions against a real S3 datastore.

License

Apache 2.0
