Metaflow extension: pre-bake conda environments into Docker images for fast cold starts
metaflow-prebuilt
A Metaflow extension that pre-bakes conda/PyPI dependencies into Docker images at deploy time, so tasks start without paying the conda bootstrap cost at runtime.
How it works
When you deploy a flow with --environment=prebuilt, the extension:
- Resolves the conda/PyPI environment spec for each step.
- Builds a Docker image with the environment already installed.
- Pushes the image to a registry.
- Configures the remote runner (Batch, Kubernetes, etc.) to pull that image instead of bootstrapping conda at task start.
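The steps above can be pictured roughly as the following pipeline. This is a minimal sketch, not the extension's actual code: the function names (`env_id_for`, `render_dockerfile`, `prebuild`), the hashing scheme, the `continuumio/miniconda3` base image, and the Dockerfile contents are all assumptions made for illustration.

```python
import hashlib
import json


def env_id_for(spec: dict) -> str:
    """Derive a stable ID from an environment spec (hypothetical scheme)."""
    canonical = json.dumps(spec, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def render_dockerfile(spec: dict, base_image: str = "continuumio/miniconda3") -> str:
    """Render a Dockerfile that installs the pinned packages at build time."""
    pins = " ".join(
        f"{name}={version}" for name, version in sorted(spec["packages"].items())
    )
    return f"FROM {base_image}\nRUN conda install -y {pins}\n"


def prebuild(step_specs: dict, repository: str) -> dict:
    """Map each step name to the image tag its tasks would pull."""
    images = {}
    for step_name, spec in step_specs.items():
        tag = f"{repository}:{env_id_for(spec)}"
        dockerfile = render_dockerfile(spec)
        # A real implementation would build and push the image here,
        # then point the remote runner at `tag`.
        images[step_name] = {"tag": tag, "dockerfile": dockerfile}
    return images


specs = {"start": {"packages": {"numpy": "1.26.4"}}}
plan = prebuild(specs, "example/repo")
print(plan["start"]["tag"])
```

In this sketch, two steps with identical specs hash to the same tag, so they would share one image rather than trigger two builds.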
Install
pip install metaflow-prebuilt
Optional extras for specific backends:
pip install 'metaflow-prebuilt[ecr]' # ECR registry + CodeBuild service
pip install 'metaflow-prebuilt[gcr]' # GCR registry
pip install 'metaflow-prebuilt[kaniko]' # Kaniko build service (K8s)
Quick start
from metaflow import FlowSpec, step, conda

class MyFlow(FlowSpec):

    @conda(packages={"numpy": "1.26.4"})
    @step
    def start(self):
        import numpy
        print(numpy.__version__)
        self.next(self.end)

    @step
    def end(self):
        pass

if __name__ == "__main__":
    MyFlow()
Deploy with the prebuilt environment (requires a configured registry and build service — see below):
# Deploy: builds and pushes Docker images, then submits the flow
python my_flow.py --environment=prebuilt run
# Or on Batch/Kubernetes:
python my_flow.py --environment=prebuilt batch run
Configuration
Select backends via environment variables:
| Variable | Default | Purpose |
|---|---|---|
| METAFLOW_PREBUILT_BUILD_SERVICE | docker | Which build service to use |
| METAFLOW_PREBUILT_IMAGE_REGISTRY | dockerhub | Which image registry to use |
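As a sketch of how these two variables and their defaults combine, the snippet below reads them the way any Python process could. The helper name `selected_backends` is hypothetical; only the variable names and default values come from the table above.

```python
import os


def selected_backends(environ=None) -> dict:
    """Return the build service and registry names, applying the documented defaults."""
    env = os.environ if environ is None else environ
    return {
        "build_service": env.get("METAFLOW_PREBUILT_BUILD_SERVICE", "docker"),
        "image_registry": env.get("METAFLOW_PREBUILT_IMAGE_REGISTRY", "dockerhub"),
    }


# Nothing set: both defaults apply.
print(selected_backends({}))

# Both overridden, e.g. for an AWS setup.
print(selected_backends({
    "METAFLOW_PREBUILT_BUILD_SERVICE": "codebuild",
    "METAFLOW_PREBUILT_IMAGE_REGISTRY": "ecr",
}))
```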
Extension points
DockerBuildService
Controls how Docker images are built and pushed. Select with
METAFLOW_PREBUILT_BUILD_SERVICE=<name>.
| Name | Class | Description |
|---|---|---|
| docker | LocalDockerBuildService | Local Docker daemon (docker build + docker push) |
| buildx | BuildxBuildService | Docker Buildx (docker buildx build --push), supports remote builders and cross-platform |
| kaniko | KanikoBuildService | Kaniko in a Kubernetes Job; uploads context to GCS/S3 |
| codebuild | CodeBuildService | AWS CodeBuild; uploads context to S3, polls for completion |
ImageRegistry
Controls where images are stored. Select with
METAFLOW_PREBUILT_IMAGE_REGISTRY=<name>.
| Name | Class | Description |
|---|---|---|
| dockerhub | DockerHubRegistry | Docker Hub (docker.io) |
| ecr | ECRRegistry | Amazon ECR |
| gcr | GCRRegistry | Google Container Registry |
| local | LocalRegistry | Local registry:2 container (dev/CI use) |
Contributing a custom backend
Custom build service
Subclass DockerBuildService and implement build_and_push:
from metaflow_extensions.prebuilt.plugins.conda.build_service import DockerBuildService

class MyBuildService(DockerBuildService):
    def build_and_push(self, dockerfile, context_files, image_tag,
                       push_credentials, echo) -> bool:
        # build and push; return True on success, False on failure
        ...
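To make the contract more concrete, here is one way a subclass might fill in `build_and_push` by shelling out to the docker CLI. This is a sketch under several assumptions the text above does not confirm: that `dockerfile` is the Dockerfile text, that `context_files` maps relative paths to bytes, and that `echo` is a callable taking a message string. The base class is redefined as a stand-in so the example runs on its own, and `push_credentials` is ignored on the assumption that `docker login` has already been run.

```python
import subprocess
import tempfile
from pathlib import Path


class DockerBuildService:
    """Stand-in for the real base class so this sketch runs on its own."""

    def build_and_push(self, dockerfile, context_files, image_tag,
                       push_credentials, echo) -> bool:
        raise NotImplementedError


class CliBuildService(DockerBuildService):
    """Builds and pushes with the docker CLI. Argument types are assumptions."""

    def __init__(self, run=subprocess.run):
        # Injectable runner so the sketch can be exercised without Docker.
        self._run = run

    def build_and_push(self, dockerfile, context_files, image_tag,
                       push_credentials, echo) -> bool:
        with tempfile.TemporaryDirectory() as tmp:
            ctx = Path(tmp)
            # Materialize the build context on disk.
            (ctx / "Dockerfile").write_text(dockerfile)
            for rel_path, data in context_files.items():
                target = ctx / rel_path
                target.parent.mkdir(parents=True, exist_ok=True)
                target.write_bytes(data)
            # push_credentials is unused here: this sketch relies on a prior
            # `docker login`.
            for cmd in (["docker", "build", "-t", image_tag, str(ctx)],
                        ["docker", "push", image_tag]):
                echo("$ " + " ".join(cmd))
                if self._run(cmd).returncode != 0:
                    return False
        return True
```

Returning False as soon as either command fails keeps the success/failure contract described in the comment above.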
Register it in your pyproject.toml:
[project.entry-points."metaflow_prebuilt.build_services"]
mybuild = "mypackage.my_module:MyBuildService"
Then use it with METAFLOW_PREBUILT_BUILD_SERVICE=mybuild.
Custom image registry
Subclass ImageRegistry and implement image_tag, push_credentials, and
pull_config:
from metaflow_extensions.prebuilt.plugins.conda.image_registry import ImageRegistry

class MyRegistry(ImageRegistry):
    def image_tag(self, env_id) -> str: ...
    def push_credentials(self) -> dict: ...
    def pull_config(self, pull_tag) -> dict: ...
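As an illustration of what the three methods might return, the sketch below implements a registry at a fixed host. The base class is redefined as a stand-in so the example is self-contained, and the dictionary keys (`username`, `password`, `image`) are hypothetical: the shapes the extension actually expects are not specified above.

```python
class ImageRegistry:
    """Stand-in for the real base class so this sketch runs on its own."""

    def image_tag(self, env_id) -> str:
        raise NotImplementedError

    def push_credentials(self) -> dict:
        raise NotImplementedError

    def pull_config(self, pull_tag) -> dict:
        raise NotImplementedError


class StaticRegistry(ImageRegistry):
    """Registry at a fixed host/repository. Return shapes are assumptions."""

    def __init__(self, host, repository, username=None, password=None):
        self.host = host
        self.repository = repository
        self.username = username
        self.password = password

    def image_tag(self, env_id) -> str:
        # One tag per environment ID, all under the same repository.
        return f"{self.host}/{self.repository}:{env_id}"

    def push_credentials(self) -> dict:
        # Hypothetical keys; a real backend might return a token instead.
        return {"username": self.username, "password": self.password}

    def pull_config(self, pull_tag) -> dict:
        # Hypothetical key; tells the runner which image to pull.
        return {"image": pull_tag}


registry = StaticRegistry("registry.example.com", "team/flows")
tag = registry.image_tag("abc123")
print(tag)
print(registry.pull_config(tag))
```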
Register it:
[project.entry-points."metaflow_prebuilt.image_registries"]
myregistry = "mypackage.my_module:MyRegistry"
Then use it with METAFLOW_PREBUILT_IMAGE_REGISTRY=myregistry.
License
Apache 2.0