# Tekton Compiler for Kubeflow Pipelines
The `kfp-tekton` SDK extends the `Compiler` and the `Client` of the Kubeflow Pipelines SDK to generate Tekton YAML and to subsequently upload and run the pipeline with the Kubeflow Pipelines engine backed by Tekton.
- SDK Packages Overview
- Project Prerequisites
- Compiling a Kubeflow Pipelines DSL Script
- Big data passing workspace configuration
- Running the Compiled Pipeline on a Tekton Cluster
- List of Available Features
- List of Helper Functions for Python Kubernetes Client
- Tested Pipelines
## SDK Packages Overview

The `kfp-tekton` SDK is an extension to the Kubeflow Pipelines SDK, adding the `TektonCompiler` and the `TektonClient`:
`kfp_tekton.compiler` includes classes and methods for compiling pipeline Python DSL into a Tekton `PipelineRun` YAML spec. The methods in this package include, but are not limited to, the following:
- `kfp_tekton.compiler.TektonCompiler.compile` compiles your Python DSL code into a single static configuration (in YAML format) that the Kubeflow Pipelines service can process. The Kubeflow Pipelines service converts the static configuration into a set of Kubernetes resources for execution.
`kfp_tekton.TektonClient` contains the Python client libraries for the Kubeflow Pipelines API. Methods in this package include, but are not limited to, the following:
- `kfp_tekton.TektonClient.upload_pipeline` uploads a local file to create a new pipeline in Kubeflow Pipelines.
- `kfp_tekton.TektonClient.create_experiment` creates a pipeline experiment and returns an experiment object.
- `kfp_tekton.TektonClient.run_pipeline` runs a pipeline and returns a run object.
- `kfp_tekton.TektonClient.create_run_from_pipeline_func` compiles a pipeline function and submits it for execution on Kubeflow Pipelines.
- `kfp_tekton.TektonClient.create_run_from_pipeline_package` runs a local pipeline package on Kubeflow Pipelines.
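As a minimal end-to-end sketch (assuming the KFP v1 DSL that `kfp-tekton` builds on; the two-step pipeline below and all names in it are hypothetical, not part of the SDK):

```python
import kfp.dsl as dsl
from kfp_tekton.compiler import TektonCompiler


def echo_op(text: str):
    # A trivial step that echoes its input inside a busybox container.
    return dsl.ContainerOp(
        name='echo',
        image='busybox',
        command=['sh', '-c'],
        arguments=['echo "%s"' % text],
    )


@dsl.pipeline(name='hello-pipeline', description='A hypothetical two-step pipeline.')
def hello_pipeline():
    hello = echo_op('hello')
    world = echo_op('world').after(hello)  # run the second step after the first


# Produces hello_pipeline.yaml, a Tekton PipelineRun spec.
TektonCompiler().compile(hello_pipeline, 'hello_pipeline.yaml')
```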
## Project Prerequisites

- Tekton CLI (`tkn`)
- Kubeflow Pipelines: KFP with Tekton backend
Follow the instructions for installing the project prerequisites and take note of some important caveats.

You can install the latest release of the `kfp-tekton` compiler from PyPI. We recommend creating a Python virtual environment first:
```
python3 -m venv .venv
source .venv/bin/activate
pip install kfp-tekton
```
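To verify the installation from Python (a quick check using the standard library's `importlib.metadata`, available in Python 3.8+):

```python
from importlib.metadata import version

# Prints the installed kfp-tekton package version.
print(version("kfp-tekton"))
```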
Alternatively, you can install the latest version of the `kfp-tekton` compiler from source by cloning the repository https://github.com/kubeflow/kfp-tekton:
```
git clone https://github.com/kubeflow/kfp-tekton.git
cd kfp-tekton
```
Set up a Python environment with Conda or a Python virtual environment:
```
python3 -m venv .venv
source .venv/bin/activate
```
Build the compiler:
```
pip install -e sdk/python
```
Run the compiler tests (optional):
```
pip install pytest
make test
```
## Compiling a Kubeflow Pipelines DSL Script

The `kfp-tekton` Python package comes with the `dsl-compile-tekton` command line executable, which should be available in your terminal shell environment after installing the `kfp-tekton` Python package.
If you cloned the `kfp-tekton` project, you can find example pipelines in the `samples` folder or under `sdk/python/tests/compiler/testdata`:

```
dsl-compile-tekton \
    --py sdk/python/tests/compiler/testdata/parallel_join.py \
    --output pipeline.yaml
```
Note: If the KFP DSL script contains a `__main__` method calling the `kfp_tekton.compiler.TektonCompiler.compile()` function:

```python
if __name__ == "__main__":
    from kfp_tekton.compiler import TektonCompiler
    TektonCompiler().compile(pipeline_func, "pipeline.yaml")
```

... then the pipeline can be compiled by running the DSL script with the `python3` executable from a command line shell, producing the Tekton YAML file `pipeline.yaml` in the same directory.
## Big data passing workspace configuration

When big data files are defined in KFP, Tekton will create a workspace to share these big data files among tasks that run in the same pipeline. By default, the workspace is a ReadWriteMany PVC with 2Gi of storage using the `kfp-csi-s3` storage class to push artifacts to S3. You can change this configuration using the environment variables below:
```
export DEFAULT_ACCESSMODES=ReadWriteMany
export DEFAULT_STORAGE_SIZE=2Gi
export DEFAULT_STORAGE_CLASS=kfp-csi-s3
```
To pass big data using cloud provider volumes, it is recommended to use the `volume_based_data_passing_method` for both the Tekton and Argo runtimes.
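As a sketch of the volume-based method, assuming the KFP v1 SDK's `data_passing_methods` and a pre-provisioned PVC named `my-artifact-pvc` (a placeholder):

```python
import kfp.dsl as dsl
from kfp.dsl import data_passing_methods
from kubernetes.client import V1Volume, V1PersistentVolumeClaimVolumeSource
from kfp_tekton.compiler import TektonCompiler

# Route all artifact data through a Kubernetes volume instead of S3.
pipeline_conf = dsl.PipelineConf()
pipeline_conf.data_passing_method = data_passing_methods.KubernetesVolume(
    volume=V1Volume(
        name='artifact-data',
        persistent_volume_claim=V1PersistentVolumeClaimVolumeSource(
            claim_name='my-artifact-pvc'),  # placeholder claim name
    ),
    path_prefix='artifact_data/',
)

# pipeline_func stands for the @dsl.pipeline-decorated function to compile.
TektonCompiler().compile(pipeline_func, 'pipeline.yaml', pipeline_conf=pipeline_conf)
```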
If you want to change the images used to copy input and output artifacts, please modify the following environment variables:
```
export TEKTON_BASH_STEP_IMAGE=busybox                 # image for copying input and output artifacts
export TEKTON_COPY_RESULTS_STEP_IMAGE=library/bash    # image for copying output results
export CONDITION_IMAGE_NAME=python:3.9.17-alpine3.18  # default image for condition tasks
```
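Since these variables are read from the compiler's environment at compile time, they can also be set from within Python (a sketch; depending on when the compiler module reads them, they may need to be set before importing `kfp_tekton`):

```python
import os

# Must be set before the compiler reads them; the image values are just examples.
os.environ['TEKTON_BASH_STEP_IMAGE'] = 'busybox'
os.environ['TEKTON_COPY_RESULTS_STEP_IMAGE'] = 'library/bash'
os.environ['CONDITION_IMAGE_NAME'] = 'python:3.9.17-alpine3.18'
```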
## Running the Compiled Pipeline on a Tekton Cluster

After compiling the `sdk/python/tests/compiler/testdata/parallel_join.py` DSL script in the step above, we need to deploy the generated Tekton YAML to the Kubeflow Pipelines engine.
You can run the pipeline directly using a pre-compiled file and the KFP-Tekton SDK. For more details, please look at the KFP-Tekton SDK user guide documentation:
```python
from kfp_tekton import TektonClient

# Assumes KFP_HOST, EXPERIMENT_NAME, and KUBEFLOW_PROFILE_NAME are defined for your deployment.
client = TektonClient(host=KFP_HOST)
experiment = client.create_experiment(name=EXPERIMENT_NAME, namespace=KUBEFLOW_PROFILE_NAME)
run = client.run_pipeline(experiment.id, 'parallel-join-pipeline', 'pipeline.yaml')
```
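Alternatively, `TektonClient.create_run_from_pipeline_func` compiles a pipeline function and submits it in a single call, skipping the separate compile-and-upload steps (a sketch; `pipeline_func` stands for a pipeline function from your DSL script, and `KFP_HOST` is a placeholder for your KFP API endpoint):

```python
from kfp_tekton import TektonClient

client = TektonClient(host=KFP_HOST)  # KFP_HOST is a placeholder, not a real endpoint
run = client.create_run_from_pipeline_func(
    pipeline_func,                    # the @dsl.pipeline-decorated function to run
    arguments={},
    experiment_name=EXPERIMENT_NAME,
    namespace=KUBEFLOW_PROFILE_NAME,
)
```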
You can also deploy directly on a Tekton cluster with `kubectl`. The Tekton server will automatically start a pipeline run. We can then follow the logs using the `tkn` CLI:
```
kubectl apply -f pipeline.yaml
tkn pipelinerun logs --last --follow
```
Once the Tekton Pipeline is running, the logs should start streaming:
```
Waiting for logs to be available...
[gcs-download : main] With which he yoketh your rebellious necks Razeth your cities and subverts your towns And in a moment makes them desolate

[gcs-download-2 : main] I find thou art no less than fame hath bruited And more than may be gatherd by thy shape Let my presumption not provoke thy wrath

[echo : main] Text 1: With which he yoketh your rebellious necks Razeth your cities and subverts your towns And in a moment makes them desolate
[echo : main]
[echo : main] Text 2: I find thou art no less than fame hath bruited And more than may be gatherd by thy shape Let my presumption not provoke thy wrath
[echo : main]
```
## List of Available Features

To understand how each feature is implemented and its current status, please visit the FEATURES doc.
## List of Helper Functions for Python Kubernetes Client

KFP-Tekton provides a list of common Kubernetes client helper functions to simplify the process of creating certain Kubernetes resources. Please visit the K8S_CLIENT_HELPER doc for more details.
## Tested Pipelines

We are testing the compiler on more than 80 pipelines found in the Kubeflow Pipelines repository, specifically the pipelines in the KFP compiler `testdata` folder, the KFP core samples, and the samples contributed by third parties. A report card of the Kubeflow Pipelines samples that are currently supported by the `kfp-tekton` compiler can be found here.
If you work on a PR that enables another of the missing features, please ensure that your code changes improve the number of successfully compiled KFP pipeline samples.
## Troubleshooting

- When you encounter ServiceAccount related permission issues, refer to the "Service Account and RBAC" doc.
- If you run into the error `bad interpreter: No such file or director` when trying to use Python's venv, remove the current virtual environment in the `.venv` directory and create a new one using `python3 -m venv .venv`.