A tool for generating end-to-end pipelines on GCP.

ML Pipeline Generator

ML Pipeline Generator is a tool for generating end-to-end pipelines composed of GCP components so that users can easily migrate their local ML models onto GCP and start realizing the benefits of the Cloud quickly.

The following ML frameworks will be supported:

  1. TensorFlow (TF)
  2. Scikit-learn (SKL)
  3. XGBoost (XGB)

The following backends are currently supported for model training:

  1. Google Cloud AI Platform
  2. AI Platform Pipelines (managed Kubeflow Pipelines)


GCP credentials

gcloud auth login
gcloud auth application-default login
gcloud config set project [PROJECT_ID]

Enabling required APIs

The tool requires the following Google Cloud APIs to be enabled:

  1. Compute Engine
  2. AI Platform Training and Prediction
  3. Cloud Storage

Enable the APIs above by following the links, or run the command below to enable them for your project.

gcloud services enable \ \

Python environment

python3 -m venv venv
source ./venv/bin/activate
pip install -r requirements.txt

Config file

Update the information in config.yaml. See the Input args section below for details on the config parameters.


Create a Kubeflow deployment using Cloud Marketplace. Follow these instructions to give the Kubeflow instance access to GCP services.

A future release will automate provisioning of KFP clusters and incorporate K8s Workload Identity for auth.

Cloud AI Platform Demo

This demo uses the scikit-learn model in examples/sklearn/ to create a training module to run on CAIP.

python -m examples.sklearn.demo

Running this demo uses the config file to generate bin/ along with trainer/ code. Then, run bin/ to train locally or bin/ cloud to train on Google Cloud AI Platform.

KFP Demo

This demo uses the scikit-learn model in examples/sklearn/ to create a Kubeflow Pipeline (hosted on AI Platform Pipelines).

python -m examples.kfp.demo
python -m orchestration.pipeline


Delete the generated files by running bin/


The tests use unittest, Python's built-in unit testing framework. Running python -m unittest performs test discovery to find all tests within this project. To run tests at a more granular level, pass a directory to test discovery. See the unittest documentation for details.

python -m unittest
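
As an illustration of the kind of test that discovery picks up, here is a minimal sketch. The helper merge_args and the test class are hypothetical and not part of this project; they only mirror the idea that values from the config file overwrite default args.

```python
import unittest


def merge_args(defaults, overrides):
    """Hypothetical helper: values from the config overwrite the defaults."""
    merged = dict(defaults)
    merged.update(overrides)
    return merged


class MergeArgsTest(unittest.TestCase):
    """Discovery finds TestCase subclasses in files matching test*.py."""

    def test_override_wins(self):
        merged = merge_args({"batch_size": 64}, {"batch_size": 128})
        self.assertEqual(merged["batch_size"], 128)

    def test_defaults_kept(self):
        merged = merge_args({"max_steps": 1000}, {})
        self.assertEqual(merged["max_steps"], 1000)


def run_suite():
    """Load and run only this test case, returning the TestResult."""
    suite = unittest.TestLoader().loadTestsFromTestCase(MergeArgsTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Saved under a name such as test_merge_args.py (an assumed filename), this would be found by python -m unittest. To limit discovery to one directory, use python -m unittest discover -s followed by that directory.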

Input args

The following input args are included by default. Overwrite them by adding them as inputs in the config file.

Arg                     Description
train_path              Dir or bucket containing train data.
eval_path               Dir or bucket containing eval data.
model_dir               Dir or bucket to save model files.
batch_size              Number of rows of data to be fed into the model each iteration.
max_steps               The maximum number of iterations to train the model for.
learning_rate           Multiplier that controls how much the weights of our network are adjusted with respect to the loss gradient.
export_format           File format expected by the exported model at inference time.
save_checkpoints_steps  Number of steps to run before saving a model checkpoint.
keep_checkpoint_max     Number of model checkpoints to keep.
log_step_count_steps    Number of steps to run before logging training performance.
eval_steps              Number of steps to use to evaluate the model.
early_stopping_steps    Number of steps with no loss decrease before stopping early.
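
To make the types of these args concrete, the sketch below declares them with argparse. It is an assumption-laden illustration, not the parser the generated trainer code actually uses: the function name build_parser, every default value, and the example bucket path are hypothetical.

```python
import argparse


def build_parser():
    """Declare the documented input args. All defaults here are hypothetical."""
    parser = argparse.ArgumentParser(description="Illustrative training args.")
    # Data and output locations (local dir or bucket).
    parser.add_argument("--train_path", help="Dir or bucket containing train data.")
    parser.add_argument("--eval_path", help="Dir or bucket containing eval data.")
    parser.add_argument("--model_dir", help="Dir or bucket to save model files.")
    # Training hyperparameters.
    parser.add_argument("--batch_size", type=int, default=64)
    parser.add_argument("--max_steps", type=int, default=1000)
    parser.add_argument("--learning_rate", type=float, default=0.001)
    # Export, checkpointing, logging, and evaluation controls.
    parser.add_argument("--export_format", default=None)
    parser.add_argument("--save_checkpoints_steps", type=int, default=100)
    parser.add_argument("--keep_checkpoint_max", type=int, default=2)
    parser.add_argument("--log_step_count_steps", type=int, default=100)
    parser.add_argument("--eval_steps", type=int, default=20)
    parser.add_argument("--early_stopping_steps", type=int, default=1000)
    return parser


# Explicit values override the (hypothetical) defaults.
args = build_parser().parse_args(
    ["--train_path", "gs://example-bucket/train.csv", "--batch_size", "128"]
)
print(args.batch_size, args.max_steps)
```

In the tool itself, overrides come from the config file rather than the command line; the command-line form is used here only to keep the example self-contained.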

