
airflow-google-cloud-run-plugin


Airflow plugin for orchestrating Google Cloud Run jobs.

Features

  1. Easier-to-use alternative to KubernetesPodOperator
  2. Securely use sensitive data stored in Google Cloud Secret Manager
  3. Create tasks with isolated dependencies
  4. Enable polyglot workflows

Resources

Core Operators

  1. CloudRunJobOperator

CRUD-Based Operators

  1. CloudRunCreateJobOperator
  2. CloudRunGetJobOperator 🔜
  3. CloudRunUpdateJobOperator 🔜
  4. CloudRunDeleteJobOperator
  5. CloudRunListJobsOperator 🔜

Hooks

  1. CloudRunJobHook

Sensors

  1. CloudRunJobExecutionSensor 🔜

Usage

Simple Job Lifecycle

from airflow import DAG

from airflow_google_cloud_run_plugin.operators.cloud_run import CloudRunJobOperator

with DAG(dag_id="example_dag") as dag:
  job = CloudRunJobOperator(
    task_id="example-job",
    name="example-job",
    location="us-central1",
    project_id="example-project",
    image="gcr.io/gcp-runtimes/ubuntu_18_0_4",
    command=["echo"],
    cpu="1000m",
    memory="512Mi",
    create_if_not_exists=True,
    delete_on_exit=True
  )
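The cpu and memory values above are Kubernetes-style quantity strings: "1000m" is one CPU expressed in millicores, and "512Mi" is 512 mebibytes. A minimal standalone checker (hypothetical, not part of the plugin) sketches the accepted shapes:

```python
import re

# Kubernetes-style quantities: CPU as millicores ("500m", "1000m") or a
# bare core count ("1"); memory with a binary suffix ("512Mi", "2Gi").
_CPU_RE = re.compile(r"^\d+m$|^\d+$")
_MEMORY_RE = re.compile(r"^\d+(Ki|Mi|Gi)$")

def is_valid_cpu(value: str) -> bool:
    """Return True for quantities like '1000m' or '2'."""
    return bool(_CPU_RE.match(value))

def is_valid_memory(value: str) -> bool:
    """Return True for quantities like '512Mi' or '1Gi'."""
    return bool(_MEMORY_RE.match(value))
```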

CRUD Job Lifecycle

from airflow import DAG

from airflow_google_cloud_run_plugin.operators.cloud_run import (
  CloudRunJobOperator,
  CloudRunCreateJobOperator,
  CloudRunDeleteJobOperator,
)

with DAG(dag_id="example_dag") as dag:
  create_job = CloudRunCreateJobOperator(
    task_id="create",
    name="example-job",
    location="us-central1",
    project_id="example-project",
    image="gcr.io/gcp-runtimes/ubuntu_18_0_4",
    command=["echo"],
    cpu="1000m",
    memory="512Mi"
  )

  run_job = CloudRunJobOperator(
    task_id="run",
    name="example-job",
    location="us-central1",
    project_id="example-project"
  )

  delete_job = CloudRunDeleteJobOperator(
    task_id="delete",
    name="example-job",
    location="us-central1",
    project_id="example-project"
  )

  create_job >> run_job >> delete_job
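The final line uses Airflow's standard bitshift composition: `a >> b` records that b runs downstream of a. A toy sketch of the idea (not the real Airflow implementation) shows why the chain composes left to right:

```python
class Task:
    """Minimal stand-in for an Airflow operator, recording downstream edges."""

    def __init__(self, task_id):
        self.task_id = task_id
        self.downstream = []

    def __rshift__(self, other):
        # a >> b makes b run after a; returning b lets chains compose
        self.downstream.append(other)
        return other

create = Task("create")
run = Task("run")
delete = Task("delete")
create >> run >> delete  # same shape as create_job >> run_job >> delete_job
```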

Using Environment Variables

from airflow import DAG

from airflow_google_cloud_run_plugin.operators.cloud_run import CloudRunJobOperator

# Simple environment variable
FOO = {
  "name": "FOO",
  "value": "not_so_secret_value_123"
}

# Environment variable from Secret Manager
BAR = {
  "name": "BAR",
  "valueFrom": {
    "secretKeyRef": {
      "name": "super_secret_password",
      "key": "1"  # or "latest" for latest secret version
    }
  }
}

with DAG(dag_id="example_dag") as dag:
  job = CloudRunJobOperator(
    task_id="example-job",
    name="example-job",
    location="us-central1",
    project_id="example-project",
    image="gcr.io/gcp-runtimes/ubuntu_18_0_4",
    command=["echo"],
    args=["$FOO", "$BAR"],
    env_vars=[FOO, BAR],
    cpu="1000m",
    memory="512Mi",
    create_if_not_exists=True,
    delete_on_exit=True
  )
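The env_vars entries follow the Cloud Run container contract: either a literal value, or a valueFrom.secretKeyRef naming a Secret Manager secret and version. A small builder (a hypothetical helper, not part of the plugin) can cut down the dict boilerplate:

```python
def plain_env(name, value):
    """Literal environment variable entry."""
    return {"name": name, "value": value}

def secret_env(name, secret, version="latest"):
    """Environment variable resolved from Secret Manager at run time."""
    return {
        "name": name,
        "valueFrom": {"secretKeyRef": {"name": secret, "key": version}},
    }

# Equivalent to the FOO and BAR dicts above
env_vars = [
    plain_env("FOO", "not_so_secret_value_123"),
    secret_env("BAR", "super_secret_password", version="1"),
]
```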

Improvement Suggestions

  • Add support for Cloud Run services
  • Nicer user experience for defining args and commands
  • Use the approach from other GCP operators once https://github.com/googleapis/python-run/issues/64 is resolved
  • Add operators for all CRUD operations
  • Add run sensor (see link)
  • Enable volume mounts (see TaskSpec)
  • Allow user to configure resource requirement requests (see ResourceRequirements)
  • Add remaining container options (see Container)
  • Allow non-default credentials and let the user specify a service account (see link)
  • Allow a failure threshold: if more than one task is specified, the user should be allowed to specify the number of failures allowed
  • Add custom links for log URIs
  • Add a wrapper class for easier environment variable definition, similar to Secret from the Kubernetes provider (see link)
  • Add slight time padding between job create and run
  • Add ability to choose to replace the job with new config values if values have changed
