
Orchestration Pipelines


A library for defining and generating Apache Airflow DAGs declaratively using YAML. Currently focused on orchestration of GCP resources (Dataproc, BigQuery, Dataform) and DBT.

> [!NOTE]
> This library is currently in Preview.

Overview

orchestration-pipelines allows you to define complex data workflows in simple, human-readable YAML files. It abstracts away the boilerplate of writing Airflow DAGs in Python, making it easier for non-Python experts to create and manage pipelines.

Supported Python Versions

Python >= 3.9

Features

  • Declarative DAGs: Define your pipeline structure, triggers, and actions in YAML.
  • Rich Actions Support: Built-in support for:
    • Python Scripts
    • Google Cloud BigQuery
    • Google Cloud Dataproc (serverless, ephemeral, and existing clusters)
    • Google Cloud Dataform
    • DBT
  • Automatic Generation: A simple Python call generates the full Airflow DAG.
  • Versioning: Pipeline versions can be managed via a manifest file (as of the Preview, on Google Cloud Composer).

Installation

You can install orchestration-pipelines from PyPI:

```shell
pip install orchestration-pipelines
```

> [!IMPORTANT]
> Ensure your apache-airflow-client version is fully compatible with your Airflow 3 environment to prevent DAG parsing or runtime errors. This package uses Airflow Client API calls to interact with the metadata database, and newer versions of apache-airflow-client introduce significant architectural changes; a version mismatch will likely break communication and disrupt your pipelines. Always verify that your client version aligns with your Airflow environment.
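As an illustrative sanity check (not part of this library), you can compare the installed versions of the two packages before deploying. The major-version rule below is an assumption for illustration; consult the Airflow compatibility matrix for the authoritative pairing. Only the stdlib `importlib.metadata` is used:

```python
from importlib.metadata import version, PackageNotFoundError


def same_major(a: str, b: str) -> bool:
    """Return True when two version strings share the same major version."""
    return a.split(".")[0] == b.split(".")[0]


def check_airflow_client_compat() -> bool:
    """Best-effort check that apache-airflow and apache-airflow-client
    share a major version. Returns False when either is not installed."""
    try:
        airflow_v = version("apache-airflow")
        client_v = version("apache-airflow-client")
    except PackageNotFoundError:
        return False
    return same_major(airflow_v, client_v)
```

Run `check_airflow_client_compat()` in the environment that parses your DAGs; a `False` result is worth investigating before deployment.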

Quick Start

1. Define your pipeline in YAML

Create a file named my_pipeline.yml:

```yaml
modelVersion: "1.0"
pipelineId: "my_pipeline"
description: "A simple example pipeline"
runner: "airflow"

defaults:
  projectId: "your-gcp-project"
  location: "us-central1"

triggers:
  - schedule:
      interval: "0 4 * * *"
      startTime: "2026-01-01T00:00:00"
      catchup: false

actions:
  - sql:
      name: "create_table"
      query:
        inline: "CREATE TABLE IF NOT EXISTS `your-gcp-project.my_dataset.my_table` (id INT64, name STRING);"
      engine:
        bigquery:
          location: "US"
```
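Before handing a definition to the generator, it can be useful to check its basic shape. The sketch below is not the library's real schema validation; the required keys simply mirror the Quick Start example, and the definition is shown as a plain dict so no YAML parser is needed:

```python
def validate_pipeline(spec: dict) -> list[str]:
    """Return a list of problems found in a parsed pipeline definition.
    The required keys mirror the Quick Start example; the library's
    actual schema may differ."""
    problems = []
    for key in ("modelVersion", "pipelineId", "runner", "actions"):
        if key not in spec:
            problems.append(f"missing required key: {key}")
    if not isinstance(spec.get("actions", []), list):
        problems.append("actions must be a list")
    return problems


spec = {
    "modelVersion": "1.0",
    "pipelineId": "my_pipeline",
    "runner": "airflow",
    "actions": [{"sql": {"name": "create_table"}}],
}
print(validate_pipeline(spec))  # []
```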

2. Generate the Airflow DAG

Create a Python file named my_pipeline.py in your Airflow DAGs folder:

```python
from orchestration_pipelines_lib.api import generate

# Generate an Airflow DAG from the pipeline definition file.
# The path root is the "dags" directory in the Composer bucket.
generate("my_pipeline.yml")
```

Airflow will parse this Python file and automatically generate the DAG based on your YAML definition.
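If a project keeps several YAML definitions in the DAGs folder, a small loader can call `generate` (the Quick Start import) once per file. The directory layout and the idea of passing the generator in as a callable are assumptions for illustration, not a documented feature of the library:

```python
from pathlib import Path
from typing import Callable


def generate_all(dags_root: str, generate_fn: Callable[[str], None]) -> list[str]:
    """Call generate_fn for every .yml pipeline definition under dags_root.
    Returns the processed file names in sorted order."""
    names = []
    for yml in sorted(Path(dags_root).glob("*.yml")):
        generate_fn(str(yml))
        names.append(yml.name)
    return names


# In an Airflow DAGs folder one would pass the library's generator:
# from orchestration_pipelines_lib.api import generate
# generate_all(".", generate)
```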

Advanced Features

Versioning and Manifests

You can manage multiple versions of your pipelines using a manifest.yaml file. This allows you to specify which version of a pipeline should be active.

See the examples/ directory for a sample manifest.yaml and how to use it.
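The real manifest schema is defined by the library and documented in the examples/ directory; the fragment below is only a hypothetical illustration of the idea, and every field name in it is an assumption:

```yaml
# Hypothetical manifest.yaml -- field names are illustrative only;
# consult the examples/ directory for the actual format.
pipelines:
  - pipelineId: "my_pipeline"
    version: "1.0.3"
    file: "my_pipeline-1.0.3.yml"
    active: true
  - pipelineId: "my_pipeline"
    version: "1.0.2"
    file: "my_pipeline-1.0.2.yml"
    active: false
```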

License

This project is licensed under the Apache 2.0 License - see the LICENSE file for details.
