Skip to main content

A plugin for Apache Airflow to interact with Microsoft Fabric items with small custom change

Project description

Apache Airflow Plugin for Microsoft Fabric Plugin. 🚀

Introduction

A Python package that helps Data and Analytics engineers trigger run on demand job items of Microsoft Fabric in Apache Airflow DAGs.

Microsoft Fabric is an end-to-end analytics and data platform designed for enterprises that require a unified solution. It encompasses data movement, processing, ingestion, transformation, real-time event routing, and report building. It offers a comprehensive suite of services including Data Engineering, Data Factory, Data Science, Real-Time Analytics, Data Warehouse, and Databases.

How to Use

Prerequisities

Before diving in,

  • The plugin supports the authentication using user tokens. Tenant level admin account must enable the setting Allow user consent for apps. Refer to: Configure user consent
  • Create a Microsoft Entra Id app if you don’t have one. Refer to: Doc
  • You must have Refresh token.

Since custom connection forms aren't feasible in Apache Airflow plugins, use can use Generic connection type. Here's what you need to store:

  1. Connection Id: Name of the connection Id
  2. Connection Type: Generic
  3. Login: The Client ID of your service principal.
  4. Password: The refresh token fetched using Microsoft OAuth.
  5. Extra: { "tenantId": The Tenant Id of your service principal. }

Operators

FabricRunItemOperator

This operator composes the logic for this plugin. It triggers the Fabric item run and pushes the details in Xcom. It can accept the following parameters:

  • workspace_id: The workspace Id.
  • item_id: The Item Id. i.e Notebook and Pipeline.
  • fabric_conn_id: Connection Id for Fabric.
  • job_type: "RunNotebook" or "Pipeline".
  • wait_for_termination: (Default value: True) Wait until the run item.
  • timeout: Time in seconds to wait for the pipeline or notebook. Used only if wait_for_termination is True.
  • check_interval: Boolean. Number of seconds to wait before rechecking the refresh status.
  • deferrable: Boolean. Use the operator in deferrable mode.

Features

  • Refresh token rotation:

    Refresh token rotation is a security mechanism that involves replacing the refresh token each time it is used to obtain a new access token. This process enhances security by reducing the risk of stolen tokens being reused indefinitely.

  • Xcom Integration:

    The Fabric run item enriches the Xcom with essential fields for downstream tasks:

    1. run_id: Run Id of the Fabric item.
    2. run_status: Fabric item run status.
      • In Progress: Item run is in progress.
      • Completed: Item run successfully completed.
      • Failed: Item run failed.
      • Disabled: Item run is disabled by a selective refresh.
    3. run_location: The location of item run status.
  • External Monitoring link:

    The operator conveniently provides a redirect link to the Microsoft Fabric item run.

  • Deferable Mode:

    The operator runs in deferrable mode. The operator is deferred until the target status of the item run is achieved.

Sample DAG to use the plugin.

Ready to give it a spin? Check out the sample DAG code below:

from __future__ import annotations

from airflow import DAG
from apache_airflow_microsoft_fabric_plugin.operators.fabric import FabricRunItemOperator
from airflow.utils.dates import days_ago

default_args = {
    "owner": "airflow",
    "start_date": days_ago(1),
}

with DAG(
    dag_id="fabric_items_dag",
    default_args=default_args,
    schedule_interval="@daily",
    catchup=False,
) as dag:

    run_notebook = FabricRunItemOperator(
        task_id="run_fabric_notebook",
        workspace_id="<workspace_id>",
        item_id="<item_id>",
        fabric_conn_id="fabric_conn_id",
        job_type="RunNotebook",
        wait_for_termination=True,
        deferrable=True,
    )

    run_notebook

Feel free to tweak and tailor this DAG to suit your needs!

🌟 Please feel free to share any thoughts or suggestions you have.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

File details

Details for the file apache_airflow_microsoft_fabric_plugin_custom-1.0.3.tar.gz.

File metadata

File hashes

Hashes for apache_airflow_microsoft_fabric_plugin_custom-1.0.3.tar.gz
Algorithm Hash digest
SHA256 f9f43b1fadbb5390fbafb06eaef3aa1f67c06ac5738026012a60753e27bf7558
MD5 f413b4b3359cdcbd2900c61057e8773a
BLAKE2b-256 e612b19ec8659cf2260b10315a6f38889d6b8ddb5a84383d461cbfbf541039c0

See more details on using hashes here.

File details

Details for the file apache_airflow_microsoft_fabric_plugin_custom-1.0.3-py3-none-any.whl.

File metadata

File hashes

Hashes for apache_airflow_microsoft_fabric_plugin_custom-1.0.3-py3-none-any.whl
Algorithm Hash digest
SHA256 b1b5948322c7b911de29181a2b0a082731b06441a6a51a6ed348a95d8bde1868
MD5 54ca2afa8fa543aa3adb70642284b8c3
BLAKE2b-256 3a846ab1ef602d8e48268273515b64be58a9222eb7c27310c50d1d8b4fed698f

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page