
DLT-META Framework


DLT-META

Documentation | Release Notes | Examples




Project Overview

DLT-META is a metadata-driven framework designed to work with Delta Live Tables. This framework enables the automation of bronze and silver data pipelines by leveraging metadata recorded in an onboarding JSON file. This file, known as the Dataflowspec, serves as the data flow specification, detailing the source and target metadata required for the pipelines.

In practice, a single generic DLT pipeline reads the Dataflowspec and uses it to orchestrate and run the necessary data processing workloads. This approach streamlines the development and management of data pipelines, allowing for a more efficient and scalable data processing workflow.
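As a rough illustration of the metadata-driven idea, the sketch below parses a small onboarding file into one typed record per data flow. The field names (`data_flow_id`, `source_format`, `source_path`, `bronze_table`, `silver_table`), the sample values, and the `FlowSpec` and `load_specs` names are assumptions made for this example and should not be read as DLT-META's actual Dataflowspec schema; refer to the project documentation for the real format.

```python
import json
from dataclasses import dataclass

# Hypothetical onboarding content; the real Dataflowspec schema may differ.
ONBOARDING_JSON = """
[
  {"data_flow_id": "100", "source_format": "cloudFiles",
   "source_path": "/demo/customers", "bronze_table": "customers",
   "silver_table": "customers_clean"},
  {"data_flow_id": "101", "source_format": "delta",
   "source_path": "/demo/orders", "bronze_table": "orders",
   "silver_table": "orders_clean"}
]
"""


@dataclass(frozen=True)
class FlowSpec:
    """One data flow: where the data comes from and where it lands."""
    data_flow_id: str
    source_format: str
    source_path: str
    bronze_table: str
    silver_table: str


def load_specs(text: str) -> list:
    """Parse onboarding JSON into one FlowSpec per data flow."""
    return [FlowSpec(**entry) for entry in json.loads(text)]


specs = load_specs(ONBOARDING_JSON)
for spec in specs:
    print(f"{spec.data_flow_id}: {spec.source_path} -> "
          f"{spec.bronze_table} -> {spec.silver_table}")
```

Adding a new source in this model means adding one more JSON entry rather than writing a new pipeline, which is the property the overview describes.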

Components:

Metadata Interface

Generic DLT pipeline

  • Apply appropriate readers based on input metadata
  • Apply data quality rules with DLT expectations
  • Apply CDC (apply changes) if specified in metadata
  • Build the DLT graph based on input/output metadata
  • Launch the DLT pipeline
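The responsibilities listed above can be sketched as a function that turns one metadata entry into an ordered list of steps. This is a plain-Python illustration only: it does not call the DLT API, and the `plan_steps` function and the keys `source_format`, `target_table`, `expectations`, and `cdc` are hypothetical names chosen for the example, not DLT-META's real configuration.

```python
from typing import Any, Dict, List


def plan_steps(spec: Dict[str, Any]) -> List[str]:
    """Return the ordered steps a generic pipeline might run for one spec."""
    # The reader is chosen from the input metadata.
    steps = [f"read:{spec['source_format']}"]
    # Data quality rules are optional; each one becomes an expectation step.
    for name in sorted(spec.get("expectations", {})):
        steps.append(f"expect:{name}")
    # CDC is applied only when the metadata asks for it.
    cdc = spec.get("cdc")
    if cdc:
        steps.append(f"apply_changes:keys={','.join(cdc['keys'])}")
    steps.append(f"write:{spec['target_table']}")
    return steps


# Hypothetical metadata entry with one quality rule and CDC enabled.
example_spec = {
    "source_format": "json",
    "target_table": "customers",
    "expectations": {"valid_id": "id IS NOT NULL"},
    "cdc": {"keys": ["id"]},
}
print(plan_steps(example_spec))
```

A spec without `expectations` or `cdc` yields only the read and write steps, so optional behavior stays entirely in the metadata.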

High-Level Process Flow:

[Figure: DLT-META High-Level Process Flow]

Steps

[Figure: DLT-META Stages]

Getting Started

Refer to the Getting Started guide in the documentation.

The Databricks Labs DLT-META CLI lets you run the onboard and deploy commands from an interactive Python terminal.

Prerequisites:

  • Python 3.8.0+

  • Databricks CLI v0.213 or later. See instructions

  • Install Databricks CLI on macOS (screenshot: macos_install_databricks)

  • Install Databricks CLI on Windows (screenshot: windows_install_databricks.png)
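The version requirements above can be checked with a short script. The sketch below is one possible approach, not part of DLT-META: `parse_version` and `cli_is_supported` are hypothetical helpers, and the sample string `Databricks CLI v0.213.0` is an assumed shape for the CLI's version output.

```python
import re
import sys
from typing import Tuple

MIN_PYTHON = (3, 8, 0)   # from the prerequisites list
MIN_CLI = (0, 213)       # Databricks CLI v0.213 or later


def parse_version(text: str) -> Tuple[int, ...]:
    """Extract the first dotted version number in text as a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def cli_is_supported(version_output: str) -> bool:
    """True if the reported CLI version meets the documented minimum."""
    return parse_version(version_output) >= MIN_CLI


print("Python OK:", sys.version_info[:3] >= MIN_PYTHON)
# Assumed output format; replace with the text your CLI actually reports.
print("CLI OK:", cli_is_supported("Databricks CLI v0.213.0"))
```

Comparing integer tuples avoids the string-comparison trap in which `0.18` would sort after `0.213`.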

Once the Databricks CLI is installed, authenticate your current machine to a Databricks workspace:

    databricks auth login --host WORKSPACE_HOST

To enable debug logs, add the `--debug` flag to any command.

Installing dlt-meta:

  • Install dlt-meta via Databricks CLI:
    databricks labs install dlt-meta

Onboard using dlt-meta CLI:

To run the existing demo files, set up a local clone first and then run the onboard command:

    git clone https://github.com/databrickslabs/dlt-meta.git
    cd dlt-meta
    python -m venv .venv
    source .venv/bin/activate
    pip install databricks-sdk
    dlt_meta_home=$(pwd)
    export PYTHONPATH=$dlt_meta_home
    databricks labs dlt-meta onboard

[Figure: onboardingDLTMeta.gif]

The commands above prompt you for onboarding details. If you have cloned the dlt-meta Git repo, accept the defaults, which load the configuration from the demo folder.

[Figure: onboardingDLTMeta_2.gif]

  • Go to your Databricks workspace and locate the onboarding job under Workflows -> Job runs

Deploy using dlt-meta CLI:

  • Once the onboarding job has finished, deploy the bronze and silver DLT pipelines using the command below, running it once for each layer.

  • Bronze DLT
        databricks labs dlt-meta deploy

    • The command prompts you for DLT details. Provide the details for the schema you specified in the steps above.

[Figure: deployingDLTMeta_bronze.gif]

  • Silver DLT
        databricks labs dlt-meta deploy

    • The command prompts you for DLT details. Provide the details for the schema you specified in the steps above.

[Figure: deployingDLTMeta_silver.gif]

More questions

Refer to the FAQ and the DLT-META documentation.

Project Support

Please note that all projects released under Databricks Labs are provided for your exploration only, and are not formally supported by Databricks with Service Level Agreements (SLAs). They are provided AS-IS and we do not make any guarantees of any kind. Please do not submit a support ticket relating to any issues arising from the use of these projects.

Any issues discovered through the use of this project should be filed as issues on the GitHub repo.
They will be reviewed as time permits, but there are no formal SLAs for support.

