🧱 Fabricks: Simplifying Databricks Data Pipelines

Fabricks (Framework for Databricks) is a Python framework designed to streamline the creation of lakehouses in Databricks. It offers a standardized approach to defining and managing data processing workflows, making it easier to build and maintain robust data pipelines.

🌟 Key Features

  • 📄 YAML Configuration: Easy-to-modify workflow definitions
  • 🔍 SQL-Based Business Logic: Familiar and powerful data processing
  • 🔄 Version Control: Track changes and roll back when needed
  • 🔌 Seamless Data Source Integration: Effortlessly add new sources
  • 📊 Change Data Capture (CDC): Track and handle data changes over time
  • 🔧 Flexible Schema Management: Drop and create tables as needed

🚀 Getting Started

📦 Installation

  1. Navigate to your Databricks workspace
  2. Select your target cluster
  3. Click on the Libraries tab
  4. Choose Install New
  5. Select PyPI as the library source
  6. Enter fabricks in the package text box
  7. Click Install

Once installed, import Fabricks in your notebooks or scripts:

import fabricks
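
To confirm that the library actually landed on the cluster, a quick check with the standard library can help. This is a minimal sketch: `installed_version` is a hypothetical helper name, and it relies only on `importlib.metadata`, not on any Fabricks API.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the installed version string, or None if the install did not succeed.
print(installed_version("fabricks"))
```

If this prints `None`, the library is probably not attached to the cluster the notebook is running on.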

🏗️ Project Configuration

🔧 Runtime Configuration

Define your Fabricks runtime in a YAML file. Here's a basic structure:

name: MyFabricksProject
options:
  secret_scope: my_secret_scope
  timeout: 3600
  workers: 4
path_options:
  storage: /mnt/data
spark_options:
  sql:
    option1: value1
    option2: value2

# Pipeline Stages
bronze:
  name: Bronze Stage
  path_options:
    storage: /mnt/bronze
  options:
    option1: value1

silver:
  name: Silver Stage
  path_options:
    storage: /mnt/silver
  options:
    option1: value1

gold:
  name: Gold Stage
  path_options:
    storage: /mnt/gold
  options:
    option1: value1
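
The runtime file above mixes project-wide settings with one block per pipeline stage. The sketch below mirrors a trimmed version of it as a Python dict and collects the storage path of each stage. It only illustrates the layout: `stage_storage` is a hypothetical helper, and treating `bronze`, `silver`, and `gold` as the stage keys is an assumption taken from the example, not a description of how Fabricks reads the file.

```python
from typing import Dict

# Trimmed mirror of the runtime YAML shown above, written as a Python dict.
runtime = {
    "name": "MyFabricksProject",
    "options": {"secret_scope": "my_secret_scope", "timeout": 3600, "workers": 4},
    "path_options": {"storage": "/mnt/data"},
    "bronze": {"name": "Bronze Stage", "path_options": {"storage": "/mnt/bronze"}},
    "silver": {"name": "Silver Stage", "path_options": {"storage": "/mnt/silver"}},
    "gold": {"name": "Gold Stage", "path_options": {"storage": "/mnt/gold"}},
}

STAGES = ("bronze", "silver", "gold")  # assumed stage keys, taken from the example


def stage_storage(config: dict) -> Dict[str, str]:
    """Map each configured stage to its storage path, failing early if one is missing."""
    paths: Dict[str, str] = {}
    for stage in STAGES:
        if stage not in config:
            continue  # a project may not define every stage
        try:
            paths[stage] = config[stage]["path_options"]["storage"]
        except KeyError as exc:
            raise ValueError(f"stage '{stage}' has no path_options.storage") from exc
    return paths


print(stage_storage(runtime))
```

A check like this catches a stage block that is missing its storage path before any job runs.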

🥉 Bronze Step

The bronze step is the initial stage, where raw data is ingested:

- job:
    step: bronze
    topic: sales_data
    item: daily_transactions
    tags: [raw, sales]
    options:
      mode: append
      uri: abfss://fabricks@$datahub/raw/sales
      parser: parquet
      keys: [transaction_id]
      source: pos_system
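
Every job example in this README shares the `step`, `topic`, and `item` keys. The sketch below checks a job entry for them before it is handed to the framework. Treating these three keys as required is an assumption drawn from the examples, and `missing_keys` is a hypothetical helper, not part of Fabricks.

```python
from typing import List

REQUIRED_KEYS = ("step", "topic", "item")  # assumption: keys shared by every example job


def missing_keys(entry: dict) -> List[str]:
    """Return the assumed-required keys that are absent from a `- job:` entry."""
    job = entry.get("job", {})
    return [key for key in REQUIRED_KEYS if key not in job]


# The bronze job from above, trimmed to a few of its options.
bronze_entry = {
    "job": {
        "step": "bronze",
        "topic": "sales_data",
        "item": "daily_transactions",
        "tags": ["raw", "sales"],
        "options": {"mode": "append", "parser": "parquet", "keys": ["transaction_id"]},
    }
}

print(missing_keys(bronze_entry))                 # []
print(missing_keys({"job": {"step": "bronze"}}))  # ['topic', 'item']
```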

🥈 Silver Step

The silver step is the intermediate stage, where data is processed:

- job:
    step: silver
    topic: sales_analytics
    item: daily_summary
    tags: [processed, sales]
    options:
      mode: update
      change_data_capture: scd1
      parents: [bronze.daily_transactions]
      extender: sales_extender
      check_options:
        max_rows: 1000000

🥇 Gold Step

The gold step is the final stage, where data is prepared for consumption:

- job:
    step: gold
    topic: sales_reports
    item: monthly_summary
    tags: [report, sales]
    options:
      mode: complete
      change_data_capture: scd2
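
The `scd1` and `scd2` values refer to slowly changing dimension types 1 and 2. In general terms, type 1 overwrites a record in place, while type 2 keeps every version with a validity range. The plain-Python sketch below illustrates that difference only. The `id`, `amount`, `valid_from`, and `valid_to` names are hypothetical, and this is not how Fabricks implements change data capture.

```python
from typing import Dict, List


def apply_scd1(table: Dict[int, dict], change: dict) -> None:
    """SCD type 1: overwrite the row for the key, keeping no history."""
    table[change["id"]] = {"amount": change["amount"]}


def apply_scd2(history: List[dict], change: dict, valid_from: str) -> None:
    """SCD type 2: close the current version of the key, then append a new one."""
    for row in history:
        if row["id"] == change["id"] and row["valid_to"] is None:
            row["valid_to"] = valid_from
    history.append(
        {
            "id": change["id"],
            "amount": change["amount"],
            "valid_from": valid_from,
            "valid_to": None,  # None marks the version that is currently valid
        }
    )


current: Dict[int, dict] = {}
history: List[dict] = []
for day, amount in [("2024-01-01", 100), ("2024-01-02", 120)]:
    change = {"id": 1, "amount": amount}
    apply_scd1(current, change)
    apply_scd2(history, change, valid_from=day)

print(current)       # one row holding only the latest amount
print(len(history))  # two rows: the closed version and the current one
```

Type 1 suits tables where only the latest state matters. Type 2 suits tables where past values must remain queryable.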

📚 Usage

Detailed usage instructions are still to be added.

📄 License

This project is licensed under the MIT License.


For more information, visit Fabricks Documentation 📚
