🧱 Fabricks: Simplifying Databricks Data Pipelines

Fabricks (Framework for Databricks) is a Python framework designed to streamline the creation of lakehouses in Databricks. It offers a standardized approach to defining and managing data processing workflows, making it easier to build and maintain robust data pipelines.

🌟 Key Features

  • 📄 YAML Configuration: Easy-to-modify workflow definitions
  • 🔍 SQL-Based Business Logic: Familiar and powerful data processing
  • 🔄 Version Control: Track changes and roll back when needed
  • 🔌 Seamless Data Source Integration: Effortlessly add new sources
  • 📊 Change Data Capture (CDC): Track and handle data changes over time
  • 🔧 Flexible Schema Management: Drop and create tables as needed

🚀 Getting Started

📦 Installation

  1. Navigate to your Databricks workspace
  2. Select your target cluster
  3. Click on the Libraries tab
  4. Choose Install New
  5. Select PyPI as the library source
  6. Enter fabricks in the package text box
  7. Click Install

Once installed, import Fabricks in your notebooks or scripts:

import fabricks
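
Before importing, it can help to confirm that the library is actually attached to the cluster. The following is a minimal sketch that uses only the Python standard library; the helper name `installed_version` is hypothetical and is not part of Fabricks.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("fabricks")
    if version is None:
        print("fabricks is not installed in this environment")
    else:
        print(f"fabricks {version} is installed")
```

If the check reports that the package is missing, repeating the installation steps above on the correct cluster is the likely fix.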

🏗️ Project Configuration

🔧 Runtime Configuration

Define your Fabricks runtime in a YAML file. Here's a basic structure:

name: MyFabricksProject
options:
  secret_scope: my_secret_scope
  timeout: 3600
  workers: 4
path_options:
  storage: /mnt/data
spark_options:
  sql:
    option1: value1
    option2: value2

# Pipeline Stages
bronze:
  name: Bronze Stage
  path_options:
    storage: /mnt/bronze
  options:
    option1: value1

silver:
  name: Silver Stage
  path_options:
    storage: /mnt/silver
  options:
    option1: value1

gold:
  name: Gold Stage
  path_options:
    storage: /mnt/gold
  options:
    option1: value1
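
To make the layout of the runtime file concrete, here is a small sketch that walks a parsed configuration and collects the storage path of each stage. It assumes the YAML has already been loaded into a plain dict (for example with a YAML parser such as PyYAML). The function `stage_storage_paths` and the rule "every top-level mapping that is not a runtime key is a stage" are assumptions made for illustration, not part of the Fabricks API.

```python
from typing import Any, Dict

# Top-level keys that configure the runtime itself in the example above.
# Treating every other top-level mapping as a stage is an assumption of this sketch.
RUNTIME_KEYS = {"name", "options", "path_options", "spark_options"}


def stage_storage_paths(config: Dict[str, Any]) -> Dict[str, str]:
    """Map each stage key (bronze, silver, gold, ...) to its storage path."""
    paths: Dict[str, str] = {}
    for key, section in config.items():
        if key in RUNTIME_KEYS or not isinstance(section, dict):
            continue
        storage = (section.get("path_options") or {}).get("storage")
        if storage is None:
            raise ValueError(f"stage '{key}' has no path_options.storage")
        paths[key] = storage
    return paths


# This dict mirrors the YAML example, in the shape a YAML parser would return.
example_config = {
    "name": "MyFabricksProject",
    "options": {"secret_scope": "my_secret_scope", "timeout": 3600, "workers": 4},
    "path_options": {"storage": "/mnt/data"},
    "spark_options": {"sql": {"option1": "value1", "option2": "value2"}},
    "bronze": {
        "name": "Bronze Stage",
        "path_options": {"storage": "/mnt/bronze"},
        "options": {"option1": "value1"},
    },
    "silver": {
        "name": "Silver Stage",
        "path_options": {"storage": "/mnt/silver"},
        "options": {"option1": "value1"},
    },
    "gold": {
        "name": "Gold Stage",
        "path_options": {"storage": "/mnt/gold"},
        "options": {"option1": "value1"},
    },
}

print(stage_storage_paths(example_config))
```

A check like this catches a stage that was added without a storage path before any job tries to write to it.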

🥉 Bronze Step

The initial stage for raw data ingestion:

- job:
    step: bronze
    topic: sales_data
    item: daily_transactions
    tags: [raw, sales]
    options:
      mode: append
      uri: abfss://fabricks@$datahub/raw/sales
      parser: parquet
      keys: [transaction_id]
      source: pos_system
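
The shape of a job entry can be illustrated with a small data class. This is only a sketch of the structure shown above: the class `JobSpec`, the choice of `step`, `topic`, and `item` as required keys, and the `KNOWN_MODES` set (taken from the modes used in this README) are assumptions, not the library's own model.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

# Modes that appear in this README's examples; the real set may be larger.
KNOWN_MODES = {"append", "update", "complete"}


@dataclass
class JobSpec:
    step: str
    topic: str
    item: str
    tags: List[str] = field(default_factory=list)
    options: Dict[str, Any] = field(default_factory=dict)

    @classmethod
    def from_entry(cls, entry: Dict[str, Any]) -> "JobSpec":
        """Build a JobSpec from one parsed `- job:` list entry."""
        job = entry["job"]
        missing = [k for k in ("step", "topic", "item") if k not in job]
        if missing:
            raise ValueError(f"job is missing required keys: {missing}")
        spec = cls(
            step=job["step"],
            topic=job["topic"],
            item=job["item"],
            tags=list(job.get("tags", [])),
            options=dict(job.get("options", {})),
        )
        mode = spec.options.get("mode")
        if mode is not None and mode not in KNOWN_MODES:
            raise ValueError(f"unexpected mode: {mode!r}")
        return spec


# The bronze entry above, in the shape a YAML parser would return.
bronze_entry = {
    "job": {
        "step": "bronze",
        "topic": "sales_data",
        "item": "daily_transactions",
        "tags": ["raw", "sales"],
        "options": {
            "mode": "append",
            "uri": "abfss://fabricks@$datahub/raw/sales",
            "parser": "parquet",
            "keys": ["transaction_id"],
            "source": "pos_system",
        },
    }
}

bronze_job = JobSpec.from_entry(bronze_entry)
print(bronze_job.step, bronze_job.topic, bronze_job.item)
```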

🥈 Silver Step

The intermediate stage for data processing:

- job:
    step: silver
    topic: sales_analytics
    item: daily_summary
    tags: [processed, sales]
    options:
      mode: update
      change_data_capture: scd1
      parents: [bronze.daily_transactions]
      extender: sales_extender
      check_options:
        max_rows: 1000000

🥇 Gold Step

The final stage for data consumption:

- job:
    step: gold
    topic: sales_reports
    item: monthly_summary
    tags: [report, sales]
    options:
      mode: complete
      change_data_capture: scd2

📚 Usage

Detailed usage instructions have not been written yet.

📄 License

This project is licensed under the MIT License.


For more information, visit Fabricks Documentation 📚
