
DataLake tables management bundle for the Bricksflow Framework


Datalake bundle


This bundle provides everything you need to create and manage a Databricks-based DataLake(House):

  • Tools to simplify and automate table creation, updates, and migrations
  • Explicit table schema enforcement for Hive tables, CSVs, ...
  • Decorators for writing maintainable, self-documenting function-based notebooks
  • Rich configuration options for customizing naming standards, paths, and other settings to match your needs

Installation

Install the bundle via Poetry:

$ poetry add datalake-bundle

Usage

  1. Recommended notebooks structure
  2. Defining DataLake tables
  3. Using datalake-specific notebook functions
  4. Using table-specific configuration
  5. Tables management
  6. Parsing fields from table identifier
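
To give a feel for the last topic, here is a minimal sketch of extracting fields from a table identifier. It does not use the bundle's API: the naming convention (`<layer>_<domain>.<table>`), the `IDENTIFIER_PATTERN` regex, and the `parse_table_identifier` helper are all assumptions made up for illustration, and the bundle's own configuration may work differently.

```python
import re

# Hypothetical naming convention: "<layer>_<domain>.<table>".
# This pattern is an assumption for illustration, not the bundle's configuration.
IDENTIFIER_PATTERN = re.compile(
    r"^(?P<layer>[a-z]+)_(?P<domain>[a-z0-9]+)\.(?P<table>[a-z0-9_]+)$"
)


def parse_table_identifier(identifier: str) -> dict:
    """Split a table identifier into named fields, or raise ValueError."""
    match = IDENTIFIER_PATTERN.match(identifier)
    if match is None:
        raise ValueError(
            f"Identifier does not follow the naming convention: {identifier!r}"
        )
    # groupdict() returns the named groups as a plain dictionary.
    return match.groupdict()


fields = parse_table_identifier("bronze_sales.customers")
print(fields["layer"], fields["domain"], fields["table"])
```

Fields parsed this way could then drive things like storage paths or partitioning rules, so that a single identifier is the only place where the naming standard has to be spelled out.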


Download files


Source Distribution

datalake-bundle-0.5.0a13.tar.gz (14.9 kB)

Uploaded Source

Built Distribution

datalake_bundle-0.5.0a13-py3-none-any.whl (30.6 kB)

Uploaded Python 3
