Lakehouse-NS: a library providing the Lakehouse Framework
Project description
Lakehouse-NS gives you a simple framework to implement your lakehouse based on the Medallion Architecture.
- Currently, the framework supports the Bronze and Silver layers
- It currently supports Spark (tested with Spark 4.0); more engines such as Daft or Polars are on the backlog
- Currently, it supports Delta Lake as the lakehouse format
- The framework will also be extended step by step with more baseline logic
Some important links:
- "Homepage" = "https://github.com/datanikkthegreek/lakehouse-docu"
- "API Reference" = "https://datanikkthegreek.github.io/lakehouse-docu/"
- "Samples" = "https://github.com/datanikkthegreek/lakehouse-docu/tree/main/samples"
- "Source" = "https://github.com/datanikkthegreek/lakehouse"
- "Issues" = "https://github.com/datanikkthegreek/lakehouse-docu/issues"
- "Project Planning" = "https://github.com/users/datanikkthegreek/projects/1/views/1"
- "Get in touch" = "https://www.linkedin.com/in/dr-nikolaos-servos-nikk-the-greek-a29137b3/"
1. Set-Up
Requires one of the following to be installed:
- pyspark and delta-spark
- Databricks Connect
- Spark and Delta Connect
- default spark session on Databricks or Fabric
```shell
pip install lakehouse-ns
```
You also need a catalog set up, along with your bronze and silver schema(s).
That's already it!
2. Get Started
Just import the Bronze and Silver classes and override the load or transform functions. That's it.
```python
from lakehouse import bronze, silver

spark = <Your Spark Session>

# Create your schemas
spark.sql("CREATE SCHEMA IF NOT EXISTS <catalog>.<schema>")

options = {
    "catalog": "<catalog>",
    "target_schema": "<schema>",
}

class StarWarsBronze(bronze.BronzeOverwrite):
    def load(self, table):
        return spark.read.format("SWAPI").load(table)

bronze_instance = StarWarsBronze(spark, **options)
bronze_instance.execute_one("people")
```
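The Silver layer follows the same pattern, overriding transform instead of (or in addition to) load. Since the exact Silver API is not shown here, the following is a minimal sketch using a hypothetical stand-in base class to illustrate the override flow; it is NOT the real lakehouse-ns class hierarchy, so consult the API reference for the actual names:

```python
# Stand-in base class illustrating the override pattern only (NOT the
# real lakehouse-ns API): the framework loads a table, applies the
# transform, and (in the real library) writes the result to the target.
class SilverBaseSketch:
    def __init__(self, spark, **options):
        self.spark = spark
        self.options = options  # required and custom options land here

    def load(self, table):
        raise NotImplementedError

    def transform(self, df):
        return df  # default: pass the data through unchanged

    def execute_one(self, table):
        # load the bronze data, then apply the transform
        return self.transform(self.load(table))


class StarWarsSilver(SilverBaseSketch):
    def load(self, table):
        # The real framework would read <catalog>.<source_schema>.<table>;
        # toy rows stand in for a Spark DataFrame here.
        return [{"name": "Luke Skywalker"}, {"name": None}]

    def transform(self, rows):
        # Typical silver-layer cleanup: drop rows without a name
        return [row for row in rows if row["name"] is not None]


silver_instance = StarWarsSilver(
    None, catalog="spark_catalog", source_schema="bronze", target_schema="silver"
)
people = silver_instance.execute_one("people")
print(people)  # → [{'name': 'Luke Skywalker'}]
```

The point of the sketch is the division of labor: Bronze classes override load (how raw data enters the lakehouse), Silver classes override transform (how it is cleaned), and the framework drives the rest.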
See detailed samples here: https://github.com/datanikkthegreek/lakehouse-docu/tree/main/samples
3. Options
You can/must pass the following options to the Bronze and Silver classes. In addition, you can specify any custom options, which you can access via self.options in your class.
| Option | Description | Type | Default | Bronze | Silver |
|---|---|---|---|---|---|
| catalog | The name of the catalog, e.g. spark_catalog, hive_metastore or any other custom catalog | String | To be defined | Required | Required |
| source_schema | The schema from which the data is loaded | String | To be defined | Not Required | Required |
| target_schema | The schema to which the data are written | String | To be defined | Required | Required |
| merge_schema | Whether the schema should be automatically evolved/merged | Boolean | FALSE | Optional | Optional |
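For example, a Silver run might be configured as follows. All values are placeholders, and `page_size` is a hypothetical custom option used only to illustrate the pass-through behavior, not part of the framework:

```python
options = {
    # required by the framework
    "catalog": "spark_catalog",
    "source_schema": "bronze",
    "target_schema": "silver",
    # optional: evolve/merge the target schema automatically
    "merge_schema": True,
    # any extra key is a custom option, later readable as
    # self.options["page_size"] inside your class
    "page_size": 50,
}
```

Per the description above, custom keys are not special-cased: they ride along with the required options and come back out of self.options.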
File details
Details for the file lakehouse_ns-0.1.1-py311-none-any.whl.
File metadata
- Download URL: lakehouse_ns-0.1.1-py311-none-any.whl
- Upload date:
- Size: 24.7 kB
- Tags: Python 3.11
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.11.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 185120e0ac473ad614b377eb4a1f4490c0c856d3ce0bea03b0285e56dff5a633 |
| MD5 | 5d1ac7f6861868b9543dccc378b1eacf |
| BLAKE2b-256 | a642533f727f6091d9ab0a7691c21b452dd5eb4f7b7dff5fea595d3b1f662b7c |