fabrictools
User-friendly PySpark helpers for Microsoft Fabric — read, write, and merge Lakehouses and Warehouses with a single function call.
Features
- Auto-resolved paths — pass a Lakehouse or Warehouse name, no ABFS URL configuration required
- Auto-detected SparkSession — uses SparkSession.builder.getOrCreate(), so it works seamlessly inside Fabric notebooks
- Auto-detected format on read — tries Delta → Parquet → CSV automatically
- Delta merge (upsert) — one-liner upsert into any Lakehouse Delta table
- Built-in logging — every operation logs its resolved path, detected format, and row/column count
Requirements
- Microsoft Fabric Spark runtime (provides notebookutils, pyspark, and delta-spark)
- Python >= 3.9

Local development: install the spark extras to get PySpark and delta-spark. notebookutils is only available inside Fabric — functions that resolve paths will raise a clear ValueError outside Fabric.
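If you share notebooks between Fabric and a local machine, you can guard Fabric-only calls yourself. A minimal sketch — the import-spec check is one possible way to detect the environment, not part of fabrictools:

```python
import importlib.util

def inside_fabric() -> bool:
    # notebookutils is only importable on the Fabric Spark runtime,
    # so its presence is a reasonable proxy for "running in Fabric"
    return importlib.util.find_spec("notebookutils") is not None

if not inside_fabric():
    print("Not in Fabric: path-resolving calls would raise ValueError")
```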
Installation
# Inside a Fabric notebook or pipeline
pip install fabrictools
# Local development (includes PySpark + delta-spark)
pip install "fabrictools[spark]"
Quick start
import fabrictools as ft
Read a Lakehouse dataset
# Auto-detects Delta → Parquet → CSV
df = ft.read_lakehouse("BronzeLakehouse", "sales/2024")
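The Delta → Parquet → CSV fallback can be pictured as a simple try-in-order cascade. This is an illustrative sketch of the pattern, not fabrictools' actual implementation; the placeholder readers stand in for spark.read.format(...).load(path):

```python
def read_with_fallback(readers):
    """Try each (format_name, reader) pair in order; return the first success."""
    errors = {}
    for fmt, reader in readers:
        try:
            return fmt, reader()
        except Exception as exc:  # a real implementation would narrow this
            errors[fmt] = exc
    raise ValueError(f"No format could read the path: {errors}")

def fail(msg):
    raise IOError(msg)

# Delta and Parquet fail here, so the cascade falls through to CSV
fmt, data = read_with_fallback([
    ("delta", lambda: fail("not a Delta table")),
    ("parquet", lambda: fail("not Parquet")),
    ("csv", lambda: ["row1", "row2"]),
])
print(fmt)  # csv
```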
Write to a Lakehouse
ft.write_lakehouse(
df,
lakehouse_name="SilverLakehouse",
relative_path="sales_clean",
mode="overwrite",
partition_by=["year", "month"], # optional
)
Merge (upsert) into a Delta table
ft.merge_lakehouse(
source_df=new_df,
lakehouse_name="SilverLakehouse",
relative_path="sales_clean",
merge_condition="src.id = tgt.id",
# update_set and insert_set are optional:
# omit them to update/insert all columns automatically
)
With explicit column mappings:
ft.merge_lakehouse(
source_df=new_df,
lakehouse_name="SilverLakehouse",
relative_path="sales_clean",
merge_condition="src.id = tgt.id",
update_set={"amount": "src.amount", "updated_at": "src.updated_at"},
insert_set={"id": "src.id", "amount": "src.amount", "updated_at": "src.updated_at"},
)
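When update_set and insert_set are omitted, the natural default is to map every source column onto the target verbatim. A plausible sketch of how such mappings can be derived (the helper and column names are illustrative, not fabrictools API):

```python
def default_merge_mappings(columns, src="src"):
    """Build update/insert mappings that copy every source column as-is."""
    mapping = {col: f"{src}.{col}" for col in columns}
    return mapping, dict(mapping)  # same shape for update and insert

update_set, insert_set = default_merge_mappings(["id", "amount", "updated_at"])
print(update_set)
# {'id': 'src.id', 'amount': 'src.amount', 'updated_at': 'src.updated_at'}
```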
Read from a Warehouse
df = ft.read_warehouse("MyWarehouse", "SELECT * FROM dbo.sales WHERE year = 2024")
Write to a Warehouse
ft.write_warehouse(
df,
warehouse_name="MyWarehouse",
table="dbo.sales_clean",
mode="overwrite", # or "append"
batch_size=10_000, # optional, default 10_000
)
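The batch_size parameter implies rows are sent to the Warehouse in fixed-size JDBC batches. The chunking itself is straightforward; a generic sketch, unrelated to fabrictools internals:

```python
def chunks(rows, batch_size=10_000):
    """Yield successive fixed-size slices of rows."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]

rows = list(range(25_000))
sizes = [len(batch) for batch in chunks(rows)]
print(sizes)  # [10000, 10000, 5000]
```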
API reference
Lakehouse
| Function | Description |
|---|---|
| read_lakehouse(lakehouse_name, relative_path, spark=None) | Read a dataset — auto-detects Delta / Parquet / CSV |
| write_lakehouse(df, lakehouse_name, relative_path, mode, partition_by, format, spark=None) | Write a DataFrame (default: Delta, overwrite) |
| merge_lakehouse(source_df, lakehouse_name, relative_path, merge_condition, update_set, insert_set, spark=None) | Upsert via Delta merge |
Warehouse
| Function | Description |
|---|---|
| read_warehouse(warehouse_name, query, spark=None) | Run a SQL query, return a DataFrame |
| write_warehouse(df, warehouse_name, table, mode, batch_size, spark=None) | Write to a Warehouse table via JDBC |
How path resolution works
lakehouse_name="BronzeLakehouse"
│
▼
notebookutils.lakehouse.get("BronzeLakehouse")
│
▼
lh.properties.abfsPath
= "abfss://bronze@<account>.dfs.core.windows.net"
│
▼
full_path = abfsPath + "/" + relative_path
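The final concatenation step can be reproduced in plain Python; the abfsPath value below is the example from the diagram, and the helper name is illustrative:

```python
def resolve_path(abfs_path: str, relative_path: str) -> str:
    # Join with exactly one separator, regardless of trailing/leading slashes
    return abfs_path.rstrip("/") + "/" + relative_path.lstrip("/")

full_path = resolve_path(
    "abfss://bronze@<account>.dfs.core.windows.net",
    "sales/2024",
)
print(full_path)  # abfss://bronze@<account>.dfs.core.windows.net/sales/2024
```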
Running the tests
pip install "fabrictools[dev]"
pytest
Publishing to PyPI
See docs/PYPI_PUBLISH.md for a step-by-step guide.
License
MIT