laken

Local Fabric table cache, lakehouse read/write, and deploy for modular Python on Microsoft Fabric

The missing local development workflow for Microsoft Fabric.

laken lets you develop Python code for Fabric locally, using the tools you already trust.

Write code on your machine, run it against real Fabric lakehouse data.

When you're ready, laken deploy packages your project, publishes it to Fabric, and makes it available to your Fabric notebooks.

Your code stays modular. Your notebooks stay thin. And your local workflow survives contact with the platform.

Why “laken”?

Laken, pronounced LAH-kuhn, is Dutch for “cloth.” If you're feeling generous, it's a pun on Fabric and data lakes.

Installation

Install uv if needed, then add laken to your project:

uv add laken

Or install it with pip:

pip install laken

laken deploy uses uv to build your wheel before publishing to a Fabric Environment, so uv must be available when you deploy.

Quickstart

Write lakehouse code on your laptop against real Fabric data, package it, and run the same code in a notebook.

1. Credentials — create a .env in your project root (see Environment variables for the full list):

AZURE_TENANT_ID=...
AZURE_CLIENT_ID=...
AZURE_CLIENT_SECRET=...
FABRIC_WORKSPACE_NAME=MyWorkspace
FABRIC_LAKEHOUSE_NAME=MyLakehouse
FABRIC_WORKSPACE_ID=...
FABRIC_LAKEHOUSE_ID=...

2. Develop — reads pull from Fabric and cache locally. In a Fabric notebook that same code runs against your attached lakehouse:

from laken import Lakehouse

lh = Lakehouse()
df = lh.read_table("customers", frame_type="pandas")
# ...
lh.write_table(df, "customer_analytics")

3. Package and deploy — move that code into a normal Python package and publish it to a Fabric Environment (FABRIC_ENVIRONMENT_ID in .env):

customer_analytics/
├── pyproject.toml
└── src/customer_analytics/
    └── pipeline.py

# src/customer_analytics/pipeline.py
from laken import Lakehouse


def create_analytics(lh: Lakehouse) -> None:
    df = lh.read_table("customers", frame_type="pandas")
    # ...
    lh.write_table(df, "customer_analytics")

laken deploy

4. Run in a Fabric notebook — after the publish finishes:

from laken import Lakehouse
from customer_analytics.pipeline import create_analytics

lh = Lakehouse()
create_analytics(lh)
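
Because create_analytics receives the lakehouse as an argument, it can also be exercised without Fabric by passing in a stand-in object. The following is a minimal sketch under stated assumptions: FakeLakehouse is a hypothetical test double, not part of laken, and it stores plain lists of dicts instead of DataFrames so the example runs with the standard library alone.

```python
class FakeLakehouse:
    """In-memory stand-in with read_table/write_table (hypothetical, for tests only)."""

    def __init__(self, tables):
        self.tables = {name: list(rows) for name, rows in tables.items()}

    def read_table(self, name, frame_type=None):
        # frame_type is accepted so call sites look the same; it is ignored here.
        return list(self.tables[name])

    def write_table(self, rows, name, mode=None):
        # Follows the documented behaviour: replace by default, add rows with mode="append".
        if mode == "append":
            self.tables.setdefault(name, []).extend(rows)
        else:
            self.tables[name] = list(rows)


def create_analytics(lh):
    """Toy version of the pipeline: keep active customers only."""
    rows = lh.read_table("customers", frame_type="pandas")
    active = [row for row in rows if row["active"]]
    lh.write_table(active, "customer_analytics")


lh = FakeLakehouse({"customers": [
    {"id": 1, "active": True},
    {"id": 2, "active": False},
]})
create_analytics(lh)
print(lh.tables["customer_analytics"])  # [{'id': 1, 'active': True}]
```

Keeping the lakehouse as a parameter is what makes this substitution possible; a function that constructed Lakehouse() internally would be harder to test in isolation.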

Usage

`Lakehouse`

Lakehouse() detects whether your code is running locally or in a Fabric notebook and connects accordingly. The same read_table / write_table calls work in both places:

Locally — the first read of a Fabric table copies it into a .laken/ folder on disk; later reads use that copy. Writes update only your local copy; they do not change tables in Fabric.
In a Fabric notebook — reads and writes go to your attached lakehouse.

from laken import Lakehouse

lh = Lakehouse()

Use schema.table when you need a schema (marketing.products). A bare name (products) is resolved by Fabric/Spark, usually as dbo.products on a schema-enabled lakehouse.

df = lh.read_table("products")                         # pandas locally; Spark in Fabric
df = lh.read_table("products", frame_type="spark")
df = lh.read_table("marketing.products", frame_type="polars")

lh.write_table(df, "products")
lh.write_table(df, "marketing.products", mode="append")

write_table replaces a table by default; pass mode="append" to add rows.

To use a lakehouse other than the one set in your .env or attached as the notebook default:

lh = Lakehouse(lakehouse="Sales_LH")

Fabric tables locally

The first time you read_table a Fabric table locally, laken downloads a copy into .laken/. Later reads use that copy.

write_table updates only that local copy — nothing is sent to Fabric. Run laken refresh <table> to discard local changes and download the table from Fabric again.

Tables up to 100 MB in Fabric are copied in full. Larger tables copy only the first 10,000 rows — enough to develop against without downloading the whole table. You can change both limits with max_mirror_mb and max_sample_rows on Lakehouse(...) or on a single read_table call:

lh = Lakehouse(max_mirror_mb=200, max_sample_rows=5_000)
lh.read_table("dbo.big_fact", max_mirror_mb=500)
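
The size rule above can be summarised in a few lines. This is only a sketch of the documented behaviour: plan_copy is a hypothetical helper, not a laken function, and how laken actually measures table size is an assumption here.

```python
def plan_copy(table_mb, row_count, max_mirror_mb=100, max_sample_rows=10_000):
    """Return how many rows a local copy would hold under the documented limits."""
    if table_mb <= max_mirror_mb:
        return row_count  # small enough: copied in full
    return min(row_count, max_sample_rows)  # too large: first rows only


print(plan_copy(50, 1_000_000))                      # 1000000
print(plan_copy(250, 1_000_000))                     # 10000
print(plan_copy(250, 1_000_000, max_mirror_mb=500))  # 1000000
```

The last call mirrors the per-call override shown above: raising max_mirror_mb for a single read pulls the whole table even though the default limit would have sampled it.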

CLI

laken deploy [--workspace-id <id>] [--environment-id <id>]
laken refresh <table>

laken deploy builds your project wheel from pyproject.toml, uploads it to a Fabric Environment, and starts a publish. Fabric rebuilds the environment in the background; import your package once that finishes.
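
Since the publish runs in the background, a notebook cell can check whether the package is visible before importing it. A small sketch using only the standard library; is_importable is a hypothetical helper, and customer_analytics is the example package name from the Quickstart.

```python
import importlib.util


def is_importable(package_name):
    """Return True if a top-level package can be found in the current environment."""
    return importlib.util.find_spec(package_name) is not None


if is_importable("customer_analytics"):
    print("customer_analytics is available")
else:
    print("customer_analytics not found yet; the environment publish may still be running")
```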

laken refresh <table> replaces your local copy with the current table from Fabric. Use it when Fabric has newer data or when you want to undo local write_table changes. Tables you created locally that were never copied from Fabric are left alone.
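
The local read, write, and refresh rules described in this README can be modelled with a small in-memory class. This is a toy model for reasoning about the behaviour, not laken's implementation: LocalCacheModel and its attributes are hypothetical, and plain lists stand in for tables.

```python
class LocalCacheModel:
    """Toy model of the documented local behaviour; not laken's implementation."""

    def __init__(self, fabric_tables):
        self.fabric = fabric_tables  # stands in for tables in Fabric
        self.local = {}              # stands in for the .laken/ folder
        self.mirrored = set()        # names that were copied from Fabric

    def read_table(self, name):
        if name not in self.local:
            # First read: copy the table from Fabric into the local cache.
            self.local[name] = list(self.fabric[name])
            self.mirrored.add(name)
        return list(self.local[name])

    def write_table(self, rows, name):
        # Writes touch only the local copy; Fabric is never modified.
        self.local[name] = list(rows)

    def refresh(self, name):
        # Only tables that originally came from Fabric are replaced.
        if name in self.mirrored:
            self.local[name] = list(self.fabric[name])


cache = LocalCacheModel({"customers": [1, 2, 3]})
cache.read_table("customers")        # copies from Fabric
cache.write_table([1], "customers")  # local change only
cache.write_table([9], "scratch")    # table that exists only locally
cache.refresh("customers")           # back to the Fabric contents
cache.refresh("scratch")             # left alone
```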

Environment variables

When you create a Lakehouse or run a laken command, laken loads a .env file from your project root. Variables already set in your shell or CI take precedence. Call load_environment() yourself only if you need those values earlier.
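
The precedence rule, where values already in the environment win over the .env file, can be sketched as follows. load_env_file is a hypothetical illustration written for this example; laken's own loader is load_environment(), and its parsing details (quoting, comments, and so on) may differ.

```python
import os


def load_env_file(path, environ=None):
    """Load KEY=VALUE lines without overriding variables that are already set."""
    environ = os.environ if environ is None else environ
    with open(path, encoding="utf-8") as handle:
        for raw in handle:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue  # skip blanks, comments, and malformed lines
            key, _, value = line.partition("=")
            # setdefault keeps any value that was set before the file was read.
            environ.setdefault(key.strip(), value.strip())
    return environ
```

With this rule, a CI job can export FABRIC_WORKSPACE_NAME to point at a different workspace without editing the committed or local .env.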

Variable	Description
`AZURE_TENANT_ID`	Azure AD tenant ID for your service principal
`AZURE_CLIENT_ID`	Application (client) ID of the service principal
`AZURE_CLIENT_SECRET`	Client secret for the service principal
`FABRIC_WORKSPACE_NAME`	Fabric workspace display name (required locally, together with `FABRIC_LAKEHOUSE_NAME`, `FABRIC_WORKSPACE_ID`, and `FABRIC_LAKEHOUSE_ID`)
`FABRIC_LAKEHOUSE_NAME`	Lakehouse display name to read from locally
`FABRIC_WORKSPACE_ID`	Workspace GUID for OneLake paths and deploy
`FABRIC_LAKEHOUSE_ID`	Lakehouse GUID for OneLake paths when reading locally
`FABRIC_ENVIRONMENT_ID`	Fabric Environment that `laken deploy` publishes to
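
A quick way to spot a missing value is to compare a mapping against the expected names. This is a minimal sketch: missing_variables is a hypothetical helper, not part of laken, and treating the seven variables from the Quickstart .env as required is an assumption (FABRIC_ENVIRONMENT_ID is left out because only laken deploy uses it).

```python
import os

# Names taken from the Quickstart .env; adjust if your setup differs.
REQUIRED = [
    "AZURE_TENANT_ID",
    "AZURE_CLIENT_ID",
    "AZURE_CLIENT_SECRET",
    "FABRIC_WORKSPACE_NAME",
    "FABRIC_LAKEHOUSE_NAME",
    "FABRIC_WORKSPACE_ID",
    "FABRIC_LAKEHOUSE_ID",
]


def missing_variables(env=None):
    """Return the required names that are unset or empty (hypothetical helper)."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(name)]


print(missing_variables({"AZURE_TENANT_ID": "example"}))
```

Called without arguments, the helper inspects os.environ, so it only sees values from .env after they have been loaded into the process.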

The AZURE_* values come from an Azure service principal. To find the FABRIC_* values, run the following in a Fabric notebook and copy the printed lines into your .env:

import notebookutils

context = notebookutils.runtime.context

FABRIC_WORKSPACE_NAME = context['currentWorkspaceName']
FABRIC_LAKEHOUSE_NAME = context.get('defaultLakehouseName')
FABRIC_WORKSPACE_ID = context['currentWorkspaceId']
FABRIC_LAKEHOUSE_ID = context.get('defaultLakehouseId')
FABRIC_ENVIRONMENT_ID = context.get('environmentId')

print(f"FABRIC_WORKSPACE_NAME={FABRIC_WORKSPACE_NAME}")
print(f"FABRIC_LAKEHOUSE_NAME={FABRIC_LAKEHOUSE_NAME}")
print(f"FABRIC_WORKSPACE_ID={FABRIC_WORKSPACE_ID}")
print(f"FABRIC_LAKEHOUSE_ID={FABRIC_LAKEHOUSE_ID}")
print(f"FABRIC_ENVIRONMENT_ID={FABRIC_ENVIRONMENT_ID}")

Logging

laken logs to stderr when you use Lakehouse or the CLI. The default level is INFO. To see more detail:

import logging

logging.getLogger("laken").setLevel(logging.DEBUG)
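
Because the "laken" logger is a standard library logger, the usual handler setup applies to it. The sketch below additionally writes laken's records to a file; the file name laken-debug.log and the format string are arbitrary choices for illustration, and it assumes laken does not reset the logger level afterwards.

```python
import logging

logger = logging.getLogger("laken")
logger.setLevel(logging.DEBUG)

# Send laken's records to a file in addition to any handlers already attached.
handler = logging.FileHandler("laken-debug.log", encoding="utf-8")
handler.setFormatter(
    logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
)
logger.addHandler(handler)
```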

Development

Contributions are welcome. To work on this package:

uv sync
uv run pytest
uv run ruff check

Project details

Release history

Version	Release date
0.2.6	May 26, 2026
0.2.5	May 24, 2026
0.2.4	May 24, 2026
0.2.3	May 24, 2026
0.2.2 (this version)	May 24, 2026
0.2.1	May 24, 2026
0.1.5	May 22, 2026
0.1.4	May 22, 2026
0.1.3	May 21, 2026
0.1.2	May 21, 2026
0.1.1	May 21, 2026
0.1.0	May 21, 2026
