deltabridge
Thin Python wrapper for reading Delta tables from object storage (currently Azure Blob Storage) or a local filesystem, with low and stable latency. Optimized for repeated reads from long-running Python services. A typical use case is exposing the final products of a data pipeline via a REST API, where request latency should stay predictable.
Note: The low latency comes from two things: Delta tables are loaded by the Rust-based delta-rs library, and Delta transaction logs are cached automatically and updated incrementally, so repeated reads do not re-parse the whole log.
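To illustrate the idea of incremental log caching, here is a minimal sketch using only the standard library. It is not deltabridge's implementation: the `IncrementalLogCache` class and its attributes are hypothetical, and it ignores checkpoints, metadata and protocol actions. It relies only on the Delta convention that commits are stored as zero-padded, consecutively numbered JSON files under `_delta_log`.

```python
import json
from pathlib import Path


class IncrementalLogCache:
    """Toy cache of a Delta-style transaction log (illustrative only)."""

    def __init__(self, table_path: str) -> None:
        self.log_dir = Path(table_path) / "_delta_log"
        self.version = -1                      # last commit already applied
        self.active_files: set[str] = set()    # data files in the cached snapshot
        self.commits_read = 0                  # how many commit files were parsed

    def refresh(self) -> int:
        """Apply only commits newer than the cached version; return the version."""
        while True:
            commit = self.log_dir / f"{self.version + 1:020d}.json"
            if not commit.exists():
                return self.version
            # Each non-empty line of a commit file is one JSON action.
            for line in commit.read_text().splitlines():
                if not line.strip():
                    continue
                action = json.loads(line)
                if "add" in action:
                    self.active_files.add(action["add"]["path"])
                elif "remove" in action:
                    self.active_files.discard(action["remove"]["path"])
            self.version += 1
            self.commits_read += 1
```

Calling `refresh()` a second time parses only the commit files written since the previous call, which is why a long-running service benefits more than a short-lived script.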
Installation
pip install deltabridge
Or, with uv:
uv add deltabridge
Usage
Examples
Azure
import os
import deltalake
import polars as pl
from deltabridge import PartitionFilterOperator
from deltabridge.azure import AzureDeltaClient
azure_delta_client = AzureDeltaClient()
table_client = azure_delta_client.get_table_client(
    table_uri=os.environ['MY_TABLE_STORAGE_URI'],
)
# Get a DeltaTable instance
delta_table: deltalake.DeltaTable = table_client.load_as_delta()
# Load the data as a Polars LazyFrame
table_ldf: pl.LazyFrame = table_client.load_as_polars()
# Collect to a Polars DataFrame
table_df: pl.DataFrame = table_ldf.filter(pl.col('x') > 3).collect()
# For partitioned tables, push filters down to the partition columns so that
# only matching partitions are read from storage (avoiding a full scan).
# Multiple partition filters are combined using the logical AND operator.
table_df = table_client.load_as_polars(
    partition_filter=[
        ('country', PartitionFilterOperator.IN, ['CZ', 'SK']),
        ('year', PartitionFilterOperator.EQUAL, '2024'),
    ],
).collect()
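The AND semantics of multiple partition filters can be sketched in plain Python. This is an illustration only, not how deltabridge or delta-rs evaluate filters: the `partition_matches` helper and the string operators `"in"` and `"="` are hypothetical stand-ins for `PartitionFilterOperator`. Partition values are compared as strings, matching the `'2024'` string used in the example above.

```python
from typing import Sequence, Union

# Hypothetical filter shape: (column, operator, value or list of values).
PartitionFilter = tuple[str, str, Union[str, Sequence[str]]]


def partition_matches(partition: dict[str, str], filters: Sequence[PartitionFilter]) -> bool:
    """Return True only if the partition satisfies every filter (logical AND)."""
    for column, op, value in filters:
        actual = partition[column]
        if op == "in":
            ok = actual in value
        elif op == "=":
            ok = actual == value
        else:
            raise ValueError(f"unsupported operator: {op}")
        if not ok:
            return False
    return True


# Partition values of four hypothetical partitions of the table.
partitions = [
    {"country": "CZ", "year": "2024"},
    {"country": "CZ", "year": "2023"},
    {"country": "DE", "year": "2024"},
    {"country": "SK", "year": "2024"},
]
filters = [("country", "in", ["CZ", "SK"]), ("year", "=", "2024")]

# Only partitions passing both filters would need to be read from storage.
selected = [p for p in partitions if partition_matches(p, filters)]
```

Here `selected` contains only the CZ/2024 and SK/2024 partitions; the other two are skipped without reading their data files.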
Local filesystem
import polars as pl
from deltabridge.local import LocalDeltaClient
MY_TABLE_PATH = '/tmp/my_table'
# Write a table to a local filesystem
pl.DataFrame({'x': [1, 2, 3]}).write_delta(
    target=MY_TABLE_PATH
)
local_delta_client = LocalDeltaClient()
table_client = local_delta_client.get_table_client(
    table_uri=MY_TABLE_PATH  # File path can be used as table URI
)
# Load the data as a Polars LazyFrame and collect it into a DataFrame
table_df = table_client.load_as_polars().collect()
print(table_df)
Databricks tables
If your Delta tables are managed by Databricks (Unity Catalog), they are still stored as ordinary Delta tables in object storage. Deltabridge can read them directly from the storage, so you can access them without a Databricks SQL warehouse or cluster:
- Use the table's storage location (in Azure Blob Storage) as the table URI.
- You can find it in the Databricks Catalog Explorer UI under Details of the table.
- The reading identity needs at least the Storage Blob Data Reader permission on the storage location (storage account/container).
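When granting the permission, it helps to know which storage account and container a table lives in. Databricks typically shows Azure storage locations in the form `abfss://<container>@<account>.dfs.core.windows.net/<path>`. The sketch below splits such a location into its parts; the `describe_storage_location` helper is hypothetical and not part of deltabridge, and whether deltabridge accepts this exact URI form as `table_uri` is not stated here, so check that before relying on it.

```python
from urllib.parse import urlparse


def describe_storage_location(uri: str) -> dict[str, str]:
    """Split an abfss:// storage location into container, account and path."""
    parsed = urlparse(uri)
    if parsed.scheme not in ("abfss", "abfs"):
        raise ValueError(f"not an ABFS location: {uri}")
    # The netloc has the form "<container>@<account>.dfs.core.windows.net".
    container, _, host = parsed.netloc.partition("@")
    if not container or not host:
        raise ValueError(f"missing container or account in: {uri}")
    account = host.split(".", 1)[0]
    return {
        "container": container,
        "account": account,
        "path": parsed.path.lstrip("/"),
    }


# Example with made-up names:
location = describe_storage_location(
    "abfss://data@examplestorage.dfs.core.windows.net/catalog/tables/my_table"
)
```

The resulting `account` and `container` values identify the scope on which the reading identity needs the Storage Blob Data Reader role.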
Writing to Delta tables
deltabridge is read-focused: it provides no write API, and its optimizations don't apply to writes. This is deliberate:
- write use cases are more varied and harder to abstract well - appends, overwrites, merges/upserts, schema evolution and concurrency control all behave differently
- writes are typically handled upstream by the systems that produce the tables (often Spark/PySpark pipelines)
Writing is still possible: load_as_delta() returns a deltalake.DeltaTable with deltabridge's auth already configured, which you can pass to deltalake's write API:
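import deltalake
# `table_client` is created as in the examples above;
# `df` is the data to append (for example, a PyArrow table).
deltalake.write_deltalake(table_client.load_as_delta(), df, mode='append')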
Cloud provider support
Object storage support currently covers Azure Blob Storage (plus the local filesystem).