Deltalake2DB

This is a simple project that uses metadata from the deltalake package to provide methods for reading Delta Lake tables into either Polars or DuckDB, with better protocol support than the main deltalake package.

Use with DuckDB

Install deltalake2db and duckdb using pip/poetry/whatever you use.

Then you can use it like this:

import duckdb
from deltalake import DeltaTable
from deltalake2db import get_sql_for_delta, duckdb_create_view_for_delta

with duckdb.connect() as con:
    dt = DeltaTable("tests/data/faker2")
    sql = get_sql_for_delta(dt, duck_con=con)  # get the SELECT statement
    print(sql)
    # or let it create a view for you; the view points to the data as of this point in time
    duckdb_create_view_for_delta(con, dt, "delta_table")

    con.execute("select * from delta_table").fetchall()

If you'd like to manipulate the query before running it, you can use get_sql_for_delta_expr, which returns a SQLGlot object.

Use with Polars

Install deltalake2db and polars>=1.12 using pip/poetry/whatever you use.

from deltalake import DeltaTable
from deltalake2db import polars_scan_delta

dt = DeltaTable("tests/data/faker2")
lazy_df = polars_scan_delta(dt)  # returns a LazyFrame
df = lazy_df.collect()  # materialize it

Protocol Support

  • Column Mapping
  • Almost all data types, including Structs/Lists; Map is yet to be done
  • Test data types, including datetime
  • Deletion Vectors

If a table uses an unsupported Delta Lake feature, this raises DeltaProtocolError, just as delta-rs does.

Cloud Support

For now, only az:// URLs for Azure are tested and supported in DuckDB. Polars is a lot easier, since it simply uses the object_store crate, so it should just work.

The package does some work to make DuckDB's "Azure Storage Options" work in Polars, to be able to use the same options.

This means you can:

  • pass an absolute DuckDB-style path to Polars, meaning something like abfss://⟨my_storage_account⟩.dfs.core.windows.net/⟨my_filesystem⟩/⟨path⟩
  • pass "chain" as an option, which acts like DuckDB's credential chain. This requires the azure-identity package
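
To make the DuckDB-style path layout concrete, here is a minimal sketch that splits such a URL into its storage account, filesystem, and path, and rewrites it into the filesystem@account form commonly used for ABFS URIs. This is only an illustration of the URL shapes, not the package's actual implementation; the function names and the example account and filesystem are hypothetical.

```python
from urllib.parse import urlsplit

AZURE_DFS_SUFFIX = ".dfs.core.windows.net"


def split_duckdb_abfss_path(url: str) -> tuple[str, str, str]:
    """Split abfss://<account>.dfs.core.windows.net/<filesystem>/<path>
    into (account, filesystem, path). Hypothetical helper for illustration."""
    parts = urlsplit(url)
    if parts.scheme != "abfss":
        raise ValueError(f"expected an abfss:// URL, got {url!r}")
    host = parts.netloc
    # In the DuckDB-style layout the host is just the account endpoint,
    # without a "<filesystem>@" prefix.
    if "@" in host or not host.endswith(AZURE_DFS_SUFFIX):
        raise ValueError(f"not a DuckDB-style abfss URL: {url!r}")
    account = host[: -len(AZURE_DFS_SUFFIX)]
    # The first path segment is the filesystem (container); the rest is the path.
    filesystem, _, path = parts.path.lstrip("/").partition("/")
    if not account or not filesystem:
        raise ValueError(f"missing account or filesystem in {url!r}")
    return account, filesystem, path


def to_filesystem_at_account(url: str) -> str:
    """Rewrite a DuckDB-style URL as abfss://<filesystem>@<account>.dfs.core.windows.net/<path>."""
    account, filesystem, path = split_duckdb_abfss_path(url)
    return f"abfss://{filesystem}@{account}{AZURE_DFS_SUFFIX}/{path}"


if __name__ == "__main__":
    example = "abfss://myaccount.dfs.core.windows.net/myfs/tables/faker2"
    print(split_duckdb_abfss_path(example))
    print(to_filesystem_at_account(example))
```

The helper rejects URLs that already carry a filesystem@ prefix, so it cannot be applied twice to the same path by accident.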

