A utility to export DuckLake database metadata to Delta Lake transaction logs.

DuckLake Delta Exporter

A Python utility that synchronizes metadata from a DuckLake database with Delta Lake transaction logs. You can manage data in DuckLake and still make it discoverable and queryable by Delta Lake-compatible tools (e.g., Spark, Delta Lake Rust/Python clients).

Features

DuckLake to Delta Sync: Generates incremental Delta Lake transaction logs (_delta_log/*.json) and checkpoint files (_delta_log/*.checkpoint.parquet) based on the latest state of tables in a DuckLake database.

Schema Mapping: Automatically maps DuckDB data types to their Spark SQL equivalents for Delta Lake schema definitions.

Change Detection: Identifies added and removed data files since the last Delta export, ensuring only necessary updates are written to the log.

Checkpointing: Supports creating Delta Lake checkpoint files at a configurable interval for efficient state reconstruction.
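
The features above can be illustrated with a small, standalone sketch. It is not the package's actual implementation: the type table is a partial, hypothetical mapping, the helper names (map_type, diff_files, should_checkpoint) are invented for illustration, and the "checkpoint when the version is a multiple of the interval" rule is an assumption. The file naming follows the Delta Lake convention of 20-digit, zero-padded versions.

```python
from typing import List, Set, Tuple

# Hypothetical, partial DuckDB -> Delta (Spark SQL) type table, for illustration only.
DUCKDB_TO_SPARK = {
    "BOOLEAN": "boolean",
    "TINYINT": "byte",
    "SMALLINT": "short",
    "INTEGER": "integer",
    "BIGINT": "long",
    "FLOAT": "float",
    "DOUBLE": "double",
    "VARCHAR": "string",
    "DATE": "date",
    "TIMESTAMP": "timestamp",
    "BLOB": "binary",
}


def map_type(duckdb_type: str) -> str:
    """Map a DuckDB type name to a Delta schema type (assumed fallback: string)."""
    return DUCKDB_TO_SPARK.get(duckdb_type.strip().upper(), "string")


def diff_files(previous: Set[str], current: Set[str]) -> Tuple[List[str], List[str]]:
    """Return (added, removed) data files between two exports, each sorted."""
    added = sorted(current - previous)
    removed = sorted(previous - current)
    return added, removed


def log_file_name(version: int) -> str:
    """Delta commit files are named with a 20-digit, zero-padded version."""
    return f"{version:020d}.json"


def checkpoint_file_name(version: int) -> str:
    """Single-file checkpoints use the same zero-padded version prefix."""
    return f"{version:020d}.checkpoint.parquet"


def should_checkpoint(version: int, interval: int) -> bool:
    """Assumed rule: checkpoint whenever the version is a multiple of the interval."""
    return interval > 0 and version % interval == 0


if __name__ == "__main__":
    added, removed = diff_files({"a.parquet", "b.parquet"}, {"b.parquet", "c.parquet"})
    print("add:", added, "remove:", removed)
    print(log_file_name(3), should_checkpoint(3, 1))
```

Only the files in the added and removed lists would need new "add" and "remove" actions in the next commit; if both lists are empty, there is nothing to write.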

Installation

You can install this package using pip:

pip install ducklake-delta-exporter

Usage

from ducklake_delta_exporter import generate_latest_delta_log

# Export the latest state of the DuckLake tables to Delta Lake transaction logs.
generate_latest_delta_log('path/to/your/ducklake.db', data_root='/lakehouse/default/Tables', checkpoint_interval=1)
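
After an export, it can be useful to check which Delta versions exist for a table. The helper below is a minimal sketch using only the standard library. It is not part of ducklake_delta_exporter: the name list_delta_versions is hypothetical, and it assumes the standard Delta layout in which commit files live in a _delta_log directory inside the table directory.

```python
import re
from pathlib import Path
from typing import List

# Delta commit files: 20 zero-padded digits followed by ".json".
_COMMIT_RE = re.compile(r"^(\d{20})\.json$")


def list_delta_versions(table_dir: str) -> List[int]:
    """Return the sorted commit versions found in <table_dir>/_delta_log."""
    log_dir = Path(table_dir) / "_delta_log"
    if not log_dir.is_dir():
        return []  # no log yet, e.g. the table has not been exported
    versions = []
    for entry in log_dir.iterdir():
        match = _COMMIT_RE.match(entry.name)
        if match:  # skips checkpoints and other files such as _last_checkpoint
            versions.append(int(match.group(1)))
    return sorted(versions)
```

Pass it the directory of a single exported table. Where that directory sits relative to data_root depends on how the exporter lays out tables, which this sketch does not assume.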

