
duckbricks-utils

DuckLake connection utilities for DuckBricks notebooks and pipelines.

Provides a consistent interface for connecting to a DuckLake catalog backed by PostgreSQL, from any environment — local IDE, Marimo notebooks, or job executors. Supports multiple storage backends (local, S3, MinIO, Cloudflare R2, GCS, Azure Blob).

Installation

pip install duckbricks-utils

Quick start

from duckbricks_utils import connect

conn = connect()
result = conn.execute("SELECT * FROM my_table LIMIT 10").df()

Configuration

All settings are read from environment variables (a .env file is supported via python-dotenv).

PostgreSQL catalog

Variable                   Default     Description
DUCKLAKE_PG_HOST           localhost   PostgreSQL host
DUCKLAKE_PG_PORT           5432        PostgreSQL port
DUCKLAKE_PG_DATABASE       duckbricks  PostgreSQL database name
DUCKLAKE_PG_USER           duckbricks  PostgreSQL user
DUCKLAKE_PG_PASSWORD       duckbricks  PostgreSQL password
DUCKBRICKS_DUCKLAKE_NAME   duckbricks  DuckLake catalog name
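For local development these can be kept in a .env file, which python-dotenv picks up automatically. A sketch with the default values (adjust for your own PostgreSQL instance):

```shell
# .env — DuckLake catalog connection (values shown are the documented defaults)
DUCKLAKE_PG_HOST=localhost
DUCKLAKE_PG_PORT=5432
DUCKLAKE_PG_DATABASE=duckbricks
DUCKLAKE_PG_USER=duckbricks
DUCKLAKE_PG_PASSWORD=duckbricks
DUCKBRICKS_DUCKLAKE_NAME=duckbricks
```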

Storage backend

Set DUCKBRICKS_STORAGE_BACKEND to one of the supported values (default: local).

Backend                DUCKBRICKS_STORAGE_BACKEND
Local filesystem       local
Amazon S3              s3
MinIO                  minio
Cloudflare R2          r2
Google Cloud Storage   gcs
Azure Blob Storage     azure

Each backend reads its own credentials from environment variables (e.g. AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY for S3).
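For example, to target Amazon S3 you would select the backend and supply the standard AWS credential variables (any further backend-specific settings, such as bucket or region, follow that backend's own conventions):

```shell
# Example: select the S3 storage backend
DUCKBRICKS_STORAGE_BACKEND=s3
AWS_ACCESS_KEY_ID=...        # standard AWS credential variable
AWS_SECRET_ACCESS_KEY=...    # standard AWS credential variable
```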

Local backend

Variable               Default         Description
DUCKBRICKS_DATA_PATH   /data/parquet/  Parquet storage path

API reference

from duckbricks_utils import connect, catalog_name, data_path
from duckbricks_utils import StorageBackend, StorageBackendFactory

# Open a DuckDB connection with the DuckLake catalog attached
conn = connect()

# Open a connection with a custom data path override
conn = connect(override_data_path="/tmp/my_data/")

# Get the configured catalog name
name = catalog_name()

# Get the active backend's data path
path = data_path()

# Resolve the backend from environment
backend = StorageBackendFactory.from_env()

# List all supported backend names
backends = StorageBackendFactory.supported_backends()
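StorageBackendFactory.from_env() resolves the backend purely from DUCKBRICKS_STORAGE_BACKEND. A minimal stdlib sketch of that resolution logic (resolve_backend and SUPPORTED_BACKENDS here are illustrative names, not part of the package's API):

```python
import os

# The six backend names documented in the table above.
SUPPORTED_BACKENDS = {"local", "s3", "minio", "r2", "gcs", "azure"}

def resolve_backend(env=os.environ):
    """Return the backend name from DUCKBRICKS_STORAGE_BACKEND, defaulting to 'local'."""
    name = env.get("DUCKBRICKS_STORAGE_BACKEND", "local").lower()
    if name not in SUPPORTED_BACKENDS:
        raise ValueError(f"unsupported storage backend: {name!r}")
    return name

print(resolve_backend({}))                                    # local
print(resolve_backend({"DUCKBRICKS_STORAGE_BACKEND": "s3"}))  # s3
```

An unknown value fails fast with a ValueError rather than silently falling back to local storage.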

Requirements

  • Python ≥ 3.11
  • DuckDB ≥ 1.3.0
