Dunky

A Jupyter Kernel for DuckDB with Unity Catalog.

Description

Dunky is a Jupyter kernel that allows you to run DuckDB queries with Unity Catalog integration directly from your Jupyter notebooks.

I created Dunky because existing solutions such as jupysql require you to use magics, load the uc_catalog and delta extensions yourself, and manage secrets by hand, and they don't work well with DuckDB's uc_catalog extension.

Features

  • Run DuckDB queries in Jupyter notebooks
  • Unity Catalog integration
  • No need to use magics
  • Nice output formatting
  • No need to load uc_catalog or delta, or to manage secrets
  • Create a Unity Catalog Delta table with CREATE EXTERNAL TABLE [table_name] LOCATION [location] OPTIONS [options]

Installation

To install Dunky, run:

pip install dunky

Configure Unity Catalog

You can set the following environment variables to configure Unity Catalog:

  • UC_ENDPOINT: The endpoint of the Unity Catalog server.
  • UC_TOKEN: The token to authenticate with the Unity Catalog server.
  • UC_AWS_REGION: The AWS region to use for the Unity Catalog server.

These settings default to localhost:8080/api/2.1/unity-catalog, not-used, and eu-west-1 respectively.
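
As a quick way to see which values would apply in your environment, here is a minimal sketch that resolves the three variables against the defaults listed above. The helper name resolve_uc_settings is hypothetical and not part of Dunky; how Dunky reads these variables internally is not shown in this README.

```python
import os

# Defaults as stated in this README.
UC_DEFAULTS = {
    "UC_ENDPOINT": "localhost:8080/api/2.1/unity-catalog",
    "UC_TOKEN": "not-used",
    "UC_AWS_REGION": "eu-west-1",
}


def resolve_uc_settings(env=None):
    """Return the three Unity Catalog settings, falling back to the defaults.

    Hypothetical helper for illustration only; not Dunky's own code.
    """
    env = os.environ if env is None else env
    return {name: env.get(name, default) for name, default in UC_DEFAULTS.items()}


if __name__ == "__main__":
    for name, value in resolve_uc_settings().items():
        # Avoid echoing a real token to the terminal.
        shown = "<set>" if name == "UC_TOKEN" and value != "not-used" else value
        print(f"{name}={shown}")
```

Environment variables are inherited by child processes, so they generally need to be set before Jupyter is started for the kernel to see them.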

Usage

After installing, you can start using the Dunky kernel in your Jupyter notebooks. Select the "Dunky" kernel from the kernel selection menu.

You can directly query DuckDB tables and use Unity Catalog features in your notebooks. You don't need to set up a connection or manage credentials, as Dunky handles all of that for you.
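
To make concrete what "handles all of that for you" saves, the sketch below builds the kind of setup statements you would otherwise run by hand in plain DuckDB. This is an assumption based on the usage pattern published for DuckDB's uc_catalog extension, not on Dunky's source: the exact statements Dunky issues may differ, the INSTALL source can vary by DuckDB version, and manual_setup_statements is a hypothetical helper.

```python
def manual_setup_statements(endpoint, token, aws_region):
    """Build the SQL one might run manually before ATTACH (illustrative only).

    Assumption: follows the uc_catalog extension's documented pattern;
    this is not taken from Dunky itself.
    """

    def quote(value):
        # Escape single quotes so the value is a valid SQL string literal.
        return "'" + str(value).replace("'", "''") + "'"

    return [
        "INSTALL uc_catalog;",
        "INSTALL delta;",
        "LOAD delta;",
        "LOAD uc_catalog;",
        "CREATE SECRET (TYPE UC, TOKEN {}, ENDPOINT {}, AWS_REGION {});".format(
            quote(token), quote(endpoint), quote(aws_region)
        ),
    ]


if __name__ == "__main__":
    # Uses the default values listed in the configuration section.
    for stmt in manual_setup_statements(
        "localhost:8080/api/2.1/unity-catalog", "not-used", "eu-west-1"
    ):
        print(stmt)
```

With Dunky, none of these statements should be needed in the notebook; you go straight to ATTACH.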

Start by attaching your catalog with the ATTACH DATABASE command, for example:

ATTACH DATABASE 'unity' AS unity (TYPE UC_CATALOG);

After attaching, just start writing your queries and enjoy the power of DuckDB with Unity Catalog integration!

S3 Integration

Dunky supports AWS S3 integration with Unity Catalog.

  • Prerequisite: make sure the Unity Catalog server has S3 bucket authentication configured
  • Writing to S3: in the CREATE EXTERNAL TABLE statement, set the location to s3://your-bucket-name
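
If you generate table locations programmatically, a small check can catch malformed values before they reach a CREATE EXTERNAL TABLE statement. The sketch below is a hypothetical helper (split_s3_location is not part of Dunky) that only validates the s3://bucket[/prefix] shape; it does not check that the bucket exists or that Unity Catalog can access it.

```python
from urllib.parse import urlparse


def split_s3_location(location):
    """Split 's3://bucket/prefix' into (bucket, prefix).

    Raises ValueError if the location is not an s3:// URL with a bucket name.
    Hypothetical helper for illustration only.
    """
    parsed = urlparse(location)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"expected an s3://bucket[/prefix] location, got {location!r}")
    # Drop the leading slash so a bare bucket yields an empty prefix.
    return parsed.netloc, parsed.path.lstrip("/")


if __name__ == "__main__":
    print(split_s3_location("s3://your-bucket-name"))
```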

P.S. Dunky might also work with GCP and Azure, but I have not tested this. It depends on whether Unity Catalog and DuckDB's uc_catalog extension support them. I've seen some people confirm that Unity Catalog and DuckDB can work with Azure and GCP.

Example docker

In the docker folder, you can find an example of how to run JupyterLab with Dunky and Unity Catalog in Docker containers. To run the example, execute:

cd docker
docker compose up --build -d

The token/password is dunky.

If it is not already selected, you can find the Dunky kernel in the kernel list.

Remarks

  • This kernel is still in development and may have some bugs.
  • This extension works well together with the junity extension.

Issues?

If you encounter any issues, please open an issue on the GitHub repository.
