Dunky
A Jupyter Kernel for DuckDB with Unity Catalog.
Description
Dunky is a Jupyter kernel that allows you to run DuckDB queries with Unity Catalog integration directly from your Jupyter notebooks.
I created this kernel because existing solutions such as jupysql require you to use magics, load the uc_catalog and delta extensions, and manage secrets yourself, and they don't work well with DuckDB's uc_catalog extension.
Features
- Run DuckDB queries in Jupyter notebooks
- Unity Catalog integration
- No need to use magics
- Nice output formatting
- No need to load uc_catalog and delta or manage secrets
- Support for CREATE EXTERNAL TABLE [table_name] LOCATION [location] OPTIONS [options] to create a Unity Catalog Delta table
Installation
To install Dunky, run:
pip install dunky
Configure Unity Catalog
You can set the following environment variables to configure Unity Catalog:
- UC_ENDPOINT: The endpoint of the Unity Catalog server.
- UC_TOKEN: The token to authenticate with the Unity Catalog server.
- UC_AWS_REGION: The AWS region to use for the Unity Catalog server.
These settings default to localhost:8080/api/2.1/unity-catalog, not-used, and eu-west-1 respectively.
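As a rough sketch of how these variables and their defaults resolve, the snippet below reads them in Python. The names `UC_DEFAULTS` and `resolve_uc_settings` are hypothetical and exist only for illustration; they are not part of Dunky's API, and Dunky's own lookup may differ in detail.

```python
import os

# Defaults as listed in this README. The dictionary and function names
# are hypothetical helpers for illustration, not Dunky internals.
UC_DEFAULTS = {
    "UC_ENDPOINT": "localhost:8080/api/2.1/unity-catalog",
    "UC_TOKEN": "not-used",
    "UC_AWS_REGION": "eu-west-1",
}


def resolve_uc_settings(environ=None):
    """Return the effective settings, falling back to the defaults above."""
    env = os.environ if environ is None else environ
    return {name: env.get(name, default) for name, default in UC_DEFAULTS.items()}


if __name__ == "__main__":
    # Show which values a process started from this shell would see.
    for name, value in resolve_uc_settings().items():
        print(f"{name}={value}")
```

Running this from the same shell that launches Jupyter is a quick way to check which values are visible, since a kernel normally inherits its environment from the process that starts it.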
Usage
After installing, you can start using the Dunky kernel in your Jupyter notebooks. Select the "Dunky" kernel from the kernel selection menu.
You can directly query DuckDB tables and use Unity Catalog features in your notebooks. You don't need to set up a connection or manage credentials, as Dunky handles all of that for you.
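For context, the manual setup that a plain DuckDB session would typically need can be sketched as follows. The statements follow the documented usage of DuckDB's uc_catalog extension as far as I know it; the exact INSTALL source can differ between DuckDB versions, and the helper name `manual_setup_statements` is hypothetical. This is not a description of Dunky's internals, only of the steps it saves you from typing.

```python
def manual_setup_statements(endpoint, token, aws_region, catalog="unity"):
    """Build the SQL a plain DuckDB session would need (illustrative only)."""

    def quote(value):
        # Escape single quotes so the value is a valid SQL string literal.
        return "'" + value.replace("'", "''") + "'"

    return [
        # Assumption: some DuckDB versions need an explicit repository here.
        "INSTALL uc_catalog;",
        "INSTALL delta;",
        "LOAD delta;",
        "LOAD uc_catalog;",
        "CREATE SECRET (TYPE UC, TOKEN {}, ENDPOINT {}, AWS_REGION {});".format(
            quote(token), quote(endpoint), quote(aws_region)
        ),
        "ATTACH DATABASE {} AS {} (TYPE UC_CATALOG);".format(quote(catalog), catalog),
    ]


if __name__ == "__main__":
    statements = manual_setup_statements(
        "localhost:8080/api/2.1/unity-catalog", "not-used", "eu-west-1"
    )
    print("\n".join(statements))
```

With Dunky, only the final ATTACH statement is something you write yourself.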
Start by attaching your database with the ATTACH DATABASE command, e.g.:
ATTACH DATABASE 'unity' AS unity (TYPE UC_CATALOG);
After attaching, just start writing your queries and enjoy the power of DuckDB with Unity Catalog integration!
S3 Integration
Dunky supports AWS S3 integration with Unity Catalog.
- Prerequisite: make sure the Unity Catalog server has S3 bucket authentication configured.
- Writing to S3: in the CREATE EXTERNAL TABLE statement, set the location to s3://your-bucket-name.
P.S. Dunky might also work with GCP and Azure, but I have not tested this. It depends on whether Unity Catalog and DuckDB's uc_catalog extension support them. I've seen some people confirm that Unity Catalog and DuckDB can work with Azure and GCP.
Example docker
In the docker folder, you can find an example of how to run JupyterLab with Dunky and Unity Catalog in Docker containers.
To run the example, execute:
cd docker
docker compose up --build -d
Remarks
- This kernel is still in development and may have some bugs.
- This extension works well together with the junity extension.
Issues?
If you encounter any issues, please open an issue on the GitHub repository.