An SDK for syncing Databricks tables to Snowflake using Unity Catalog and Uniform
# Databricks to Snowflake Table Mirroring

This repository provides a utility to synchronize (mirror) Iceberg table metadata from Databricks Unity Catalog to Snowflake Horizon. It automates the creation of:

- Snowflake Catalog Integrations
- External Iceberg Tables

Note: This library uses credential vending to access cloud storage, so Snowflake External Volumes are not required.
## Table of Contents

- Overview
- Snowflake Setup
- Databricks Setup
- How to Use
- Configuration
- Parameter Reference
- Example Usage
- Limitations
## Overview

This utility automates the following tasks:

- Retrieves Iceberg metadata from Unity Catalog
- Generates Delta-based metadata tables in Databricks
- Creates Catalog Integrations in Snowflake
- Creates External Iceberg Tables in Snowflake
## Snowflake Setup

This utility supports two usage patterns:

- Manual: Generate DDLs for execution in Snowflake
- Automated: Create Snowflake assets directly from Databricks

Required Snowflake permissions:

- Create Catalog Integrations
- Create External Iceberg Tables
## Databricks Setup

Install the library:

```shell
pip install databricks_uniform_sync
```

Initialize the class:

```python
from databricks_uniform_sync import DatabricksToSnowflakeMirror

d2s = DatabricksToSnowflakeMirror(
    spark_session=spark,
    dbx_workspace_url="https://dbcxyz.databricks.cloud.net",
    dbx_workspace_pat="dapi...",
    metadata_catalog="dbx_sf_mirror_catalog",
    metadata_schema="dbx_sf_mirror_schema"
)
```
## How to Use

### 1. Create or Refresh Metadata Tables

```python
d2s.create_metadata_tables()
d2s.refresh_metadata_tables(catalog="your_catalog")
```

These methods are idempotent and safe to rerun. If the metadata tables do not exist, refresh_metadata_tables() will create them.
### 2. Add Unity Catalog Discovery Tags

```python
d2s.refresh_uc_metadata_tags()
```

These tags are used to determine sync eligibility. Do not remove them.
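To verify which tables have been tagged, one option is to query the Unity Catalog information schema from a Databricks notebook. This is a sketch only: it assumes an active SparkSession, access to the `system.information_schema.table_tags` view, and a catalog named `your_catalog`; the exact tag names the library applies are not specified here.

```python
# Illustrative: list tag assignments on tables in a catalog
# (requires a Databricks SparkSession and read access to the
# Unity Catalog information schema).
tags_df = spark.sql("""
    SELECT catalog_name, schema_name, table_name, tag_name, tag_value
    FROM system.information_schema.table_tags
    WHERE catalog_name = 'your_catalog'
""")
tags_df.show(truncate=False)
```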
### 3. Create Snowflake Catalog Integrations

Dry run (SQL only):

```python
d2s.generate_create_sf_catalog_integrations_sql(
    oauth_client_id="client-id",
    oauth_client_secret="client-secret"
)
```

Execute directly:

```python
d2s.create_sf_catalog_integrations(
    sf_account_id="xyz-123",
    sf_user="svc_name",
    sf_private_key_file="rsa/rsa_key.p8",
    sf_private_key_file_pwd="your-password",
    oauth_client_id="client-id",
    oauth_client_secret="client-secret"
)
```
### 4. Create Iceberg Tables in Snowflake

Dry run (SQL only):

```python
d2s.generate_create_sf_iceberg_tables_sql()
```

Execute directly:

```python
d2s.create_sf_iceberg_tables_sql(
    sf_account_id="xyz-123",
    sf_user="svc_name",
    sf_private_key_file="rsa/rsa_key.p8",
    sf_private_key_file_pwd="your-password"
)
```
## Configuration

### Custom Metadata Table Name

```python
d2s = DatabricksToSnowflakeMirror(
    spark_session,
    dbx_workspace_url,
    dbx_workspace_pat,
    metadata_catalog,
    metadata_schema,
    metadata_table_name="custom_table_name"
)
```

A corresponding view will also be created with a `_vw` suffix.
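As a small illustration of the naming convention (the variable names below are ours, not part of the library's API), the view name is the table name with `_vw` appended:

```python
# The metadata view name is derived by appending "_vw"
# to the configured metadata table name.
metadata_table_name = "custom_table_name"
metadata_view_name = f"{metadata_table_name}_vw"
print(metadata_view_name)  # custom_table_name_vw
```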
### Custom Refresh Interval

```python
d2s.create_sf_catalog_integrations(
    ...,
    refresh_interval_seconds=120
)
```

### Disable Auto-Refresh on Iceberg Tables

```python
d2s.create_sf_iceberg_tables_sql(
    ...,
    auto_refresh=False
)
```
## Parameter Reference

### Databricks Parameters

| Parameter | Description |
|---|---|
| spark_session | Active SparkSession in Databricks |
| dbx_workspace_url | URL of your Databricks workspace |
| dbx_workspace_pat | Personal Access Token for authentication |
| metadata_catalog | Unity Catalog catalog to store metadata |
| metadata_schema | Unity Catalog schema to store metadata |
| metadata_table_name (optional) | Custom name for the metadata table |
### Snowflake Parameters

| Parameter | Description |
|---|---|
| sf_account_id | Snowflake account identifier |
| sf_user | Snowflake user/service account |
| sf_private_key_file | Path to the RSA private key |
| sf_private_key_file_pwd | Password to decrypt the RSA key |
| oauth_client_id | Databricks OAuth client ID |
| oauth_client_secret | Databricks OAuth client secret |
| refresh_interval_seconds (optional) | Catalog Integration refresh interval |
| auto_refresh (optional) | Enable/disable automatic refresh on tables |
## Example Usage

Coming soon. A demo notebook or script will be added to show end-to-end execution.
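In the meantime, the steps above can be sketched end to end. This is illustrative only: the workspace URL, credentials, key path, and catalog name are placeholders from the examples above, and the script must run inside a Databricks notebook where `spark` is available.

```python
from databricks_uniform_sync import DatabricksToSnowflakeMirror

# Placeholder credentials; in practice, load secrets from a secret manager
# rather than hard-coding them.
d2s = DatabricksToSnowflakeMirror(
    spark_session=spark,  # active SparkSession in Databricks
    dbx_workspace_url="https://dbcxyz.databricks.cloud.net",
    dbx_workspace_pat="dapi...",
    metadata_catalog="dbx_sf_mirror_catalog",
    metadata_schema="dbx_sf_mirror_schema",
)

# 1. Create/refresh the metadata tables (idempotent; creates them on first run).
d2s.refresh_metadata_tables(catalog="your_catalog")

# 2. Apply Unity Catalog discovery tags.
d2s.refresh_uc_metadata_tags()

# 3. Create Snowflake Catalog Integrations.
d2s.create_sf_catalog_integrations(
    sf_account_id="xyz-123",
    sf_user="svc_name",
    sf_private_key_file="rsa/rsa_key.p8",
    sf_private_key_file_pwd="your-password",
    oauth_client_id="client-id",
    oauth_client_secret="client-secret",
)

# 4. Create External Iceberg Tables in Snowflake.
d2s.create_sf_iceberg_tables_sql(
    sf_account_id="xyz-123",
    sf_user="svc_name",
    sf_private_key_file="rsa/rsa_key.p8",
    sf_private_key_file_pwd="your-password",
)
```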
## Limitations

- Only supports Iceberg tables on S3
- Deleting tables in Unity Catalog does not remove them from Snowflake
- Only supports RSA key-pair authentication (for Snowflake MFA compliance)