databricks-connect compatibility & widgets for Marimo Notebooks
marimo-databricks-connect
This package provides compatibility and widgets for marimo notebooks and Databricks. The goal is to build notebooks that combine code (both Python and SQL), visualizations, and widgets into "command center" style, one-stop-shop UIs that can monitor, triage, troubleshoot, and control our Databricks projects.
- Connect to Databricks using databricks-connect & Spark (not a SQL warehouse)
- Authenticate/configure Spark using the default databricks-connect process (env vars, .databrickscfg, etc.)
- Execute both Python & SQL cells
- Autocomplete catalog/schema/table/column names
- Browse catalogs/schemas/tables/columns in the marimo data sources view
- Browse external locations, volumes, DBFS, and the workspace in the marimo storage browser
- Notebook widgets to monitor and control specific instances of Databricks capabilities (clusters, workflows, vector search, apps, etc.)
- Widgets to browse & explore Databricks capabilities (compute, workflows, Unity Catalog)
Why Marimo?
We already have Databricks notebooks, Jupyter, and plain Python. Why should you try Marimo? Because it checks all the boxes:
| Code/Format | Easy Merges | OSS Editor | Visualizations | Runs in Normal Python | REPL | Custom Widgets |
|---|---|---|---|---|---|---|
| Python | ✅ | ✅ | ❌ | ✅ | ❌ | ❌ |
| Databricks Notebook | ✅ | ❌ | ✅ | ❌ (ignores magic and sql) | ✅ | ❌ |
| Jupyter | ❌ | ✅ | ✅ | ❌ | ✅ | ✅ |
| Marimo | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
Unfortunately, out of the box, Marimo's Databricks support (especially for databricks-connect) isn't great. This package aims to enable all of the cool Marimo features for Databricks:
- PySpark
- DataFrames
- Streaming
- SQL
Quickstart
Authenticate once on your machine:
```shell
az login
# or
databricks configure
```
Start Marimo (or use the VS Code extension):
```shell
marimo edit mynotebook.py
# or
marimo new
```
Then in any notebook in this folder:
```python
import marimo as mo
from marimo_databricks_connect import (
    dbfs, dbutils, external_location, spark, workspace,
    exclude_catalogs, include_catalogs, show_all_catalogs,
    workflows_widget, compute_widget, unity_catalog_widget
)
```
That single import gives you:

- `spark` — a `DatabricksSession` on serverless compute (OAuth, no host/token config).
- `dbutils` — bound to that session.
- `external_location` — add external locations to browse in the UI.
- `include_catalogs` / `exclude_catalogs` — show/hide catalogs in the data sources UI.
- `dbfs` — an fsspec filesystem rooted at `/Volumes` that powers the marimo storage browser via Unity Catalog (no direct ADLS access); see the sketch after this list.
- `workspace` — a filesystem browser for the workspace.
- A registered `SparkConnectEngine`, so marimo's data sources panel browses catalogs / schemas / tables, and SQL cells run on Spark when you pass `engine=spark`:

  ```python
  mo.sql("SELECT * FROM samples.nyctaxi.trips LIMIT 100", engine=spark)
  ```

- SQL autocomplete — the engine feeds marimo's in-cell SQL completion with catalogs, schemas, tables, and columns. Discovery is done in bulk via `<catalog>.information_schema` (one query per catalog instead of N `SHOW`/`DESCRIBE` round trips) and cached in-process. Call `prefetch()` at the top of a notebook to warm the cache eagerly so suggestions appear on the first keystroke:

  ```python
  from marimo_databricks_connect import include_catalogs, prefetch, refresh_metadata

  include_catalogs("main", "samples")  # narrow scope (also makes columns eager)
  prefetch()                           # populate cache for everything visible
  # refresh_metadata("main")           # drop cache after schema changes
  ```

- Streaming DataFrame support — streaming DataFrames (from `spark.readStream`) are automatically rendered with their schema and a helpful status message instead of silently failing.
- StreamingQuery display — streaming queries (from `.writeStream.start()`) render a live status card with query name, ID, active state, progress metrics, and any exceptions.
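Since `dbfs` is an fsspec filesystem, you can also use it programmatically. A minimal sketch, assuming standard fsspec semantics; the volume path is a hypothetical example, not one shipped with the package:

```python
from marimo_databricks_connect import dbfs

# `dbfs` follows the fsspec filesystem interface; substitute a Unity Catalog
# volume path you actually have access to.
files = dbfs.ls("/Volumes/main/default/landing")   # list the contents of a UC volume
with dbfs.open("/Volumes/main/default/landing/sample.csv", "rb") as f:
    head = f.read(1024)                            # read the first 1 KB of a file
```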
Streaming DataFrames
Streaming DataFrames (spark.readStream) cannot be collected or displayed as
tables. This package automatically detects them and renders a schema summary
with column names and types:
```python
stream = spark.readStream.table("catalog.schema.my_table")
stream  # displays schema + STREAMING badge instead of an empty cell
```
Streaming queries (returned by .writeStream.start()) are also rendered with a
status card showing the query name, ID, active/stopped state, progress metrics
(batch ID, input rows, rows/sec), source and sink info, and any exceptions:
```python
query = (
    stream.writeStream
    .format("memory")
    .trigger(availableNow=True)
    .queryName("preview")
    .start()
)
query  # displays status card with ACTIVE/STOPPED badge + progress
```
To preview actual data from a streaming source, write to a memory sink and read the results:
```python
query.awaitTermination()  # wait for availableNow trigger to finish
spark.table("preview")    # now displays as a normal table
```
Browsing UC external locations
Add a cell to expose another root in the storage browser:
```python
from marimo_databricks_connect import external_location

landing = external_location("finops_landing")                        # by UC name
raw = external_location("abfss://c@acct.dfs.core.windows.net/data")  # by path
```
Each variable shows up as its own tree in the storage panel.
Filtering the data sources panel (catalogs / schemas)
With 1000+ UC catalogs the panel becomes unusable. By default only the
current catalog (SELECT current_catalog()) is surfaced. Add catalogs (or
specific schemas) explicitly with fnmatch globs:
```python
from marimo_databricks_connect import (
    include_catalogs, exclude_catalogs, show_all_catalogs, reset_catalog_filter,
)

include_catalogs("main", "samples")                # exact names
include_catalogs("dev_*", "*_prod")                # globs
include_catalogs("main.bronze_*", "*_dev.silver")  # narrow to specific schemas
exclude_catalogs("system", "__databricks_*")       # always wins over includes
show_all_catalogs()                                # opt out of the allow-list
reset_catalog_filter()                             # back to defaults
```
Filtering only affects the data sources panel — mo.sql(..., engine=spark)
and spark.sql(...) can still query any catalog you have UC permission for.
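For example, hiding the `system` catalog from the panel does not stop you from querying it. A small sketch, assuming your workspace exposes the standard `system.information_schema` views:

```python
import marimo as mo
from marimo_databricks_connect import exclude_catalogs, spark

exclude_catalogs("system")  # hide it from the data sources panel

# SQL cells (and spark.sql) can still reach it, given UC permissions:
mo.sql(
    "SELECT table_catalog, table_schema, table_name "
    "FROM system.information_schema.tables LIMIT 10",
    engine=spark,
)
```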
Persistent defaults
Set once per project in pyproject.toml:
```toml
[tool.marimo_databricks_connect]
include_catalogs = ["main", "dev_*"]
exclude_catalogs = ["system", "__databricks_internal"]
# show_all_catalogs = true
```
…or per shell with environment variables (these override pyproject.toml):
```shell
export MARIMO_DBC_INCLUDE_CATALOGS="main,dev_*"
export MARIMO_DBC_EXCLUDE_CATALOGS="system"
export MARIMO_DBC_SHOW_ALL_CATALOGS=1
```
Resource Specific Widgets
Databricks Apps
Cluster
Job
Schema
Genie
Chat with a Databricks AI/BI Genie space — ask natural-language questions, get back text answers and generated SQL, run the queries inline, and follow suggested next questions. Browse and resume past conversations.
```python
from marimo_databricks_connect import genie_widget

widget = genie_widget("01ef...space_id...")
widget
```
Serving Endpoint
Table
Vector Index
Vector Search
Warehouse
Selector widgets (mdc.ui.*)
First-class mo.ui-style selectors for every Databricks resource. Each one is
a searchable dropdown whose .value traitlet plugs straight into marimo's
reactive graph — picking a different option re-runs every cell that reads it,
just like mo.ui.dropdown:
```python
import marimo as mo
import marimo_databricks_connect as mdc

catalog = mdc.ui.catalog()
schema = mdc.ui.schema(catalog=catalog)  # auto-refreshes when catalog changes
table = mdc.ui.table(schema=schema)
column = mdc.ui.column(table=table)

mo.hstack([catalog, schema, table, column])
```
Then in any downstream cell:
```python
spark.table(table.value).select(column.value).limit(20)
```
Available selectors (all under `mdc.ui`):

| Factory | `value` is... |
|---|---|
| `mdc.ui.catalog()` | catalog name |
| `mdc.ui.schema(catalog=...)` | `catalog.schema` |
| `mdc.ui.table(schema=...)` | `catalog.schema.table` |
| `mdc.ui.column(table=...)` | column name |
| `mdc.ui.secret_scope()` | scope name |
| `mdc.ui.secret(scope=...)` | secret key (with `{{secrets/...}}` ref in `selected_meta`) |
| `mdc.ui.cluster()` | cluster id |
| `mdc.ui.warehouse()` | warehouse id |
| `mdc.ui.workflow()` | job id (str) |
| `mdc.ui.pipeline()` | DLT pipeline id |
| `mdc.ui.app()` | app name |
| `mdc.ui.serving_endpoint()` | endpoint name |
| `mdc.ui.vector_search()` | Vector Search endpoint name (alias: `vector_search_endpoint`) |
| `mdc.ui.vector_index(endpoint=...)` | three-part index name |
| `mdc.ui.genie_space()` | Genie space id |
| `mdc.ui.principal()` | userName / applicationId / group displayName |
Dependent selectors (`schema`, `table`, `column`, `secret`, `vector_index`)
accept either a literal string parent or another selector — when given a
selector they observe its `.value` and refetch automatically. All selectors
also expose `.selected_meta` (a parsed dict with extra metadata), `.options`
(a synced JSON list), a refresh button in the UI, and a `refresh()` method.
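A small sketch of those knobs, using a hypothetical `main.bronze` schema as the literal string parent:

```python
import marimo_databricks_connect as mdc

# Parent passed as a literal string instead of another selector;
# "main.bronze" is a hypothetical schema name.
table = mdc.ui.table(schema="main.bronze")

table.refresh()       # re-fetch the option list on demand
table.options         # synced JSON list of available options
table.selected_meta   # parsed dict of extra metadata for the current selection
```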
Exploration Widgets
The package ships interactive widgets built with anywidget for exploring your Databricks workspace directly inside marimo notebooks.
Unity Catalog widget
Browse catalogs, schemas, tables, columns, volumes, and more. Inspect table details, view sample data, explore table & column lineage, and check permissions. Also browse external locations (with drill-through into their contents), storage credentials, connections, and external metadata:
```python
from marimo_databricks_connect import unity_catalog_widget

widget = unity_catalog_widget()
widget  # display in cell output
```
Workflows widget
Browse jobs, drill into tasks, and view run history:
```python
from marimo_databricks_connect import workflows_widget

widget = workflows_widget()
widget  # display in cell output
```
Compute widget
Browse clusters, SQL warehouses, vector search endpoints, instance pools, and cluster policies in a tabbed interface:
```python
from marimo_databricks_connect import compute_widget

widget = compute_widget()
widget  # display in cell output
```
All widgets authenticate using the default Databricks auth chain (env vars, ~/.databrickscfg, az login, etc.) when no explicit client is provided.
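If you do want to pin the credentials yourself, here is a minimal sketch using the Databricks SDK; the `client=` keyword is an assumption about the widget factories (the exact parameter name may differ), and the `"DEFAULT"` profile is only an example:

```python
# Sketch only: passing an explicit SDK client instead of relying on the default auth chain.
from databricks.sdk import WorkspaceClient
from marimo_databricks_connect import compute_widget

client = WorkspaceClient(profile="DEFAULT")  # any auth method the SDK supports works here
widget = compute_widget(client=client)       # `client=` is an assumed keyword argument
widget
```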
Running
```shell
marimo edit scratch/m.py
```