
Databricks Utils MCP Server

An MCP (Model Context Protocol) server for Databricks development and operations. Compatible with any MCP client: Claude Code, Claude Desktop, Cursor, and others.

Covers ten areas:

  • Unity Catalog -- browse catalogs, schemas, tables, volumes, and functions; describe table schemas; sample table data
  • SQL -- execute SQL statements; manage SQL warehouses (list, start, stop)
  • Query History -- list recent queries; get full query details
  • Jobs -- list, inspect, trigger, cancel, and repair job runs
  • Clusters -- list (with filtering and paging), inspect, start, and terminate clusters
  • Pipelines -- list, inspect, start, and stop Delta Live Tables pipelines
  • Workspace -- browse workspace directories; export notebooks
  • Files -- list and read files from DBFS and Unity Catalog Volumes
  • Secrets -- list secret scopes and key names (values are never returned)
  • Permissions -- get Unity Catalog grants and workspace object ACLs

Authentication uses the Databricks SDK's standard credential chain, which auto-discovers credentials from ~/.databrickscfg profiles, environment variables, or Azure CLI. Raw tokens are never passed as tool parameters. See Authentication below.

Requirements

  • uv
  • A Databricks workspace with a personal access token, OAuth config, or Azure CLI session

Installation

macOS

brew install uv

Linux

curl -LsSf https://astral.sh/uv/install.sh | sh

Windows

winget install --id=astral-sh.uv

Configuration

Claude Code users:

claude mcp add --scope user databricks-utils -- uvx databricks-utils-mcp

For other MCP clients, add the following to your server configuration:

{
  "mcpServers": {
    "databricks-utils": {
      "command": "uvx",
      "args": ["databricks-utils-mcp"]
    }
  }
}

Restart your MCP client after adding the server.

Installing from source

git clone https://github.com/BrianDeacon/databricks-utils-mcp
cd databricks-utils-mcp
uv sync

Then configure with the cloned path:

{
  "mcpServers": {
    "databricks-utils": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/databricks-utils-mcp", "databricks-utils-mcp"]
    }
  }
}

Authentication

All tools use the Databricks SDK's unified authentication. By default, the SDK checks (in order): environment variables, ~/.databrickscfg default profile, Azure CLI, and other standard credential sources.

Every tool accepts three optional parameters to override the default:

Parameter Description
profile Name of a ~/.databrickscfg profile (provides both host and credentials)
host Workspace URL override (credentials still resolved via the standard chain)
token_env_var Name of an environment variable holding a PAT. The tool reads the value locally; the token itself is never sent as a parameter.

Example ~/.databrickscfg

[DEFAULT]
host  = https://adb-1234567890.0.azuredatabricks.net/
token = dapi...

[staging]
host      = https://adb-0987654321.0.azuredatabricks.net/
auth_type = databricks-cli

With this config, tools use the DEFAULT profile automatically. Pass profile="staging" to target the staging workspace instead.
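
The profile, host, and token_env_var overrides map naturally onto the SDK's WorkspaceClient. The following is a minimal sketch of how such overrides could be resolved, not the server's actual code; the helper name is hypothetical.

import os
from databricks.sdk import WorkspaceClient

def make_client(profile=None, host=None, token_env_var=None):
    # Hypothetical helper mirroring the three override parameters above.
    kwargs = {}
    if profile:
        kwargs["profile"] = profile          # ~/.databrickscfg profile name
    if host:
        kwargs["host"] = host                # explicit workspace URL
    if token_env_var:
        # PAT is read from the local environment; the token value itself
        # is never passed as a tool parameter.
        kwargs["token"] = os.environ[token_env_var]
    # With no overrides, the SDK walks its standard credential chain.
    return WorkspaceClient(**kwargs)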


Unity Catalog Tools

catalog_list_catalogs

List all Unity Catalog catalogs accessible to the current user. Returns a sorted JSON array of catalog names.

catalog_list_schemas

Parameter Type Required Description
catalog string yes Catalog name

catalog_list_tables

Parameter Type Required Description
catalog string yes Catalog name
schema string yes Schema name

catalog_describe_table

Returns columns (name, type, comment), table type, storage location, properties, and timestamps.

Parameter Type Required Description
full_name string yes Three-part name: catalog.schema.table

catalog_get_table_sample

Executes a SELECT * query with a LIMIT against the table using a SQL warehouse. Returns a JSON array of row objects.

Parameter Type Required Description
full_name string yes Three-part name: catalog.schema.table
limit integer no Number of rows (default 10, max 100)
warehouse_id string no SQL warehouse ID. If omitted, uses the first running warehouse.
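
For reference, a rough sketch of the behavior described above using the Databricks SDK directly: pick the first RUNNING warehouse, then run a LIMIT query through the Statement Execution API. The table name is a placeholder, and this is not the tool's actual implementation.

from databricks.sdk import WorkspaceClient
from databricks.sdk.service.sql import State

w = WorkspaceClient()
# First running warehouse, mirroring the tool's default warehouse selection
warehouse_id = next(wh.id for wh in w.warehouses.list() if wh.state == State.RUNNING)
resp = w.statement_execution.execute_statement(
    statement="SELECT * FROM main.default.trips LIMIT 10",  # placeholder table
    warehouse_id=warehouse_id,
)
# Assumes the statement finishes within the default wait timeout;
# column names are available in resp.manifest.schema
print(resp.result.data_array)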

catalog_list_volumes

Parameter Type Required Description
catalog string yes Catalog name
schema string yes Schema name

catalog_list_volume_files

Parameter Type Required Description
volume_path string yes Path under /Volumes/catalog/schema/volume/

catalog_list_functions

Parameter Type Required Description
catalog string yes Catalog name
schema string yes Schema name

SQL Tools

sql_execute

Execute a SQL statement against a Databricks SQL warehouse. Returns a JSON array of row objects for queries, or a status message for DDL/DML.

Parameter Type Required Description
statement string yes SQL statement to execute
warehouse_id string no SQL warehouse ID. If omitted, uses the first running warehouse.
max_rows integer no Maximum rows to return (default 100, cap 10,000)
catalog string no Default catalog for unqualified names
schema string no Default schema for unqualified names
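
As a hedged sketch, here is how a result like this can be shaped into row objects with the SDK's Statement Execution API, including the optional catalog/schema defaults. The warehouse ID and query are placeholders.

from databricks.sdk import WorkspaceClient

w = WorkspaceClient()
resp = w.statement_execution.execute_statement(
    statement="SELECT id, name FROM users LIMIT 5",  # placeholder query
    warehouse_id="<warehouse-id>",
    catalog="main",     # default catalog for unqualified names
    schema="default",   # default schema for unqualified names
)
columns = [c.name for c in resp.manifest.schema.columns]
rows = [dict(zip(columns, row)) for row in (resp.result.data_array or [])]
print(rows)  # values arrive as strings in the JSON result format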

sql_list_warehouses

List all SQL warehouses. Returns a JSON array with id, name, state, cluster_size, and auto_stop_mins.

sql_get_warehouse

Parameter Type Required Description
warehouse_id string yes SQL warehouse ID

sql_start_warehouse

Start a stopped SQL warehouse. Does not wait for it to finish starting.

Parameter Type Required Description
warehouse_id string yes SQL warehouse ID

sql_stop_warehouse

Stop a running SQL warehouse.

Parameter Type Required Description
warehouse_id string yes SQL warehouse ID

Query History Tools

query_history_list

List recent SQL queries. Returns query ID, statement (truncated), status, duration, rows produced, user, and warehouse.

Parameter Type Required Description
warehouse_id string no Filter to a specific warehouse
user_name string no Filter by user
status string no Filter by status: FINISHED, FAILED, CANCELED, RUNNING
max_results integer no Max results (default 25, cap 100)
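
A rough sketch of the underlying Query History API call with a status filter; the exact shape of the tool's output may differ.

from databricks.sdk import WorkspaceClient
from databricks.sdk.service.sql import QueryFilter, QueryStatus

w = WorkspaceClient()
resp = w.query_history.list(
    filter_by=QueryFilter(statuses=[QueryStatus.FAILED]),
    max_results=25,
)
for q in resp.res or []:
    print(q.query_id, q.status, (q.query_text or "")[:80])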

query_history_get

Get full details for a specific query, including the complete statement text, metrics, and the error message if the query failed.

Parameter Type Required Description
query_id string yes Query ID

Jobs Tools

jobs_list

Parameter Type Required Description
name string no Filter by job name (substring match)
max_results integer no Max results (default 25)

jobs_get

Returns tasks, schedule, clusters, parameters, tags, and notifications.

Parameter Type Required Description
job_id integer yes Job ID

jobs_list_runs

Parameter Type Required Description
job_id integer no Filter to a specific job
active_only boolean no Only show active (in-progress) runs (default false)
max_results integer no Max results (default 25)
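
For comparison, listing a job's runs directly with the SDK looks like the sketch below; the job ID is a placeholder.

from databricks.sdk import WorkspaceClient

w = WorkspaceClient()
for run in w.jobs.list_runs(job_id=123, active_only=False):
    # state carries the life-cycle state (PENDING, RUNNING, TERMINATED, ...)
    # and, for finished runs, the result state (SUCCESS, FAILED, ...)
    print(run.run_id, run.state.life_cycle_state, run.state.result_state)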

jobs_get_run

Returns per-task states, start/end times, cluster info, error messages, and attempt number.

Parameter Type Required Description
run_id integer yes Run ID

jobs_get_run_output

Returns notebook output, error trace, and logs depending on task type. Only works for single-task runs.

Parameter Type Required Description
run_id integer yes Run ID

jobs_run

Trigger a job run. Does not wait for completion. Returns the run_id for tracking.

Parameter Type Required Description
job_id integer yes Job ID
parameters object no Parameter overrides as a key/value map

jobs_cancel_run

Parameter Type Required Description
run_id integer yes Run ID

jobs_repair_run

Re-run failed tasks in a multi-task job run.

Parameter Type Required Description
run_id integer yes Run ID of a failed multi-task job
rerun_tasks array no Specific task keys to re-run. If omitted, re-runs all failed tasks.

Cluster Tools

clusters_list

Results are paged (default 20 per page). The response includes next_page_token and prev_page_token when more pages are available.

Parameter Type Required Description
cluster_sources string no Comma-separated filter: UI, API, JOB
cluster_states string no Comma-separated filter: RUNNING, TERMINATED, PENDING, etc.
is_pinned boolean no If true, only return pinned clusters
page_size integer no Clusters per page (default 20)
page_token string no Token from a previous response for next/previous page
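
As a reference point, the SDK's own clusters.list() iterator pages transparently, so client-side filtering looks like the sketch below; the tool's explicit page tokens are what an MCP client sees instead.

from databricks.sdk import WorkspaceClient
from databricks.sdk.service.compute import State

w = WorkspaceClient()
# Keep only running clusters, similar to cluster_states="RUNNING"
running = [c for c in w.clusters.list() if c.state == State.RUNNING]
for c in running:
    print(c.cluster_id, c.cluster_name, c.cluster_source)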

clusters_get

Parameter Type Required Description
cluster_id string yes Cluster ID

clusters_get_events

Returns recent events including termination reasons.

Parameter Type Required Description
cluster_id string yes Cluster ID
max_results integer no Max events (default 25)

clusters_start

Start a terminated cluster. Does not wait for it to finish starting.

Parameter Type Required Description
cluster_id string yes Cluster ID

clusters_terminate

Terminate a running cluster. This stops the cluster but does not delete it.

Parameter Type Required Description
cluster_id string yes Cluster ID

Pipeline Tools

pipelines_list

Parameter Type Required Description
name string no Filter by pipeline name (substring match)
max_results integer no Max results (default 25)

pipelines_get

Returns target catalog/schema, clusters, libraries, and notifications.

Parameter Type Required Description
pipeline_id string yes Pipeline ID

pipelines_list_events

Returns events including update progress, errors, and data quality metrics.

Parameter Type Required Description
pipeline_id string yes Pipeline ID
max_results integer no Max events (default 25)

pipelines_start

Start a pipeline update. Returns the update_id for tracking.

Parameter Type Required Description
pipeline_id string yes Pipeline ID
full_refresh boolean no Full refresh instead of incremental (default false)
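
A minimal SDK sketch of the same operation; the pipeline ID is a placeholder.

from databricks.sdk import WorkspaceClient

w = WorkspaceClient()
update = w.pipelines.start_update(pipeline_id="<pipeline-id>", full_refresh=False)
print(update.update_id)  # track this update via pipelines_list_events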

pipelines_stop

Parameter Type Required Description
pipeline_id string yes Pipeline ID

Workspace Tools

workspace_list

List objects in a workspace directory. Returns path, object_type (NOTEBOOK, DIRECTORY, FILE, REPO), and language (for notebooks).

Parameter Type Required Description
path string yes Workspace path (e.g. /Users/user@example.com)

workspace_export_notebook

For SOURCE format, returns the notebook content as text. For other formats, returns base64-encoded content.

Parameter Type Required Description
path string yes Notebook path in workspace
format string no Export format: SOURCE (default), HTML, JUPYTER, DBC
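
For reference, the Workspace export API itself returns base64-encoded content; a sketch of exporting and decoding a SOURCE-format notebook (the notebook path is a placeholder):

import base64
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.workspace import ExportFormat

w = WorkspaceClient()
resp = w.workspace.export("/Users/user@example.com/my_notebook", format=ExportFormat.SOURCE)
print(base64.b64decode(resp.content).decode("utf-8"))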

Files Tools

files_list

List files in a DBFS or Volumes path.

Parameter Type Required Description
path string yes dbfs:/... for DBFS, /Volumes/... for Unity Catalog Volumes

files_read

Read a file from Volumes or DBFS. Returns text content for text files, or a size summary for binary files.

Parameter Type Required Description
path string yes Path to the file
max_bytes integer no Max bytes to read (default 1 MB, cap 10 MB)
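
A hedged sketch of reading a Unity Catalog Volume file with the SDK's Files API, capped at the tool's 1 MB default; the path is illustrative.

from databricks.sdk import WorkspaceClient

w = WorkspaceClient()
max_bytes = 1024 * 1024  # 1 MB, matching the default described above
download = w.files.download("/Volumes/main/default/raw/sample.csv")  # placeholder path
data = download.contents.read(max_bytes)
print(data.decode("utf-8", errors="replace"))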

Secrets Tools

secrets_list_scopes

List all secret scopes in the workspace. Returns a sorted JSON array of scope names.

secrets_list_keys

List secret key names in a scope. Values are never returned.

Parameter Type Required Description
scope string yes Secret scope name

Permissions Tools

permissions_get_grants

Get Unity Catalog grants on a securable object. Returns principal and privilege list.

Parameter Type Required Description
securable_type string yes One of: CATALOG, SCHEMA, TABLE, VOLUME, FUNCTION
full_name string yes Full name of the securable
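
For reference, the equivalent SDK call for Unity Catalog grants; the table name is a placeholder.

from databricks.sdk import WorkspaceClient
from databricks.sdk.service.catalog import SecurableType

w = WorkspaceClient()
grants = w.grants.get(securable_type=SecurableType.TABLE, full_name="main.default.trips")
for assignment in grants.privilege_assignments or []:
    print(assignment.principal, [p.value for p in assignment.privileges or []])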

permissions_get_object_permissions

Get access control list for a workspace object. Returns principal and permission levels.

Parameter Type Required Description
object_type string yes One of: clusters, jobs, pipelines, sql/warehouses, etc.
object_id string yes Object ID

Security

  • Authentication uses the Databricks SDK's standard credential chain. Tokens are never passed directly as tool parameters.
  • The token_env_var parameter accepts only the name of an environment variable, not the token value itself. The value is read locally and never leaves the machine.
  • SQL execution enforces a max_rows cap of 10,000. File reads enforce a max_bytes cap of 10 MB.
  • Secret values are never exposed. The secrets_list_keys tool returns key names only.

