Databricks MCP Server for exploring Unity Catalog - catalogs, schemas, tables, and data

Databricks MCP Server

A Model Context Protocol (MCP) server for exploring Databricks Unity Catalog. Browse catalogs, schemas, tables, and query sample data.

Features

  • 🔐 Secure U2M Authentication using Databricks CLI OAuth flow
  • 📂 List Catalogs - Browse all available Unity Catalog catalogs
  • 📁 List Schemas - Explore schemas within a catalog
  • 📋 List Tables - View tables in a catalog.schema
  • 📊 Table Details - Get column names, data types, and metadata
  • 📝 Sample Data - Query sample rows from tables (5-50 rows)
  • ♻️ Token Expiration Handling - Detects an expired token and tells you how to re-authenticate

Installation

Using pip

pip install mareana-databricks-mcp-server

Using uvx

uvx mareana-databricks-mcp-server

From source

git clone https://github.com/mareana/databricks-mcp-server.git
cd databricks-mcp-server
pip install -e .

Prerequisites

1. Install Databricks CLI

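# Using Homebrew (macOS or Linux)
brew tap databricks/tap
brew install databricks

# Note: the PyPI package "databricks-cli" is the legacy CLI (0.18 and below) and
# does not provide the "databricks auth login" command used in the next step.
# On other platforms, follow the official Databricks CLI installation instructions.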

2. Authenticate with Databricks

The server uses U2M (User-to-Machine) OAuth authentication via the Databricks CLI:

# Login to your Databricks workspace
databricks auth login --host https://your-workspace.cloud.databricks.com

# Or with a named profile
databricks auth login --host https://your-workspace.cloud.databricks.com --profile my_profile

3. Verify Authentication

databricks auth token
# Should output a valid access token

Configuration

| Variable | Description | Default |
| --- | --- | --- |
| DATABRICKS_PROFILE | Databricks CLI profile to use | DEFAULT |

Example

# Use a specific profile
export DATABRICKS_PROFILE="my_workspace_profile"
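
A minimal sketch of how a process could resolve this setting, assuming it falls back to `DEFAULT` when the variable is unset, as the table above states. The `resolve_profile` helper is hypothetical and not part of the package, and treating a blank value like an unset one is an assumption.

```python
import os


def resolve_profile(env=None):
    """Return the Databricks CLI profile name to use (hypothetical helper)."""
    env = os.environ if env is None else env
    # Assumption: a blank value is treated the same as an unset variable.
    value = env.get("DATABRICKS_PROFILE", "").strip()
    return value or "DEFAULT"


if __name__ == "__main__":
    print(resolve_profile())
```

Passing a plain dict instead of `os.environ` keeps the helper easy to check without touching the real environment.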

Usage

Running the Server

With pip (installed console script):

mareana-databricks-mcp-server

With uvx:

uvx mareana-databricks-mcp-server

From source:

python -m databricks_mcp_server.server

MCP Client Configuration

Add to your MCP client configuration (e.g., Claude Desktop):

{
  "mcpServers": {
    "databricks": {
      "command": "uvx",
      "args": ["mareana-databricks-mcp-server"],
      "env": {
        "DATABRICKS_PROFILE": "my_profile"
      }
    }
  }
}

Or using Python directly:

{
  "mcpServers": {
    "databricks": {
      "command": "python3",
      "args": ["-m", "databricks_mcp_server.server"],
      "env": {
        "DATABRICKS_PROFILE": "my_profile"
      }
    }
  }
}
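
If you maintain the client configuration with a script, the entry can be merged into an existing config without disturbing other servers. The sketch below only manipulates JSON data. The `add_databricks_server` helper is hypothetical, and where your MCP client stores its config file is left to you.

```python
import json


def add_databricks_server(config, profile="my_profile", command="uvx"):
    """Return a copy of an MCP client config with a 'databricks' server entry."""
    updated = json.loads(json.dumps(config))  # deep copy of plain JSON data
    servers = updated.setdefault("mcpServers", {})
    if command == "uvx":
        args = ["mareana-databricks-mcp-server"]
    else:
        # Any other command is treated as a Python interpreter running the module.
        args = ["-m", "databricks_mcp_server.server"]
    servers["databricks"] = {
        "command": command,
        "args": args,
        "env": {"DATABRICKS_PROFILE": profile},
    }
    return updated


if __name__ == "__main__":
    existing = {"mcpServers": {"other": {"command": "other-server"}}}
    print(json.dumps(add_databricks_server(existing), indent=2))
```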

Available Tools

who_am_i

Get the current authenticated Databricks user.

Returns: Username, display name, active status, and email addresses of the authenticated user.

Example Response:

{
  "username": "user@example.com",
  "display_name": "John Doe",
  "active": true,
  "emails": ["user@example.com"]
}

list_catalogs

List all available catalogs in Unity Catalog.

Returns: List of catalogs accessible to the current user.

Example Response:

{
  "total_count": 3,
  "catalogs": [
    {"name": "main", "owner": "admin@example.com"},
    {"name": "dev", "owner": "admin@example.com"},
    {"name": "staging", "owner": "admin@example.com"}
  ]
}

list_schemas

List all schemas in a specified catalog.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| catalog_name | string | Yes | Name of the catalog |

Example Response:

{
  "catalog": "main",
  "total_count": 2,
  "schemas": [
    {"name": "default", "full_name": "main.default"},
    {"name": "raw_data", "full_name": "main.raw_data"}
  ]
}

list_tables

List all tables in a specified catalog and schema.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| catalog_name | string | Yes | Name of the catalog |
| schema_name | string | Yes | Name of the schema |

Example Response:

{
  "catalog": "main",
  "schema": "raw_data",
  "total_count": 2,
  "tables": [
    {"name": "users", "full_name": "main.raw_data.users", "table_type": "MANAGED"},
    {"name": "events", "full_name": "main.raw_data.events", "table_type": "EXTERNAL"}
  ]
}

get_table_details

Get detailed information about a table including column names, data types, and metadata.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| catalog_name | string | Yes | Name of the catalog |
| schema_name | string | Yes | Name of the schema |
| table_name | string | Yes | Name of the table |

Example Response:

{
  "full_name": "main.raw_data.users",
  "table_type": "MANAGED",
  "data_source_format": "DELTA",
  "column_count": 3,
  "columns": [
    {"position": 1, "name": "id", "type": "LONG", "nullable": false},
    {"position": 2, "name": "name", "type": "STRING", "nullable": true},
    {"position": 3, "name": "created_at", "type": "TIMESTAMP", "nullable": true}
  ]
}
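
A client can turn this response into lookup structures before writing queries. The sketch below assumes the response has exactly the shape of the example above. `column_types` and `required_columns` are hypothetical helper names, not part of the server.

```python
# Example response from above, written as a Python dict.
DETAILS = {
    "full_name": "main.raw_data.users",
    "table_type": "MANAGED",
    "data_source_format": "DELTA",
    "column_count": 3,
    "columns": [
        {"position": 1, "name": "id", "type": "LONG", "nullable": False},
        {"position": 2, "name": "name", "type": "STRING", "nullable": True},
        {"position": 3, "name": "created_at", "type": "TIMESTAMP", "nullable": True},
    ],
}


def ordered_columns(details):
    """Return the column entries sorted by their declared position."""
    return sorted(details["columns"], key=lambda col: col["position"])


def column_types(details):
    """Map each column name to its data type."""
    return {col["name"]: col["type"] for col in ordered_columns(details)}


def required_columns(details):
    """List the names of columns that are not nullable."""
    return [col["name"] for col in ordered_columns(details) if not col["nullable"]]


if __name__ == "__main__":
    print(column_types(DETAILS))
    print(required_columns(DETAILS))
```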

get_sample_data

Get sample data from a table.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| catalog_name | string | Yes | Name of the catalog |
| schema_name | string | Yes | Name of the schema |
| table_name | string | Yes | Name of the table |
| limit | integer | No | Number of rows (min: 5, max: 50, default: 10) |

Example Response:

{
  "table": "main.raw_data.users",
  "warehouse_used": "Starter Warehouse",
  "row_count": 3,
  "columns": ["id", "name", "created_at"],
  "data": [
    {"id": 1, "name": "Alice", "created_at": "2024-01-01T00:00:00Z"},
    {"id": 2, "name": "Bob", "created_at": "2024-01-02T00:00:00Z"},
    {"id": 3, "name": "Charlie", "created_at": "2024-01-03T00:00:00Z"}
  ]
}
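
The `limit` bounds can be pictured as a clamp applied before a `SELECT ... LIMIT` query is built. This is a sketch under assumptions: the source does not show the SQL the server actually runs, nor whether out-of-range values are clamped or rejected. All function names here are hypothetical.

```python
def clamp_limit(limit=None, minimum=5, maximum=50, default=10):
    """Force a requested row count into the documented 5-50 range."""
    if limit is None:
        return default
    return max(minimum, min(maximum, int(limit)))


def quote_identifier(name):
    """Backtick-quote an identifier, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def build_sample_query(catalog_name, schema_name, table_name, limit=None):
    """Build a sample query for a three-level Unity Catalog table name."""
    parts = (catalog_name, schema_name, table_name)
    full_name = ".".join(quote_identifier(part) for part in parts)
    return f"SELECT * FROM {full_name} LIMIT {clamp_limit(limit)}"


if __name__ == "__main__":
    print(build_sample_query("main", "raw_data", "users"))
    print(build_sample_query("main", "raw_data", "users", limit=500))
```

Quoting each name part separately keeps names with unusual characters from being interpreted as SQL.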

execute_query

Execute an arbitrary SQL query on Databricks and return the results.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| sql | string | Yes | The SQL query to execute |
| max_rows | integer | No | Maximum number of rows to return (default: 1000, max: 10000) |

Example Response:

{
  "status": "success",
  "warehouse_used": "Starter Warehouse",
  "row_count": 2,
  "columns": [
    {"name": "matl_num", "type": "STRING"},
    {"name": "total_stock", "type": "DOUBLE"}
  ],
  "data": [
    {"matl_num": "9000082572", "total_stock": 150.5},
    {"matl_num": "9000009550", "total_stock": 42.0}
  ]
}
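
Because the response lists column metadata separately from the rows, a client can export results with a stable column order. The sketch below converts a response of the shape shown above to CSV. The `response_to_csv` helper is hypothetical, and the check on `status` assumes that any value other than `"success"` signals a failure, which the source does not document.

```python
import csv
import io

# Example response from above, written as a Python dict.
RESPONSE = {
    "status": "success",
    "warehouse_used": "Starter Warehouse",
    "row_count": 2,
    "columns": [
        {"name": "matl_num", "type": "STRING"},
        {"name": "total_stock", "type": "DOUBLE"},
    ],
    "data": [
        {"matl_num": "9000082572", "total_stock": 150.5},
        {"matl_num": "9000009550", "total_stock": 42.0},
    ],
}


def response_to_csv(response):
    """Render an execute_query response as CSV text, columns in declared order."""
    if response.get("status") != "success":
        # Assumption: the shape of a failed response is not documented.
        raise RuntimeError(f"query did not succeed: {response.get('status')!r}")
    names = [col["name"] for col in response["columns"]]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=names, lineterminator="\n")
    writer.writeheader()
    writer.writerows(response["data"])
    return buffer.getvalue()


if __name__ == "__main__":
    print(response_to_csv(RESPONSE), end="")
```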

Token Expiration Handling

The server gracefully handles token expiration. If your token expires, you'll receive a helpful message:

❌ Authentication failed. Your token may have expired.
Please re-authenticate using:
  databricks auth login --profile my_profile

Simply run the suggested command to refresh your authentication.
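
One way to picture this behavior is a small mapping from an authentication failure to the hint shown above. This is an illustrative sketch, not the server's code: `reauth_message` and `explain_failure` are hypothetical names, and treating HTTP 401 as the signal for an expired token is an assumption.

```python
def reauth_message(profile="DEFAULT"):
    """Build the re-authentication hint for a given CLI profile."""
    return (
        "❌ Authentication failed. Your token may have expired.\n"
        "Please re-authenticate using:\n"
        f"  databricks auth login --profile {profile}"
    )


def explain_failure(status_code, profile="DEFAULT"):
    """Return a re-authentication hint for auth failures, otherwise None."""
    # Assumption: an expired token surfaces as HTTP 401 Unauthorized.
    if status_code == 401:
        return reauth_message(profile)
    return None


if __name__ == "__main__":
    print(explain_failure(401, profile="my_profile"))
```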

Troubleshooting

"No SQL warehouses available"

The get_sample_data and execute_query tools run on a SQL warehouse, so they require access to at least one. Contact your Databricks admin to ensure you have access to a warehouse.

"Permission denied"

You may not have access to the specified catalog/schema/table. Check your Unity Catalog permissions with your admin.

"Table not found"

Verify the catalog, schema, and table names are correct. Use list_catalogs, list_schemas, and list_tables to explore available objects.
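
When a name is only slightly off, the output of list_tables can be used to suggest the intended table. The sketch below uses Python's standard `difflib` on a response of the shape shown in the list_tables section. `suggest_tables` is a hypothetical client-side helper, not a server tool.

```python
import difflib

# Example list_tables response, written as a Python dict.
TABLES_RESPONSE = {
    "catalog": "main",
    "schema": "raw_data",
    "total_count": 2,
    "tables": [
        {"name": "users", "full_name": "main.raw_data.users", "table_type": "MANAGED"},
        {"name": "events", "full_name": "main.raw_data.events", "table_type": "EXTERNAL"},
    ],
}


def suggest_tables(requested, list_tables_response, max_suggestions=3):
    """Return the exact table name if present, else close matches by spelling."""
    names = [table["name"] for table in list_tables_response["tables"]]
    if requested in names:
        return [requested]
    return difflib.get_close_matches(requested, names, n=max_suggestions, cutoff=0.6)


if __name__ == "__main__":
    print(suggest_tables("user", TABLES_RESPONSE))
```

An empty list means no similarly spelled table exists in that schema, in which case the catalog or schema name is the more likely culprit.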

License

MIT License - see LICENSE for details.

Author

Shashanka G - shashanka.g@mareana.com
