Databricks MCP Server for exploring Unity Catalog - catalogs, schemas, tables, and data

Databricks MCP Server

A Model Context Protocol (MCP) server for exploring Databricks Unity Catalog. Browse catalogs, schemas, tables, and query sample data.

Features

  • 🔐 Secure U2M Authentication using Databricks CLI OAuth flow
  • 📂 List Catalogs - Browse all available Unity Catalog catalogs
  • 📁 List Schemas - Explore schemas within a catalog
  • 📋 List Tables - View tables in a catalog.schema
  • 📊 Table Details - Get column names, data types, and metadata
  • 📝 Sample Data - Query sample rows from tables (5-50 rows)
  • ♻️ Token Expiration Handling - Detects expired tokens and tells you how to re-authenticate

Installation

Using pip

pip install mareana-databricks-mcp-server

Using uvx

uvx mareana-databricks-mcp-server

From source

git clone https://github.com/mareana/databricks-mcp-server.git
cd databricks-mcp-server
pip install -e .

Prerequisites

1. Install Databricks CLI

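# Using the install script (Linux/macOS)
# Note: `pip install databricks-cli` installs the legacy CLI, which does not provide `databricks auth login`
curl -fsSL https://raw.githubusercontent.com/databricks/setup-cli/main/install.sh | sh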

# Or using homebrew (macOS)
brew tap databricks/tap
brew install databricks

2. Authenticate with Databricks

The server uses U2M (User-to-Machine) OAuth authentication via the Databricks CLI:

# Login to your Databricks workspace
databricks auth login --host https://your-workspace.cloud.databricks.com

# Or with a named profile
databricks auth login --host https://your-workspace.cloud.databricks.com --profile my_profile

3. Verify Authentication

databricks auth token
# Should output a valid access token
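
To check the result of this step from a script, the CLI output can be inspected programmatically. The sketch below is illustrative only: it assumes `databricks auth token` prints a JSON object with an `access_token` key, which you should confirm against your CLI version. The helper name `extract_access_token` is hypothetical and not part of this package.

```python
import json
from typing import Optional


def extract_access_token(cli_output: str) -> Optional[str]:
    """Return the access token found in the CLI output, or None.

    Assumption: the output is a JSON object with an "access_token" key.
    """
    try:
        payload = json.loads(cli_output)
    except json.JSONDecodeError:
        # Output was not JSON, e.g. an error message from the CLI.
        return None
    if not isinstance(payload, dict):
        return None
    token = payload.get("access_token")
    # Reject missing, non-string, and empty tokens alike.
    return token if isinstance(token, str) and token else None
```

You could feed it the captured stdout of `databricks auth token`; a `None` result suggests re-running `databricks auth login`.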

Configuration

Variable             Description                     Default
DATABRICKS_PROFILE   Databricks CLI profile to use   DEFAULT

Example

# Use a specific profile
export DATABRICKS_PROFILE="my_workspace_profile"
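
The fallback described in the table can be sketched in Python as follows. This is a minimal illustration rather than the server's actual code; `resolve_profile` is a hypothetical helper, and treating an empty value like an unset variable is an assumption.

```python
import os
from typing import Mapping, Optional


def resolve_profile(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the Databricks CLI profile name, falling back to "DEFAULT"."""
    env = os.environ if env is None else env
    # Assumption: an empty value is treated the same as an unset variable.
    return env.get("DATABRICKS_PROFILE") or "DEFAULT"
```

Passing a mapping explicitly keeps the function easy to test without touching the real environment.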

Usage

Running the Server

With the pip-installed command:

mareana-databricks-mcp-server

With uvx:

uvx mareana-databricks-mcp-server

From source:

python -m databricks_mcp_server.server

MCP Client Configuration

Add to your MCP client configuration (e.g., Claude Desktop):

{
  "mcpServers": {
    "databricks": {
      "command": "uvx",
      "args": ["mareana-databricks-mcp-server"],
      "env": {
        "DATABRICKS_PROFILE": "my_profile"
      }
    }
  }
}

Or using Python directly:

{
  "mcpServers": {
    "databricks": {
      "command": "python3",
      "args": ["-m", "databricks_mcp_server.server"],
      "env": {
        "DATABRICKS_PROFILE": "my_profile"
      }
    }
  }
}
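
If you manage several profiles, either configuration above can be generated instead of hand-edited. The sketch below reproduces both variants; `build_client_config` is a hypothetical helper, and where your MCP client stores its configuration file is left to you.

```python
import json


def build_client_config(profile: str, use_uvx: bool = True) -> dict:
    """Build the mcpServers entry for this server and the given CLI profile."""
    if use_uvx:
        command, args = "uvx", ["mareana-databricks-mcp-server"]
    else:
        command, args = "python3", ["-m", "databricks_mcp_server.server"]
    return {
        "mcpServers": {
            "databricks": {
                "command": command,
                "args": args,
                "env": {"DATABRICKS_PROFILE": profile},
            }
        }
    }


if __name__ == "__main__":
    # Print the uvx variant for the profile "my_profile".
    print(json.dumps(build_client_config("my_profile"), indent=2))
```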

Available Tools

who_am_i

Get the current authenticated Databricks user.

Returns: Username, display name, active status, and email addresses of the authenticated user.

Example Response:

{
  "username": "user@example.com",
  "display_name": "John Doe",
  "active": true,
  "emails": ["user@example.com"]
}

list_catalogs

List all available catalogs in Unity Catalog.

Returns: List of catalogs accessible to the current user.

Example Response:

{
  "total_count": 3,
  "catalogs": [
    {"name": "main", "owner": "admin@example.com"},
    {"name": "dev", "owner": "admin@example.com"},
    {"name": "staging", "owner": "admin@example.com"}
  ]
}

list_schemas

List all schemas in a specified catalog.

Parameter      Type     Required   Description
catalog_name   string   Yes        Name of the catalog

Example Response:

{
  "catalog": "main",
  "total_count": 2,
  "schemas": [
    {"name": "default", "full_name": "main.default"},
    {"name": "raw_data", "full_name": "main.raw_data"}
  ]
}
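
MCP clients such as Claude Desktop normally construct tool calls for you. For illustration, the sketch below builds the JSON-RPC 2.0 `tools/call` request that MCP defines for invoking a tool, using the `catalog_name` parameter documented above. The helper name `build_tool_call` is hypothetical.

```python
import json
from typing import Any, Dict


def build_tool_call(request_id: int, tool: str, arguments: Dict[str, Any]) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(request)


if __name__ == "__main__":
    # Request the schemas of the "main" catalog.
    print(build_tool_call(1, "list_schemas", {"catalog_name": "main"}))
```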

list_tables

List all tables in a specified catalog and schema.

Parameter      Type     Required   Description
catalog_name   string   Yes        Name of the catalog
schema_name    string   Yes        Name of the schema

Example Response:

{
  "catalog": "main",
  "schema": "raw_data",
  "total_count": 2,
  "tables": [
    {"name": "users", "full_name": "main.raw_data.users", "table_type": "MANAGED"},
    {"name": "events", "full_name": "main.raw_data.events", "table_type": "EXTERNAL"}
  ]
}
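
The `full_name` values in this response can be split into the three arguments that get_table_details and get_sample_data expect. The sketch below assumes plain three-part names; names containing dots inside backticks are not handled. `table_args_from_full_name` is a hypothetical helper.

```python
from typing import Dict


def table_args_from_full_name(full_name: str) -> Dict[str, str]:
    """Split "catalog.schema.table" into the tool argument names."""
    parts = full_name.split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected catalog.schema.table, got {full_name!r}")
    catalog, schema, table = parts
    return {
        "catalog_name": catalog,
        "schema_name": schema,
        "table_name": table,
    }
```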

get_table_details

Get detailed information about a table including column names, data types, and metadata.

Parameter      Type     Required   Description
catalog_name   string   Yes        Name of the catalog
schema_name    string   Yes        Name of the schema
table_name     string   Yes        Name of the table

Example Response:

{
  "full_name": "main.raw_data.users",
  "table_type": "MANAGED",
  "data_source_format": "DELTA",
  "column_count": 3,
  "columns": [
    {"position": 1, "name": "id", "type": "LONG", "nullable": false},
    {"position": 2, "name": "name", "type": "STRING", "nullable": true},
    {"position": 3, "name": "created_at", "type": "TIMESTAMP", "nullable": true}
  ]
}
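
A client might condense this response into one line per column. The sketch below assumes the response has exactly the shape shown in the example above; `describe_columns` is a hypothetical helper.

```python
from typing import Any, Dict, List


def describe_columns(details: Dict[str, Any]) -> List[str]:
    """Return "name TYPE [NOT NULL]" strings ordered by column position."""
    columns = sorted(details["columns"], key=lambda col: col["position"])
    return [
        f'{col["name"]} {col["type"]}' + ("" if col["nullable"] else " NOT NULL")
        for col in columns
    ]


if __name__ == "__main__":
    example = {
        "columns": [
            {"position": 1, "name": "id", "type": "LONG", "nullable": False},
            {"position": 2, "name": "name", "type": "STRING", "nullable": True},
        ]
    }
    for line in describe_columns(example):
        print(line)
```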

get_sample_data

Get sample data from a table.

Parameter      Type      Required   Description
catalog_name   string    Yes        Name of the catalog
schema_name    string    Yes        Name of the schema
table_name     string    Yes        Name of the table
limit          integer   No         Number of rows (min: 5, max: 50, default: 10)

Example Response:

{
  "table": "main.raw_data.users",
  "warehouse_used": "Starter Warehouse",
  "row_count": 3,
  "columns": ["id", "name", "created_at"],
  "data": [
    {"id": 1, "name": "Alice", "created_at": "2024-01-01T00:00:00Z"},
    {"id": 2, "name": "Bob", "created_at": "2024-01-02T00:00:00Z"},
    {"id": 3, "name": "Charlie", "created_at": "2024-01-03T00:00:00Z"}
  ]
}
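
The bounds on `limit` can be enforced on the client side before calling the tool. The sketch below clamps out-of-range values; whether the server itself clamps or rejects such values is not documented here, so treat that behavior as an assumption. `clamp_sample_limit` is a hypothetical helper.

```python
from typing import Optional

MIN_ROWS = 5
MAX_ROWS = 50
DEFAULT_ROWS = 10


def clamp_sample_limit(limit: Optional[int] = None) -> int:
    """Return a row limit within the documented 5-50 range."""
    if limit is None:
        # No limit requested: use the documented default.
        return DEFAULT_ROWS
    return max(MIN_ROWS, min(MAX_ROWS, limit))
```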

execute_query

Execute an arbitrary SQL query on Databricks and return the results.

Parameter   Type      Required   Description
sql         string    Yes        The SQL query to execute
max_rows    integer   No         Maximum number of rows to return (default: 1000, max: 10000)

Example Response:

{
  "status": "success",
  "warehouse_used": "Starter Warehouse",
  "row_count": 2,
  "columns": [
    {"name": "matl_num", "type": "STRING"},
    {"name": "total_stock", "type": "DOUBLE"}
  ],
  "data": [
    {"matl_num": "9000082572", "total_stock": 150.5},
    {"matl_num": "9000009550", "total_stock": 42.0}
  ]
}
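
Because the response lists column metadata separately from the rows, it converts readily to other formats. The sketch below writes a result to CSV, assuming the response has exactly the shape shown in the example above; `result_to_csv` is a hypothetical helper.

```python
import csv
import io
from typing import Any, Dict


def result_to_csv(result: Dict[str, Any]) -> str:
    """Render an execute_query response as CSV text with a header row."""
    header = [col["name"] for col in result["columns"]]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=header, lineterminator="\n")
    writer.writeheader()
    writer.writerows(result["data"])
    return buffer.getvalue()
```

Taking the header from `columns` rather than from the first row keeps the column order stable even when `data` is empty.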

Token Expiration Handling

When your token expires, the server reports the failure along with the command needed to re-authenticate:

❌ Authentication failed. Your token may have expired.
Please re-authenticate using:
  databricks auth login --profile my_profile

Run the suggested command to refresh your authentication.

Troubleshooting

"No SQL warehouses available"

Queries issued through get_sample_data and execute_query require an active SQL warehouse. Contact your Databricks admin to ensure you have access to one.

"Permission denied"

You may not have access to the specified catalog/schema/table. Check your Unity Catalog permissions with your admin.

"Table not found"

Verify the catalog, schema, and table names are correct. Use list_catalogs, list_schemas, and list_tables to explore available objects.

License

MIT License - see LICENSE for details.

Author

Shashanka G - shashanka.g@mareana.com
