
A Model Context Protocol (MCP) server for Databricks

Project description

Databricks MCP Server

A production-ready Model Context Protocol (MCP) server that exposes Databricks REST API capabilities to MCP-compatible agents and tooling. Version 0.4.1 introduces structured responses, resource caching, retry-aware networking, and end-to-end resilience improvements.


Table of Contents

  1. Key Capabilities
  2. Architecture Highlights
  3. Installation
  4. Configuration
  5. Running the Server
  6. Integrating with MCP Clients
  7. Working with Tool Responses
  8. Available Tools
  9. Development Workflow
  10. Testing
  11. Publishing Builds
  12. Support & Contact
  13. License

Key Capabilities

  • Structured MCP Responses – Each tool returns a CallToolResult with human-readable summaries in content and machine-readable payloads in _meta['data'].
  • Resource Caching – Large notebook/workspace exports are cached once and exposed as resource://databricks/exports/{id} entries under _meta['resources'].
  • Progress & Metrics – Long-running actions stream MCP progress notifications and track per-tool success/error/timeout/cancel metrics.
  • Resilient Networking – Shared HTTP client injects request IDs, enforces timeouts, and retries retryable Databricks responses (408/429/5xx) with exponential backoff (a minimal sketch follows this list).
  • Async Runtime – Built on mcp.server.FastMCP with centralized JSON logging and concurrency guards for predictable stdio behaviour.
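
A minimal sketch of that retry behaviour, using httpx purely for illustration (the package's actual client and helper names may differ; the retry count and backoff mirror the API_MAX_RETRIES and API_RETRY_BACKOFF_SECONDS defaults shown under Configuration):

import asyncio
import httpx

# Databricks responses considered retryable, per the bullet above.
RETRYABLE_STATUS = {408, 429, 500, 502, 503, 504}

async def request_with_retries(
    client: httpx.AsyncClient,
    method: str,
    url: str,
    max_retries: int = 3,
    backoff_seconds: float = 0.5,
    **kwargs,
) -> httpx.Response:
    """Retry retryable status codes with exponential backoff (0.5s, 1s, 2s, ...)."""
    for attempt in range(max_retries + 1):
        response = await client.request(method, url, **kwargs)
        if response.status_code not in RETRYABLE_STATUS or attempt == max_retries:
            return response
        await asyncio.sleep(backoff_seconds * (2 ** attempt))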

Architecture Highlights

  • databricks_mcp/server/databricks_mcp_server.py – FastMCP server with tool registration, progress handling, metrics, and resource caching.
  • databricks_mcp/core/utils.py – HTTP utilities with correlation IDs, retries, and error mapping to DatabricksAPIError.
  • databricks_mcp/core/logging_utils.py – JSON logging configuration for stderr/file outputs.
  • databricks_mcp/core/models.py – Pydantic models (e.g., ClusterConfig) used by tool schemas.
  • Tests under tests/ mock Databricks APIs to validate orchestration, structured responses, and schema metadata without shell scripts.

For an in-depth tour of data flow and design decisions, see ARCHITECTURE.md.
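
As background for the logging module above, here is a minimal sketch of JSON logging to stderr; the field names and handler wiring are illustrative rather than the package's exact configuration (stderr is used so stdout stays free for the MCP stdio transport):

import json
import logging
import sys

class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON line."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })

handler = logging.StreamHandler(sys.stderr)  # never log to stdout in a stdio MCP server
handler.setFormatter(JsonFormatter())
logging.basicConfig(level=logging.INFO, handlers=[handler])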

Installation

Prerequisites

  • Python 3.10+
  • uv for dependency management and publishing

Quick Install (recommended)

Register the server with Cursor using the deeplink below – it resolves to uvx databricks-mcp-server@latest and picks up future updates automatically.

cursor://anysphere.cursor-deeplink/mcp/install?name=databricks-mcp&config=eyJjb21tYW5kIjoidXZ4IiwiYXJncyI6WyJkYXRhYnJpY2tzLW1jcC1zZXJ2ZXIiXSwiZW52Ijp7IkRBVEFCUklDS1NfSE9TVCI6IiR7REFUQUJSSUNLU19IT1NUfSIsIkRBVEFCUklDS1NfVE9LRU4iOiIke0RBVEFCUklDS1NfVE9LRU59IiwiREFUQUJSSUNLU19XQVJFSE9VU0VfSUQiOiIke0RBVEFCUklDS1NfV0FSRUhPVVNFX0lEfSJ9fQ==
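
For reference, the config parameter embedded in that deeplink is base64-encoded JSON; decoded, it reads:

{
  "command": "uvx",
  "args": ["databricks-mcp-server"],
  "env": {
    "DATABRICKS_HOST": "${DATABRICKS_HOST}",
    "DATABRICKS_TOKEN": "${DATABRICKS_TOKEN}",
    "DATABRICKS_WAREHOUSE_ID": "${DATABRICKS_WAREHOUSE_ID}"
  }
}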

Manual Installation

# Clone and enter the repository
git clone https://github.com/markov-kernel/databricks-mcp.git
cd databricks-mcp

# Create an isolated environment (optional but recommended)
uv venv
source .venv/bin/activate  # Linux/Mac
# .\.venv\Scripts\activate  # Windows PowerShell

# Install package and development dependencies
uv pip install -e .
uv pip install -e ".[dev]"

Configuration

Set the following environment variables (or populate .env from .env.example).

export DATABRICKS_HOST="https://your-workspace.databricks.com"
export DATABRICKS_TOKEN="dapiXXXXXXXXXXXXXXXX"
export DATABRICKS_WAREHOUSE_ID="sql_warehouse_12345"  # optional default
export TOOL_TIMEOUT_SECONDS=300
export MAX_CONCURRENT_REQUESTS=8
export HTTP_TIMEOUT_SECONDS=60
export API_MAX_RETRIES=3
export API_RETRY_BACKOFF_SECONDS=0.5
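
If your own scripts need to mirror these settings, they can be read straight from the environment; the defaults below simply echo the example values above and are not the server's authoritative defaults:

import os

TOOL_TIMEOUT_SECONDS = float(os.environ.get("TOOL_TIMEOUT_SECONDS", "300"))
MAX_CONCURRENT_REQUESTS = int(os.environ.get("MAX_CONCURRENT_REQUESTS", "8"))
HTTP_TIMEOUT_SECONDS = float(os.environ.get("HTTP_TIMEOUT_SECONDS", "60"))
API_MAX_RETRIES = int(os.environ.get("API_MAX_RETRIES", "3"))
API_RETRY_BACKOFF_SECONDS = float(os.environ.get("API_RETRY_BACKOFF_SECONDS", "0.5"))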

Running the Server

uvx databricks-mcp-server@latest

Tip: pass --refresh to uv (e.g., uvx --refresh databricks-mcp-server@latest) to force uv to resolve the latest PyPI release after publishing.

To adjust logging:

uvx databricks-mcp-server@latest -- --log-level DEBUG

Integrating with MCP Clients

Cursor

{
  "mcpServers": {
    "databricks-mcp-local": {
      "command": "uvx",
      "args": ["databricks-mcp-server@latest"],
      "env": {
        "DATABRICKS_HOST": "https://your-workspace.databricks.com",
        "DATABRICKS_TOKEN": "dapiXXXXXXXXXXXXXXXX",
        "DATABRICKS_WAREHOUSE_ID": "sql_warehouse_12345",
        "RUNNING_VIA_CURSOR_MCP": "true"
      }
    }
  }
}

Restart Cursor after saving. Invoke tools as databricks-mcp-local:<tool>.

Claude CLI

claude mcp add databricks-mcp-local \
  -s user \
  -e DATABRICKS_HOST="https://your-workspace.databricks.com" \
  -e DATABRICKS_TOKEN="dapiXXXXXXXXXXXXXXXX" \
  -e DATABRICKS_WAREHOUSE_ID="sql_warehouse_12345" \
  -- uvx databricks-mcp-server@latest

Working with Tool Responses

Structured payloads live in _meta['data']; large resources are referenced in _meta['resources'].
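
The snippets in this section assume an initialized MCP client session. A minimal way to obtain one over stdio with the mcp Python SDK (credentials are placeholders):

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

server_params = StdioServerParameters(
    command="uvx",
    args=["databricks-mcp-server@latest"],
    env={
        "DATABRICKS_HOST": "https://your-workspace.databricks.com",
        "DATABRICKS_TOKEN": "dapiXXXXXXXXXXXXXXXX",
        "DATABRICKS_WAREHOUSE_ID": "sql_warehouse_12345",
    },
)

async def main() -> None:
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # ... use `session` exactly as in the snippets below ...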

result = await session.call_tool("list_clusters", {})
# Human-readable summary lives in the text content block
summary = next((block.text for block in result.content if getattr(block, "type", "") == "text"), "")
# Machine-readable payload and cached resource references live under _meta
clusters = (result.meta or {}).get("data", {}).get("clusters", [])
resources = (result.meta or {}).get("resources", [])

Example – SQL Query

result = await session.call_tool("execute_sql", {"statement": "SELECT * FROM samples LIMIT 10"})
print(result.content[0].text)
rows = (result.meta or {}).get("data", {}).get("result", [])

Example – Workspace File Export

result = await session.call_tool("get_workspace_file_content", {
    "path": "/Users/user@domain.com/report.ipynb",
    "format": "SOURCE"
})
resource_uri = (result.meta or {}).get("resources", [{}])[0].get("uri")
if resource_uri:
    contents = await session.read_resource(resource_uri)

Available Tools

  • Clusters – list_clusters, create_cluster, terminate_cluster, get_cluster, start_cluster – Manage interactive clusters
  • Jobs – list_jobs, create_job, run_job, run_notebook, sync_repo_and_run_notebook, get_run_status, list_job_runs, cancel_run – Manage scheduled and ad-hoc jobs
  • Workspace – list_notebooks, export_notebook, import_notebook, delete_workspace_object, get_workspace_file_content, get_workspace_file_info – Inspect and manage workspace assets
  • DBFS – list_files, dbfs_put, dbfs_delete – Explore DBFS and manage files
  • SQL – execute_sql – Submit SQL statements with optional warehouse_id, catalog, and schema_name (example below)
  • Libraries – install_library, uninstall_library, list_cluster_libraries – Manage cluster libraries
  • Repos – create_repo, update_repo, list_repos, pull_repo – Manage Databricks repos
  • Unity Catalog – list_catalogs, create_catalog, list_schemas, create_schema, list_tables, create_table, get_table_lineage – Unity Catalog operations
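
For instance, execute_sql accepts the optional parameters noted above; the identifiers below are placeholders:

result = await session.call_tool("execute_sql", {
    "statement": "SELECT COUNT(*) FROM my_table",
    "warehouse_id": "sql_warehouse_12345",  # optional; DATABRICKS_WAREHOUSE_ID is the default
    "catalog": "main",                      # optional
    "schema_name": "default",               # optional
})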

Development Workflow

uv run black databricks_mcp tests     # format
uv run pylint databricks_mcp tests    # lint
uv run pytest                         # run the test suite
uv build                              # build sdist and wheel
uv publish --token "$PYPI_TOKEN"      # upload to PyPI

Testing

uv run pytest

The pytest suites mock the Databricks APIs, so structured responses and transcripts can be validated deterministically without a live workspace.

Publishing Builds

Ensure PYPI_TOKEN is available (via .env or environment) before publishing:

uv build
uv publish --token "$PYPI_TOKEN"

Support & Contact

License

Released under the MIT License. See LICENSE.

Download files

Download the file for your platform.

Source Distribution

databricks_mcp_server-0.4.1.tar.gz (37.7 kB)

Built Distribution

databricks_mcp_server-0.4.1-py3-none-any.whl (32.9 kB)

File details

Hashes for databricks_mcp_server-0.4.1.tar.gz:

SHA256: 9f5fc23092a07548d7b4094f33a612f7beae9b1c1c03f78664de8f42388895f3
MD5: a7e33bed3992c7c69fb42977e1aecd2b
BLAKE2b-256: 84465fc1c3ad3d0804120e1caae68918b56d9c2b8d91e37b63a4f8f12bc963bc

File details

Hashes for databricks_mcp_server-0.4.1-py3-none-any.whl:

SHA256: f6f7c5cb85a96f35c90090c0b67a88eca955853f92f3611fe128942a76c9bb7e
MD5: 346ed3eea47e591c2c60af46e2b4ca3d
BLAKE2b-256: 3b6f2008e2963b3abde9473bcfd87ffbb7588f5a751b424ca45016db30605fb9
