
🤖 Built by Markov

When AI changes everything, you start from scratch.

Markov specializes in cutting-edge AI solutions and automation. From neural ledgers to MCP servers,
we're building the tools that power the next generation of AI-driven applications.

💼 We're always hiring exceptional engineers! Join us in shaping the future of AI.

๐ŸŒ Visit markov.bot โ€ข โœ‰๏ธ Get in Touch โ€ข ๐Ÿš€ Careers


Databricks MCP Server

A Model Context Protocol (MCP) server for Databricks that exposes Databricks functionality via MCP, allowing LLM-powered tools to interact with Databricks clusters, jobs, notebooks, and more.

Version 0.4.0 - Structured MCP responses, resource caching, and resilience upgrades.

🚀 One-Click Install

For Cursor Users

Click this link to install instantly, or copy and paste the deeplink into your browser:

cursor://anysphere.cursor-deeplink/mcp/install?name=databricks-mcp&config=eyJjb21tYW5kIjoidXZ4IiwiYXJncyI6WyJkYXRhYnJpY2tzLW1jcC1zZXJ2ZXIiXSwiZW52Ijp7IkRBVEFCUklDS1NfSE9TVCI6IiR7REFUQUJSSUNLU19IT1NUfSIsIkRBVEFCUklDS1NfVE9LRU4iOiIke0RBVEFCUklDS1NfVE9LRU59IiwiREFUQUJSSUNLU19XQVJFSE9VU0VfSUQiOiIke0RBVEFCUklDS1NfV0FSRUhPVVNFX0lEfSJ9fQ==
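
For reference, the config parameter is just base64-encoded JSON; decoded, it reads:

{
  "command": "uvx",
  "args": ["databricks-mcp-server"],
  "env": {
    "DATABRICKS_HOST": "${DATABRICKS_HOST}",
    "DATABRICKS_TOKEN": "${DATABRICKS_TOKEN}",
    "DATABRICKS_WAREHOUSE_ID": "${DATABRICKS_WAREHOUSE_ID}"
  }
}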

→ Install Databricks MCP in Cursor ←

This project is maintained by Olivier Debeuf De Rijcker (olivier@markov.bot).

Credit for the initial version goes to @JustTryAI.

Features

  • MCP Protocol Support: Implements the MCP protocol to allow LLMs to interact with Databricks
  • Databricks API Integration: Provides access to Databricks REST API functionality
  • Tool Registration: Exposes Databricks functionality as MCP tools
  • Async Support: Built with asyncio for efficient operation

Available Tools

The Databricks MCP Server exposes the following tools:

Cluster Management

  • list_clusters: List all Databricks clusters
  • create_cluster: Create a new Databricks cluster
  • terminate_cluster: Terminate a Databricks cluster
  • get_cluster: Get information about a specific Databricks cluster
  • start_cluster: Start a terminated Databricks cluster
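
For example, a minimal sketch of creating a cluster from an MCP client (session is an MCP ClientSession; argument names are modeled on the Databricks Clusters API and may differ from the tool's actual input schema):

# Hypothetical argument names; consult the tool's input schema.
await session.call_tool("create_cluster", {
    "cluster_name": "dev-cluster",
    "spark_version": "15.4.x-scala2.12",
    "node_type_id": "i3.xlarge",
    "num_workers": 2
})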

Job Management

  • list_jobs: List all Databricks jobs
  • run_job: Run a Databricks job
  • run_notebook: Submit and wait for a one-time notebook run
  • create_job: Create a new Databricks job
  • delete_job: Delete a Databricks job
  • get_run_status: Get status information for a job run
  • list_job_runs: List recent runs for a job
  • cancel_run: Cancel a running job
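
A hedged sketch of starting a run and polling its status (job and run IDs are placeholders; exact parameter names come from each tool's input schema):

# Hypothetical argument names; consult each tool's input schema.
run = await session.call_tool("run_job", {"job_id": 123})
status = await session.call_tool("get_run_status", {"run_id": 456})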

Workspace Files

  • list_notebooks: List notebooks in a workspace directory
  • export_notebook: Export a notebook from the workspace
  • import_notebook: Import a notebook into the workspace
  • delete_workspace_object: Delete a notebook or directory
  • get_workspace_file_content: Retrieve content of any workspace file (JSON, notebooks, scripts, etc.)
  • get_workspace_file_info: Get metadata about workspace files

File System

  • list_files: List files and directories in a DBFS path
  • dbfs_put: Upload a small file to DBFS
  • dbfs_delete: Delete a DBFS file or directory

Cluster Libraries

  • install_library: Install libraries on a cluster
  • uninstall_library: Remove libraries from a cluster
  • list_cluster_libraries: Check installed libraries on a cluster
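
For instance, a sketch of installing a PyPI package on a cluster, assuming the tool accepts a payload shaped like the Databricks Libraries API:

# Hypothetical argument names; the library spec mirrors the Databricks REST API.
await session.call_tool("install_library", {
    "cluster_id": "abc-123",
    "libraries": [{"pypi": {"package": "pandas==2.2.2"}}]
})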

Repos

  • create_repo: Clone a Git repository
  • update_repo: Update an existing repo
  • list_repos: List repos in the workspace
  • pull_repo: Pull the latest commit for a Databricks repo

Unity Catalog

  • list_catalogs: List catalogs
  • create_catalog: Create a catalog
  • list_schemas: List schemas in a catalog
  • create_schema: Create a schema
  • list_tables: List tables in a schema
  • create_table: Execute a CREATE TABLE statement
  • get_table_lineage: Fetch lineage information for a table

Composite

  • sync_repo_and_run_notebook: Pull a repo and execute a notebook in one call

SQL Execution

  • execute_sql: Execute a SQL statement (optional warehouse_id, catalog, schema_name)

🎉 Recent Updates

Structured Output Refresh (current)

  • ✅ Typed MCP Schemas: Tools expose precise input schemas using FastMCP's metadata (no { "params": ... } envelope).
  • ✅ Structured Results: Each tool now returns CallToolResult with a concise text summary and the full Databricks payload in _meta['data'] (see the client-side sketch after this list).
  • ✅ Resource URIs for Large Payloads: Notebook/workspace exports stash resource://databricks/exports/{id} entries in _meta['resources'] instead of embedding large blobs.
  • ✅ Resilience Improvements: Per-tool concurrency limits, timeouts, and retry-with-backoff for transient Databricks errors.
  • ✅ Progress & Telemetry: Tools publish MCP progress notifications and surface per-tool success/error counters for easier observability.
  • ✅ Correlation IDs: All API requests and tool responses carry _meta._request_id for traceability.
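
A minimal client-side sketch of consuming these structured results, assuming the official mcp Python SDK (which surfaces the wire-level _meta field as .meta on results; session setup is shown under Usage Examples below):

result = await session.call_tool("list_jobs", {})
summary = result.content[0].text        # concise text summary
meta = result.meta or {}                # wire-level _meta, per the SDK's field alias
payload = meta.get("data")              # full Databricks payload
request_id = meta.get("_request_id")    # correlation ID for tracing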

v0.3.0 Highlights

  • ✅ Repository Management: Pull latest commits from Databricks repos with pull_repo.
  • ✅ One-time Notebook Execution: Submit and wait for notebook runs with run_notebook.
  • ✅ Composite Operations: Combined repo sync + notebook execution with sync_repo_and_run_notebook.
  • ✅ Enhanced Job Management: Extended job APIs with submit, status checking, and run management.

Previous Updates:

  • v0.2.1: Enhanced Codespaces support, documentation improvements, publishing process streamlining
  • v0.2.0: Major package refactoring from src/ to databricks_mcp/ structure

Backwards Compatibility: This release is a breaking change. Tools now require flat arguments and emit structured responses; update custom clients accordingly.

Installation

Quick Install (Recommended)

Use the link above to install with one click:

→ Install Databricks MCP in Cursor ←

This will automatically install the MCP server using uvx and configure it in Cursor. You'll need to set these environment variables:

  • DATABRICKS_HOST - Your Databricks workspace URL
  • DATABRICKS_TOKEN - Your Databricks personal access token
  • DATABRICKS_WAREHOUSE_ID - (Optional) Your default SQL warehouse ID

Manual Installation

Prerequisites

  • Python 3.10 or higher
  • uv package manager (recommended for MCP servers)

Setup

  1. Install uv if you don't have it already:

    # macOS/Linux
    curl -LsSf https://astral.sh/uv/install.sh | sh
    
    # Windows (in PowerShell)
    irm https://astral.sh/uv/install.ps1 | iex
    

    Restart your terminal after installation.

  2. Clone the repository:

    git clone https://github.com/markov-kernel/databricks-mcp.git
    cd databricks-mcp
    
  3. Create a virtual environment (optional) and install dependencies for local development:

    # Create and activate virtual environment
    uv venv
    
    # On Windows
    .\.venv\Scripts\activate
    
    # On Linux/Mac
    source .venv/bin/activate
    
    # Install dependencies in development mode
    uv pip install -e .
    
    # Install development dependencies
    uv pip install -e ".[dev]"
    
  4. Set up environment variables:

    # Required variables
    # Windows
    set DATABRICKS_HOST=https://your-databricks-instance.azuredatabricks.net
    set DATABRICKS_TOKEN=your-personal-access-token
    
    # Linux/Mac
    export DATABRICKS_HOST=https://your-databricks-instance.azuredatabricks.net
    export DATABRICKS_TOKEN=your-personal-access-token
    
    # Optional: Set default SQL warehouse (makes warehouse_id optional in execute_sql)
    export DATABRICKS_WAREHOUSE_ID=sql_warehouse_12345
    

    You can also create a .env file based on the .env.example template.
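
    For example, a minimal .env along those lines (variable names are taken from the exports above; .env.example remains the authoritative template):

    DATABRICKS_HOST=https://your-databricks-instance.azuredatabricks.net
    DATABRICKS_TOKEN=your-personal-access-token
    DATABRICKS_WAREHOUSE_ID=sql_warehouse_12345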

Running the MCP Server

Standalone

To start the MCP server directly for testing or development, run:

uvx databricks-mcp-server@latest

Pass --log-level DEBUG or other options using standard CLI flags:

uvx databricks-mcp-server@latest -- --log-level DEBUG

Integrating with AI Clients

To use this server with AI clients like Cursor or Claude CLI, you need to register it.

Cursor Setup

  1. Open your global MCP configuration file located at ~/.cursor/mcp.json (create it if it doesn't exist).

  2. Add the following entry within the mcpServers object, replacing placeholders with your actual values:

    {
      "mcpServers": {
        // ... other servers ...
        "databricks-mcp-local": { 
          "command": "uvx",
          "args": ["databricks-mcp-server@latest"],
          "env": {
            "DATABRICKS_HOST": "https://your-databricks-instance.azuredatabricks.net", 
            "DATABRICKS_TOKEN": "dapiXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
            "DATABRICKS_WAREHOUSE_ID": "sql_warehouse_12345",
            "RUNNING_VIA_CURSOR_MCP": "true" 
          }
        }
        // ... other servers ...
      }
    }
    
  3. Replace the DATABRICKS_HOST and DATABRICKS_TOKEN values with your credentials, then restart Cursor.

  4. You can now invoke tools using databricks-mcp-local:<tool_name> (e.g., databricks-mcp-local:list_jobs).

Claude CLI Setup

  1. Use the claude mcp add command to register the server. Provide your credentials using the -e flag for environment variables and point the command to uvx databricks-mcp-server@latest:

    claude mcp add databricks-mcp-local \
      -s user \
      -e DATABRICKS_HOST="https://your-databricks-instance.azuredatabricks.net" \
      -e DATABRICKS_TOKEN="dapiXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX" \
      -e DATABRICKS_WAREHOUSE_ID="sql_warehouse_12345" \
      -- uvx databricks-mcp-server@latest
    
  2. Replace the DATABRICKS_HOST and DATABRICKS_TOKEN values with your credentials.

  3. You can now invoke tools using databricks-mcp-local:<tool_name> in your Claude interactions.

Usage Examples
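
The snippets below assume an open MCP ClientSession named session. A minimal sketch of obtaining one with the official mcp Python SDK (environment values and the example tool call are illustrative):

import asyncio
import os

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    # Launch the server over stdio, forwarding Databricks credentials.
    server = StdioServerParameters(
        command="uvx",
        args=["databricks-mcp-server@latest"],
        env={
            "DATABRICKS_HOST": os.environ["DATABRICKS_HOST"],
            "DATABRICKS_TOKEN": os.environ["DATABRICKS_TOKEN"],
        },
    )
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # The usage examples below run inside this block.
            result = await session.call_tool("list_clusters", {})
            print(result.content[0].text)

asyncio.run(main())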

SQL Execution with Default Warehouse

# With DATABRICKS_WAREHOUSE_ID set, warehouse_id is optional
await session.call_tool("execute_sql", {
    "statement": "SELECT * FROM my_table LIMIT 10"
})

# You can still override the default warehouse
await session.call_tool("execute_sql", {
    "statement": "SELECT * FROM my_table LIMIT 10",
    "warehouse_id": "sql_warehouse_specific"
})

Workspace File Content Retrieval

# Get JSON file content from workspace
await session.call_tool("get_workspace_file_content", {
    "path": "/Users/user@domain.com/config/settings.json"
})

# Get notebook content in Jupyter format
await session.call_tool("get_workspace_file_content", {
    "path": "/Users/user@domain.com/my_notebook",
    "format": "JUPYTER"
})

# Get file metadata without downloading content
await session.call_tool("get_workspace_file_info", {
    "path": "/Users/user@domain.com/large_file.py"
})

Repo Sync and Notebook Execution

await session.call_tool("sync_repo_and_run_notebook", {
    "repo_id": 123,
    "notebook_path": "/Repos/user/project/run_me"
})

Create Nightly ETL Job

job_conf = {
    "name": "Nightly ETL",
    "tasks": [
        {
            "task_key": "etl",
            "notebook_task": {"notebook_path": "/Repos/me/etl.py"},
            "existing_cluster_id": "abc-123"
        }
    ]
}
await session.call_tool("create_job", job_conf)

Project Structure

databricks-mcp/
├── AGENTS.md                        # Contributor guidelines (agents/LLM focus)
├── ARCHITECTURE.md                  # Deep architecture walkthrough
├── databricks_mcp/                  # Main package
│   ├── __init__.py                  # Package initialization
│   ├── __main__.py                  # Run via `python -m databricks_mcp`
│   ├── main.py                      # CLI/stdio launcher
│   ├── api/                         # Databricks API clients
│   │   ├── clusters.py              # Cluster management
│   │   ├── jobs.py                  # Job management
│   │   ├── notebooks.py             # Notebook operations
│   │   ├── sql.py                   # SQL execution
│   │   └── dbfs.py                  # DBFS operations
│   ├── core/                        # Core functionality
│   │   ├── auth.py                  # Authentication helpers
│   │   ├── config.py                # Settings and env loading
│   │   ├── logging_utils.py         # Centralized logging
│   │   └── utils.py                 # HTTP utilities & error helpers
│   ├── server/                      # MCP server implementation
│   │   ├── __main__.py              # Server entry point
│   │   ├── databricks_mcp_server.py # Main MCP server class
│   │   └── tool_helpers.py          # Shared response builders
│   └── cli/                         # Command-line interface
│       └── commands.py              # CLI commands
├── tests/                           # Test directory
│   ├── test_clusters.py             # Cluster tests
│   ├── test_mcp_server.py           # Server tests
│   └── test_*.py                    # Other test files
├── README.md                        # Project overview (this file)
├── TODO.md                          # Active refactor checklist
├── pyproject.toml                   # Package metadata
├── uv.lock                          # Dependency lock file
└── .gitignore                       # Git ignore rules

Development

Documentation

  • ARCHITECTURE.md: End-to-end component overview, resource flow, and integration details.
  • AGENTS.md: Contributor guidelines and MCP agent conventions.

Cross-Platform Notes

  • uvx databricks-mcp-server@latest works on macOS, Linux, and Windows (PowerShell) without per-platform scripts.
  • Tests run portably with uv run pytest; no shell-specific harnesses remain.
  • Progress notifications and structured outputs follow the MCP spec, so clients on any OS receive the same responses.

Code Standards

  • Python code follows PEP 8 style guide with a maximum line length of 100 characters
  • Use 4 spaces for indentation (no tabs)
  • Use double quotes for strings
  • All classes, methods, and functions should have Google-style docstrings
  • Type hints are required for all code except tests

Linting

The project uses the following linting tools:

# Run all linters
uv run pylint databricks_mcp/ tests/
uv run flake8 databricks_mcp/ tests/
uv run mypy databricks_mcp/

Testing

The project uses pytest for testing. To run the tests:

# Run all tests with the convenience PowerShell script (Windows)
.\scripts\run_tests.ps1

# Run with coverage report
.\scripts\run_tests.ps1 -Coverage

# Run specific tests with verbose output
.\scripts\run_tests.ps1 -Verbose -Coverage tests/test_clusters.py

You can also run the tests directly with pytest:

# Run all tests
uv run pytest tests/

# Run with coverage report
uv run pytest --cov=databricks_mcp tests/ --cov-report=term-missing

The project targets a minimum of 80% code coverage.

Documentation

  • API documentation is generated using Sphinx and can be found in the docs/api directory
  • All code includes Google-style docstrings
  • See the examples/ directory for usage examples

Examples

Check the examples/ directory for usage examples. To run examples:

# Run example scripts with uv
uv run examples/direct_usage.py
uv run examples/mcp_client_usage.py

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Ensure your code follows the project's coding standards
  2. Add tests for any new functionality
  3. Update documentation as necessary
  4. Verify all tests pass before submitting

License

This project is licensed under the MIT License - see the LICENSE file for details.

