A Model Context Protocol (MCP) server for Databricks

🤖 Built by Markov

When AI changes everything, you start from scratch.

Markov specializes in cutting-edge AI solutions and automation. From neural ledgers to MCP servers,
we're building the tools that power the next generation of AI-driven applications.

💼 We're always hiring exceptional engineers! Join us in shaping the future of AI.

🌐 Visit markov.bot • ✉️ Get in Touch • 🚀 Careers


Databricks MCP Server

A Model Context Protocol (MCP) server for Databricks that exposes Databricks functionality through MCP. This allows LLM-powered tools to interact with Databricks clusters, jobs, notebooks, and more.

Version 0.2.0 - Latest release with improved package structure and organization.

🚀 One-Click Install

For Cursor Users

Click this link to install instantly:

cursor://anysphere.cursor-deeplink/mcp/install?name=databricks-mcp&config=eyJjb21tYW5kIjoidXZ4IiwiYXJncyI6WyJkYXRhYnJpY2tzLW1jcC1zZXJ2ZXIiXSwiZW52Ijp7IkRBVEFCUklDS1NfSE9TVCI6IiR7REFUQUJSSUNLU19IT1NUfSIsIkRBVEFCUklDS1NfVE9LRU4iOiIke0RBVEFCUklDS1NfVE9LRU59IiwiREFUQUJSSUNLU19XQVJFSE9VU0VfSUQiOiIke0RBVEFCUklDS1NfV0FSRUhPVVNFX0lEfSJ9fQ==

→ Install Databricks MCP in Cursor ←
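
The config parameter of the install deeplink is base64-encoded JSON, so it can be inspected before clicking. The following is a minimal sketch using only the Python standard library; the function and variable names are illustrative and not part of this project.

```python
import base64
import json
from urllib.parse import parse_qs, urlsplit

# The deeplink from this README, split across lines for readability.
DEEPLINK = (
    "cursor://anysphere.cursor-deeplink/mcp/install?name=databricks-mcp&config="
    "eyJjb21tYW5kIjoidXZ4IiwiYXJncyI6WyJkYXRhYnJpY2tzLW1jcC1zZXJ2ZXIiXSwi"
    "ZW52Ijp7IkRBVEFCUklDS1NfSE9TVCI6IiR7REFUQUJSSUNLU19IT1NUfSIsIkRBVEFC"
    "UklDS1NfVE9LRU4iOiIke0RBVEFCUklDS1NfVE9LRU59IiwiREFUQUJSSUNLU19XQVJF"
    "SE9VU0VfSUQiOiIke0RBVEFCUklDS1NfV0FSRUhPVVNFX0lEfSJ9fQ=="
)


def decode_deeplink_config(deeplink: str) -> dict:
    """Return the JSON config embedded in an MCP install deeplink."""
    query = parse_qs(urlsplit(deeplink).query)
    encoded = query["config"][0]
    return json.loads(base64.b64decode(encoded))


config = decode_deeplink_config(DEEPLINK)
print(json.dumps(config, indent=2))
```

The decoded object shows the command, arguments, and environment variable placeholders that the link asks Cursor to register.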

This project is maintained by Olivier Debeuf De Rijcker <olivier@markov.bot>.

Credit for the initial version goes to @JustTryAI.

Features

  • MCP Protocol Support: Implements the MCP protocol to allow LLMs to interact with Databricks
  • Databricks API Integration: Provides access to Databricks REST API functionality
  • Tool Registration: Exposes Databricks functionality as MCP tools
  • Async Support: Built with asyncio for efficient operation
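
To make the REST API integration concrete, here is a hedged sketch of the kind of authenticated request such a tool would issue. The /api/2.0/clusters/list path and bearer-token header come from the public Databricks REST API; whether this server calls exactly this endpoint internally is an assumption, and the function name is hypothetical. The request is only constructed, not sent.

```python
import urllib.request


def build_clusters_list_request(host: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request that lists clusters.

    Args:
        host: Workspace URL, for example the value of DATABRICKS_HOST.
        token: Personal access token, for example the value of DATABRICKS_TOKEN.

    Returns:
        A prepared urllib request carrying a bearer-token Authorization header.
    """
    url = host.rstrip("/") + "/api/2.0/clusters/list"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


request = build_clusters_list_request(
    "https://your-databricks-instance.azuredatabricks.net/", "your-personal-access-token"
)
print(request.get_method(), request.full_url)
```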

Available Tools

The Databricks MCP Server exposes the following tools:

Cluster Management

  • list_clusters: List all Databricks clusters
  • create_cluster: Create a new Databricks cluster
  • terminate_cluster: Terminate a Databricks cluster
  • get_cluster: Get information about a specific Databricks cluster
  • start_cluster: Start a terminated Databricks cluster

Job Management

  • list_jobs: List all Databricks jobs
  • run_job: Run a Databricks job

Workspace Files

  • list_notebooks: List notebooks in a workspace directory
  • export_notebook: Export a notebook from the workspace
  • get_workspace_file_content: Retrieve content of any workspace file (JSON, notebooks, scripts, etc.)
  • get_workspace_file_info: Get metadata about workspace files

File System

  • list_files: List files and directories in a DBFS path

SQL Execution

  • execute_sql: Execute a SQL statement (warehouse_id optional if DATABRICKS_WAREHOUSE_ID env var is set)

🎉 Recent Updates (v0.2.0)

Major Package Refactoring:

  • ✅ Cleaner imports: Package renamed from src to databricks_mcp for better clarity
  • ✅ Organized structure: Documentation and scripts moved to dedicated directories
  • ✅ Simplified root: Cleaner project root with better organization
  • ✅ Same functionality: All features work exactly the same, just with better structure

Backwards Compatibility: All MCP tools and functionality remain unchanged. Only the internal package structure has been improved.

Installation

Quick Install (Recommended)

Use the link above to install with one click:

→ Install Databricks MCP in Cursor ←

This will automatically install the MCP server using uvx and configure it in Cursor. You'll need to set these environment variables:

  • DATABRICKS_HOST - Your Databricks workspace URL
  • DATABRICKS_TOKEN - Your Databricks personal access token
  • DATABRICKS_WAREHOUSE_ID - (Optional) Your default SQL warehouse ID
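
Before launching the server it can help to confirm that the required variables are present. The check below is a small sketch, not part of the project; the function name is hypothetical, and the https:// test is an assumption based on the example URLs in this README.

```python
import os
from typing import Mapping

# Variables this README lists as required; DATABRICKS_WAREHOUSE_ID is optional.
REQUIRED_VARS = ("DATABRICKS_HOST", "DATABRICKS_TOKEN")


def find_env_problems(environ: Mapping[str, str]) -> list[str]:
    """Return human-readable problems found in the Databricks settings."""
    problems = []
    for name in REQUIRED_VARS:
        if not environ.get(name, "").strip():
            problems.append(f"{name} is not set")
    host = environ.get("DATABRICKS_HOST", "").strip()
    # Assumption: the host should be a full workspace URL, as in the examples.
    if host and not host.startswith("https://"):
        problems.append("DATABRICKS_HOST should be a full https:// workspace URL")
    return problems


for problem in find_env_problems(os.environ):
    print("warning:", problem)
```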

Manual Installation

Prerequisites

  • Python 3.10 or higher
  • uv package manager (recommended for MCP servers)

Setup

  1. Install uv if you don't have it already:

    # MacOS/Linux
    curl -LsSf https://astral.sh/uv/install.sh | sh
    
    # Windows (in PowerShell)
    irm https://astral.sh/uv/install.ps1 | iex
    

    Restart your terminal after installation.

  2. Clone the repository:

    git clone https://github.com/markov-kernel/databricks-mcp.git
    cd databricks-mcp
    
  3. Run the setup script:

    # Linux/Mac
    ./scripts/setup.sh
    
    # Windows (PowerShell)
    .\scripts\setup.ps1
    

    The setup script will:

    • Install uv if not already installed
    • Create a virtual environment
    • Install all project dependencies
    • Verify the installation works

    Alternative manual setup:

    # Create and activate virtual environment
    uv venv
    
    # On Windows
    .\.venv\Scripts\activate
    
    # On Linux/Mac
    source .venv/bin/activate
    
    # Install dependencies in development mode
    uv pip install -e .
    
    # Install development dependencies
    uv pip install -e ".[dev]"
    
  4. Set up environment variables:

    # Required variables
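    # Windows (Command Prompt)
    set DATABRICKS_HOST=https://your-databricks-instance.azuredatabricks.net
    set DATABRICKS_TOKEN=your-personal-access-token
    
    # Windows (PowerShell)
    $env:DATABRICKS_HOST = "https://your-databricks-instance.azuredatabricks.net"
    $env:DATABRICKS_TOKEN = "your-personal-access-token"
    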
    # Linux/Mac
    export DATABRICKS_HOST=https://your-databricks-instance.azuredatabricks.net
    export DATABRICKS_TOKEN=your-personal-access-token
    
    # Optional: Set default SQL warehouse (makes warehouse_id optional in execute_sql)
    export DATABRICKS_WAREHOUSE_ID=sql_warehouse_12345
    

    You can also create an .env file based on the .env.example template.
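
For illustration, a .env file is a list of KEY=VALUE lines. The sketch below parses that shape with the standard library only. The keys mirror the variables named in this README; the actual contents of .env.example, and how the project itself loads the file, are not shown here, so treat both as assumptions.

```python
def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


EXAMPLE_ENV = """
# Databricks credentials (placeholder values)
DATABRICKS_HOST=https://your-databricks-instance.azuredatabricks.net
DATABRICKS_TOKEN=your-personal-access-token
DATABRICKS_WAREHOUSE_ID=sql_warehouse_12345
"""

print(parse_env_text(EXAMPLE_ENV))
```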

Running the MCP Server

Standalone

To start the MCP server directly for testing or development, run:

# Activate your virtual environment if not already active
source .venv/bin/activate 

# Run the start script (handles finding env vars from .env if needed)
./scripts/start_mcp_server.sh

This is useful for seeing direct output and logs.

Integrating with AI Clients

To use this server with AI clients like Cursor or Claude CLI, you need to register it.

Cursor Setup

  1. Open your global MCP configuration file located at ~/.cursor/mcp.json (create it if it doesn't exist).

  2. Add the following entry within the mcpServers object, replacing placeholders with your actual values and ensuring the path to start_mcp_server.sh is correct:

    {
      "mcpServers": {
        "databricks-mcp-local": {
          "command": "/absolute/path/to/your/project/databricks-mcp-server/scripts/start_mcp_server.sh",
          "args": [],
          "env": {
            "DATABRICKS_HOST": "https://your-databricks-instance.azuredatabricks.net",
            "DATABRICKS_TOKEN": "dapiXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
            "DATABRICKS_WAREHOUSE_ID": "sql_warehouse_12345",
            "RUNNING_VIA_CURSOR_MCP": "true"
          }
        }
      }
    }
    
  3. Important: Replace /absolute/path/to/your/project/databricks-mcp-server/ with the actual absolute path to this project directory on your machine.

  4. Replace the DATABRICKS_HOST and DATABRICKS_TOKEN values with your credentials.

  5. Save the file and restart Cursor.

  6. You can now invoke tools using databricks-mcp-local:<tool_name> (e.g., databricks-mcp-local:list_jobs).
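
The manual edit above can also be scripted. This is a sketch under the assumption that mcp.json is plain JSON without comments; the helper names are hypothetical, and nothing is written until update_cursor_config is called explicitly.

```python
import json
from pathlib import Path


def add_mcp_server(config: dict, name: str, command: str, env: dict[str, str]) -> dict:
    """Return a copy of config with one server entry added under mcpServers."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = {"command": command, "args": [], "env": dict(env)}
    merged["mcpServers"] = servers
    return merged


def update_cursor_config(path: Path, name: str, command: str, env: dict[str, str]) -> None:
    """Read the config file if it exists, add the entry, and write it back."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    updated = add_mcp_server(existing, name, command, env)
    path.write_text(json.dumps(updated, indent=2) + "\n")
```

For example, update_cursor_config(Path.home() / ".cursor" / "mcp.json", "databricks-mcp-local", "/absolute/path/to/start_mcp_server.sh", {...}) keeps any servers that are already registered.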

Claude CLI Setup

  1. Use the claude mcp add command to register the server. Provide your credentials using the -e flag for environment variables and point the command to the start_mcp_server.sh script using -- followed by the absolute path:

    claude mcp add databricks-mcp-local \
      -s user \
      -e DATABRICKS_HOST="https://your-databricks-instance.azuredatabricks.net" \
      -e DATABRICKS_TOKEN="dapiXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX" \
      -e DATABRICKS_WAREHOUSE_ID="sql_warehouse_12345" \
      -- /absolute/path/to/your/project/databricks-mcp-server/scripts/start_mcp_server.sh
    
  2. Important: Replace /absolute/path/to/your/project/databricks-mcp-server/ with the actual absolute path to this project directory on your machine.

  3. Replace the DATABRICKS_HOST and DATABRICKS_TOKEN values with your credentials.

  4. You can now invoke tools using databricks-mcp-local:<tool_name> in your Claude interactions.
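
If you register the server from a script, it is safer to build the argument list programmatically than to concatenate a shell string. The sketch below assembles the same claude mcp add invocation shown above; the helper name is hypothetical and the command is only printed, not executed.

```python
import shlex


def build_claude_mcp_add(
    name: str, script_path: str, env: dict[str, str], scope: str = "user"
) -> list[str]:
    """Assemble the argv for `claude mcp add` using the flags shown in this README."""
    argv = ["claude", "mcp", "add", name, "-s", scope]
    for key, value in env.items():
        argv += ["-e", f"{key}={value}"]
    # Everything after "--" is the command that starts the server.
    argv += ["--", script_path]
    return argv


argv = build_claude_mcp_add(
    "databricks-mcp-local",
    "/absolute/path/to/start_mcp_server.sh",
    {
        "DATABRICKS_HOST": "https://your-databricks-instance.azuredatabricks.net",
        "DATABRICKS_TOKEN": "dapiXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
    },
)
print(shlex.join(argv))
```

The resulting list can be passed to subprocess.run(argv, check=True) once the placeholder values are replaced.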

Querying Databricks Resources

The repository includes utility scripts to quickly view Databricks resources:

# View all clusters
uv run scripts/show_clusters.py

# View all notebooks
uv run scripts/show_notebooks.py

Usage Examples

SQL Execution with Default Warehouse

# With DATABRICKS_WAREHOUSE_ID set, warehouse_id is optional
await session.call_tool("execute_sql", {
    "statement": "SELECT * FROM my_table LIMIT 10"
})

# You can still override the default warehouse
await session.call_tool("execute_sql", {
    "statement": "SELECT * FROM my_table LIMIT 10",
    "warehouse_id": "sql_warehouse_specific"
})
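
The two calls above imply a simple precedence rule: an explicit warehouse_id wins, otherwise DATABRICKS_WAREHOUSE_ID is used. The following sketch expresses that rule; it is an illustration of the documented behaviour, not the server's actual code, and the function name and error message are assumptions.

```python
import os
from typing import Mapping, Optional


def resolve_warehouse_id(
    explicit: Optional[str], environ: Mapping[str, str] = os.environ
) -> str:
    """Pick the SQL warehouse: explicit argument first, then the env default."""
    warehouse_id = explicit or environ.get("DATABRICKS_WAREHOUSE_ID")
    if not warehouse_id:
        raise ValueError(
            "warehouse_id was not provided and DATABRICKS_WAREHOUSE_ID is not set"
        )
    return warehouse_id
```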

Workspace File Content Retrieval

# Get JSON file content from workspace
await session.call_tool("get_workspace_file_content", {
    "workspace_path": "/Users/user@domain.com/config/settings.json"
})

# Get notebook content in Jupyter format
await session.call_tool("get_workspace_file_content", {
    "workspace_path": "/Users/user@domain.com/my_notebook",
    "format": "JUPYTER"
})

# Get file metadata without downloading content
await session.call_tool("get_workspace_file_info", {
    "workspace_path": "/Users/user@domain.com/large_file.py"
})
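
The snippets above assume an existing session object. One way to obtain it is the stdio client from the MCP Python SDK (the third-party mcp package, which must be installed separately). The sketch below launches the server with uvx, as the one-click install does; the helper names are hypothetical, and the SDK import is deferred so that the module loads without the package.

```python
from typing import Mapping


def build_launch_config(environ: Mapping[str, str]) -> dict:
    """Build the uvx launch settings, forwarding only the Databricks variables."""
    env = {
        "DATABRICKS_HOST": environ["DATABRICKS_HOST"],
        "DATABRICKS_TOKEN": environ["DATABRICKS_TOKEN"],
    }
    if environ.get("DATABRICKS_WAREHOUSE_ID"):
        env["DATABRICKS_WAREHOUSE_ID"] = environ["DATABRICKS_WAREHOUSE_ID"]
    return {"command": "uvx", "args": ["databricks-mcp-server"], "env": env}


async def list_tool_names(environ: Mapping[str, str]) -> list[str]:
    """Start the server over stdio and return the names of the tools it exposes."""
    # Imported lazily: requires the MCP Python SDK (`mcp` package).
    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    params = StdioServerParameters(**build_launch_config(environ))
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.list_tools()
            return [tool.name for tool in result.tools]
```

With valid credentials in the environment, asyncio.run(list_tool_names(os.environ)) should return the tool names listed under Available Tools; session.call_tool can be used inside the same ClientSession block.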

Project Structure

databricks-mcp/
├── databricks_mcp/                  # Main package (renamed from src/)
│   ├── __init__.py                  # Package initialization
│   ├── __main__.py                  # Main entry point for the package
│   ├── main.py                      # Entry point for the MCP server
│   ├── api/                         # Databricks API clients
│   │   ├── clusters.py              # Cluster management
│   │   ├── jobs.py                  # Job management
│   │   ├── notebooks.py             # Notebook operations
│   │   ├── sql.py                   # SQL execution
│   │   └── dbfs.py                  # DBFS operations
│   ├── core/                        # Core functionality
│   │   ├── config.py                # Configuration management
│   │   ├── auth.py                  # Authentication
│   │   └── utils.py                 # Utilities
│   ├── server/                      # Server implementation
│   │   ├── __main__.py              # Server entry point
│   │   ├── databricks_mcp_server.py # Main MCP server
│   │   └── app.py                   # FastAPI app for tests
│   └── cli/                         # Command-line interface
│       └── commands.py              # CLI commands
├── tests/                           # Test directory
│   ├── test_clusters.py             # Cluster tests
│   ├── test_mcp_server.py           # Server tests
│   └── test_*.py                    # Other test files
├── scripts/                         # Helper scripts (organized)
│   ├── start_mcp_server.ps1         # Server startup script (Windows)
│   ├── start_mcp_server.sh          # Server startup script (Unix)
│   ├── run_tests.ps1                # Test runner script (Windows)
│   ├── run_tests.sh                 # Test runner script (Unix)
│   ├── setup.ps1                    # Setup script (Windows)
│   ├── setup.sh                     # Setup script (Unix)
│   ├── show_clusters.py             # Script to show clusters
│   ├── show_notebooks.py            # Script to show notebooks
│   ├── setup_codespaces.sh          # Codespaces setup
│   └── test_setup_local.sh          # Local test setup
├── examples/                        # Example usage
│   ├── direct_usage.py              # Direct usage examples
│   └── mcp_client_usage.py          # MCP client examples
├── docs/                            # Documentation (organized)
│   ├── AGENTS.md                    # Agent documentation
│   ├── project_structure.md         # Detailed structure docs
│   ├── new_features.md              # Feature documentation
│   └── phase1.md                    # Development phases
├── .gitignore                       # Git ignore rules
├── .cursor.json                     # Cursor configuration
├── pyproject.toml                   # Package configuration
├── uv.lock                          # Dependency lock file
└── README.md                        # This file

See docs/project_structure.md for a more detailed view of the project structure.

Development

Code Standards

  • Python code follows PEP 8 style guide with a maximum line length of 100 characters
  • Use 4 spaces for indentation (no tabs)
  • Use double quotes for strings
  • All classes, methods, and functions should have Google-style docstrings
  • Type hints are required for all code except tests
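
As an illustration of these standards, the function below uses 4-space indentation, double-quoted strings, full type hints, and a Google-style docstring. The function itself is hypothetical and does not exist in the package.

```python
def summarize_cluster(cluster_id: str, state: str, num_workers: int = 0) -> str:
    """Build a one-line summary for a cluster.

    Args:
        cluster_id: Identifier of the cluster.
        state: Lifecycle state reported for the cluster, such as "RUNNING".
        num_workers: Number of worker nodes attached to the cluster.

    Returns:
        A human-readable summary string.

    Raises:
        ValueError: If num_workers is negative.
    """
    if num_workers < 0:
        raise ValueError("num_workers must not be negative")
    return f"{cluster_id} [{state}] workers={num_workers}"
```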

Linting

The project uses the following linting tools:

# Run all linters
uv run pylint databricks_mcp/ tests/
uv run flake8 databricks_mcp/ tests/
uv run mypy databricks_mcp/

Testing

The project uses pytest for testing. To run the tests:

# Run all tests with the helper script (PowerShell; scripts/run_tests.sh is the Unix counterpart)
.\scripts\run_tests.ps1

# Run with coverage report
.\scripts\run_tests.ps1 -Coverage

# Run specific tests with verbose output
.\scripts\run_tests.ps1 -Verbose -Coverage tests/test_clusters.py

You can also run the tests directly with pytest:

# Run all tests
uv run pytest tests/

# Run with coverage report
uv run pytest --cov=databricks_mcp tests/ --cov-report=term-missing

The project targets a minimum code coverage of 80%.
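
New tests can exercise async code without a live workspace by mocking the API client. The sketch below uses unittest.mock.AsyncMock from the standard library; the function under test, the client object, and the response shape are all hypothetical stand-ins rather than names from this package.

```python
import asyncio
from unittest.mock import AsyncMock


async def count_running_clusters(client) -> int:
    """Hypothetical function under test: count clusters whose state is RUNNING."""
    response = await client.list_clusters()
    return sum(
        1 for cluster in response.get("clusters", []) if cluster.get("state") == "RUNNING"
    )


def test_count_running_clusters() -> None:
    # AsyncMock makes client.list_clusters() awaitable and records the call.
    client = AsyncMock()
    client.list_clusters.return_value = {
        "clusters": [{"state": "RUNNING"}, {"state": "TERMINATED"}]
    }
    assert asyncio.run(count_running_clusters(client)) == 1
    client.list_clusters.assert_awaited_once()
```

pytest collects any function named test_* in the tests/ directory, so a file containing this function would be picked up by uv run pytest tests/.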

Documentation

  • API documentation is generated using Sphinx and can be found in the docs/api directory
  • All code includes Google-style docstrings
  • See the examples/ directory for usage examples

Examples

Check the examples/ directory for usage examples. To run examples:

# Run example scripts with uv
uv run examples/direct_usage.py
uv run examples/mcp_client_usage.py

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Ensure your code follows the project's coding standards
  2. Add tests for any new functionality
  3. Update documentation as necessary
  4. Verify all tests pass before submitting

License

This project is licensed under the MIT License - see the LICENSE file for details.

About

A Model Context Protocol (MCP) server for interacting with Databricks services. Maintained by markov.bot.

Download files

Download the file for your platform.

Source Distribution

databricks_mcp_server-0.2.1.tar.gz (40.4 kB)

Uploaded Source

Built Distribution

databricks_mcp_server-0.2.1-py3-none-any.whl (25.1 kB)

Uploaded Python 3

File details

Details for the file databricks_mcp_server-0.2.1.tar.gz.

File metadata

  • Download URL: databricks_mcp_server-0.2.1.tar.gz
  • Size: 40.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.0

File hashes

Hashes for databricks_mcp_server-0.2.1.tar.gz
Algorithm Hash digest
SHA256 7a06c89d8aa8c736902f0c1b9c4b0245d1e515ceea4ef6bca9b8099108c558df
MD5 2049a2d527cd03d7bc0702655d3044ce
BLAKE2b-256 c5c9b53d144e888a4dbd1f42a7371f9fef68e9c1a9c92adbccea93d374a3ed48
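
A downloaded archive can be checked against the SHA256 digest listed above. This is a minimal sketch using hashlib; the helper names are illustrative, and the path to the downloaded file is something you supply.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for databricks_mcp_server-0.2.1.tar.gz.
EXPECTED_SDIST_SHA256 = "7a06c89d8aa8c736902f0c1b9c4b0245d1e515ceea4ef6bca9b8099108c558df"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: Path, expected: str = EXPECTED_SDIST_SHA256) -> bool:
    """Return True when the file's digest equals the expected value."""
    return sha256_of(path) == expected.lower()
```

For example, matches_expected(Path("databricks_mcp_server-0.2.1.tar.gz")) should be True for an intact download of the source distribution.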

File details

Details for the file databricks_mcp_server-0.2.1-py3-none-any.whl.

File hashes

Hashes for databricks_mcp_server-0.2.1-py3-none-any.whl
Algorithm Hash digest
SHA256 7dda7d19d9352d578f3f769e4190561bf38f48fc9d5434079801114f0510e0c8
MD5 1cfe2f40833fd7538e2e6cd660362f0f
BLAKE2b-256 948b3cd1b6e360edb67b8e8703d925ee5dee952603049cb76020580175d76c74
