MCP server for Databricks with enhanced Genie AI integration - natural language data analysis
Databricks MCP Genie
A Model Context Protocol (MCP) server with enhanced Genie AI integration that provides seamless natural language interaction between AI assistants (like Claude Desktop, Cursor) and Databricks workspaces.
What This Does
Enables AI assistants to directly interact with your Databricks workspace:
- Execute SQL queries and manage warehouses
- Control clusters (create, start, stop, monitor)
- Run jobs and notebooks
- Ask natural language questions with Genie AI
- Manage Unity Catalog (catalogs, schemas, tables)
- Work with DBFS, repos, and libraries
Quick Start
For Cursor Users
Team Installation (Recommended): See Cursor Setup Guide for one-click installation instructions.
Quick install:
pip install databricks-mcp-genie
Then configure in Cursor settings - full details in the Cursor Setup Guide.
Prerequisites
- Python 3.10 or higher
- Databricks workspace with personal access token
- Cursor IDE, Claude Desktop, or any MCP-compatible client
Installation
# Install from PyPI (recommended)
pip install databricks-mcp-genie
# Or install from source
git clone https://github.com/sidart10/databricks-mcp-genie.git
cd databricks-mcp-genie
pip install -e ".[dev]"
Configuration
- Get your Databricks credentials:
  - Workspace URL: https://your-workspace.cloud.databricks.com
  - Personal Access Token: Generate from User Settings > Developer > Access Tokens
- Configure MCP client (Claude Desktop example):
  Edit ~/.config/Claude/claude_desktop_config.json:
{
  "mcpServers": {
    "databricks": {
      "command": "/path/to/databricks-mcp-genie/.venv/bin/python",
      "args": ["-m", "databricks_mcp.main"],
      "cwd": "/path/to/databricks-mcp-genie",
      "env": {
        "DATABRICKS_HOST": "https://your-workspace.cloud.databricks.com",
        "DATABRICKS_TOKEN": "your-personal-access-token-here"
      }
    }
  }
}
- Restart Claude Desktop
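The JSON above can also be generated rather than edited by hand, which helps avoid path typos. The following is a minimal sketch, assuming the same `mcpServers` layout as the example config; the helper names `build_server_entry` and `merge_into_config` are illustrative and not part of the package.

```python
import json
from pathlib import Path


def build_server_entry(project_dir: str, host: str, token: str) -> dict:
    """Return one `mcpServers` entry in the shape used by the example config."""
    root = Path(project_dir)
    return {
        # Interpreter inside the project's virtual environment (POSIX layout)
        "command": str(root / ".venv" / "bin" / "python"),
        "args": ["-m", "databricks_mcp.main"],
        "cwd": str(root),
        "env": {"DATABRICKS_HOST": host, "DATABRICKS_TOKEN": token},
    }


def merge_into_config(existing: dict, name: str, entry: dict) -> dict:
    """Add or replace one server entry, leaving other configured servers alone."""
    merged = dict(existing)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = entry
    merged["mcpServers"] = servers
    return merged


if __name__ == "__main__":
    entry = build_server_entry(
        "/path/to/databricks-mcp-genie",
        "https://your-workspace.cloud.databricks.com",
        "your-personal-access-token-here",
    )
    # Print the result; writing it to the client's config file is left to the reader
    print(json.dumps(merge_into_config({}, "databricks", entry), indent=2))
```

Merging into the existing dictionary instead of overwriting the file keeps any other MCP servers already configured in the client.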
Verify Installation
# Test server starts correctly
.venv/bin/python -m databricks_mcp.main
# Run test suite
.venv/bin/pytest tests/ -v
# Quick server test script
./test_server.sh
Available Features
43 MCP Tools Across 9 API Modules
Genie AI (5 tools) - Natural language data analysis
- list_genie_spaces - List available Genie AI spaces
- start_genie_conversation - Ask questions in natural language
- send_genie_followup - Continue conversations with context
- get_genie_message_status - Check message processing status
- get_genie_query_results - Retrieve SQL results from Genie
Clusters API (5 tools)
- list_clusters, create_cluster, get_cluster
- start_cluster, terminate_cluster
SQL API (1 tool)
- execute_sql - Run SQL queries on a SQL warehouse
Jobs API (9 tools)
- list_jobs, create_job, delete_job, run_job
- list_job_runs, get_run_status, cancel_run
- run_notebook, sync_repo_and_run_notebook
Notebooks API (6 tools)
- list_notebooks, export_notebook, import_notebook
- delete_workspace_object, get_workspace_file_content, get_workspace_file_info
DBFS API (3 tools)
- list_files, dbfs_put, dbfs_delete
Unity Catalog API (7 tools)
- list_catalogs, create_catalog
- list_schemas, create_schema
- list_tables, create_table, get_table_lineage
Repos API (4 tools)
- list_repos, create_repo, update_repo, pull_repo
Libraries API (3 tools)
- install_library, uninstall_library, list_cluster_libraries
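The Genie tool names suggest a sequence: start a conversation, poll the message status, then fetch the query results. A minimal sketch of that polling loop follows. The `GenieClient` protocol, its method signatures, the `conversation_id`/`message_id`/`status` fields, and the `COMPLETED`/`FAILED` values are all assumptions made for illustration; they are not the package's documented API.

```python
import asyncio
from typing import Any, Protocol


class GenieClient(Protocol):
    """Hypothetical interface that mirrors three of the Genie tool names."""

    async def start_genie_conversation(
        self, space_id: str, question: str
    ) -> dict[str, Any]: ...

    async def get_genie_message_status(
        self, space_id: str, conversation_id: str, message_id: str
    ) -> dict[str, Any]: ...

    async def get_genie_query_results(
        self, space_id: str, conversation_id: str, message_id: str
    ) -> dict[str, Any]: ...


async def ask_genie(
    client: GenieClient,
    space_id: str,
    question: str,
    poll_seconds: float = 2.0,
    max_polls: int = 30,
) -> dict[str, Any]:
    """Ask a question, wait for the message to finish, and return its results."""
    started = await client.start_genie_conversation(space_id, question)
    # Field names below are assumed, not taken from the package
    conv_id, msg_id = started["conversation_id"], started["message_id"]

    for _ in range(max_polls):
        status = await client.get_genie_message_status(space_id, conv_id, msg_id)
        state = status.get("status")
        if state == "COMPLETED":
            return await client.get_genie_query_results(space_id, conv_id, msg_id)
        if state == "FAILED":
            raise RuntimeError(f"Genie message failed: {status}")
        # Any other state is treated as "still running"
        await asyncio.sleep(poll_seconds)

    raise TimeoutError("Genie did not finish within the polling budget")
```

Bounding the loop with `max_polls` keeps a stuck message from blocking the caller indefinitely.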
Usage Examples
Using with Claude Desktop
Once configured, you can ask Claude to interact with Databricks:
"List all my running clusters"
"Execute this SQL query: SELECT * FROM my_catalog.my_schema.my_table LIMIT 10"
"Ask Genie: What were the top products by revenue last month?"
"Create a new job to run my ETL notebook daily"
Programmatic Usage
from databricks_mcp.server import DatabricksMCPServer
# Initialize server
server = DatabricksMCPServer()
# Use via MCP protocol
server.run()
Direct API Usage
import asyncio

from databricks_mcp.api import clusters, genie, sql


async def main():
    # List clusters
    clusters_list = await clusters.list_clusters()

    # Ask Genie a question
    response = await genie.start_conversation(
        space_id="01efc298aabd1ae9bac6128988a6eaaa",
        question="Show me revenue trends by product category",
    )

    # Execute SQL
    results = await sql.execute_sql(
        statement="SELECT * FROM sales.orders LIMIT 100",
        warehouse_id="your-warehouse-id",
    )


# await is only valid inside a coroutine, so run main() through asyncio
asyncio.run(main())
Project Structure
databricks-mcp-genie/
├── databricks_mcp/ # Main Python package
│ ├── api/ # API modules (clusters, sql, genie, etc.)
│ ├── core/ # Core utilities and config
│ ├── server/ # MCP server implementation
│ └── cli/ # CLI commands
├── tests/ # Test suite
├── examples/ # Usage examples
├── scripts/ # Utility scripts
├── docs/ # Documentation
├── pyproject.toml # Package configuration
├── .mcp.json # MCP client configuration
└── test_server.sh # Quick server test
Troubleshooting
Server Won't Start
Check logs: databricks_mcp.log
Common issues:
- Invalid credentials in .mcp.json
- Incorrect Python path in MCP config
- Missing dependencies (run pip install -e ".[dev]")
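Several of these issues can be caught before the server is launched. The following is a minimal preflight sketch, assuming the two environment variables from the configuration example and the Python 3.10 minimum from the prerequisites; the function name `find_config_problems` and the `https://` check are illustrative assumptions, not part of the package.

```python
import os
import sys
from typing import Mapping

REQUIRED_VARS = ("DATABRICKS_HOST", "DATABRICKS_TOKEN")


def find_config_problems(
    env: Mapping[str, str],
    python_version: tuple[int, int] = sys.version_info[:2],
) -> list[str]:
    """Return human-readable problems; an empty list means nothing was found."""
    problems = []

    if python_version < (3, 10):
        found = ".".join(str(part) for part in python_version)
        problems.append(f"Python 3.10 or higher is required, found {found}")

    for name in REQUIRED_VARS:
        if not env.get(name, "").strip():
            problems.append(f"{name} is not set")

    # Assumption: workspace URLs are always given with an https:// scheme
    host = env.get("DATABRICKS_HOST", "").strip()
    if host and not host.startswith("https://"):
        problems.append("DATABRICKS_HOST should start with https://")

    return problems


if __name__ == "__main__":
    issues = find_config_problems(os.environ)
    for issue in issues:
        print(f"problem: {issue}")
    if not issues:
        print("no configuration problems found")
```

The check only inspects the local environment. It does not confirm that the token is valid, which requires a request to the workspace.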
Import Errors
# Verify all imports work
.venv/bin/python -c "from databricks_mcp.server import DatabricksMCPServer"
.venv/bin/python -c "from databricks_mcp.api import clusters, sql, genie"
Connection Issues
Verify credentials:
export DATABRICKS_HOST="https://your-workspace.cloud.databricks.com"
export DATABRICKS_TOKEN="your-token"
.venv/bin/python -c "
from databricks_mcp.api import clusters
import asyncio
print(asyncio.run(clusters.list_clusters()))
"
See TROUBLESHOOTING.md for detailed solutions.
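To separate credential problems from package problems, the workspace can also be queried without importing `databricks_mcp` at all. The sketch below uses only the standard library and the Databricks REST endpoint `GET /api/2.0/clusters/list` with a bearer token; the function names are illustrative.

```python
import json
import urllib.request


def build_clusters_request(host: str, token: str) -> urllib.request.Request:
    """Build an authenticated GET request for the clusters list endpoint."""
    url = host.rstrip("/") + "/api/2.0/clusters/list"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def count_visible_clusters(host: str, token: str, timeout: float = 10.0) -> int:
    """Call the workspace and return how many clusters the token can see.

    Raises urllib.error.HTTPError on a rejected token (for example 401 or 403)
    and urllib.error.URLError when the host cannot be reached.
    """
    request = build_clusters_request(host, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    # The "clusters" key may be absent when the workspace has no clusters
    return len(payload.get("clusters", []))
```

Calling `count_visible_clusters(os.environ["DATABRICKS_HOST"], os.environ["DATABRICKS_TOKEN"])` after the `export` lines above gives a quick signal: an HTTP error points at the credentials, while a successful count points back at the local installation.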
Development
Running Tests
# All tests
.venv/bin/pytest tests/ -v
# Specific test file
.venv/bin/pytest tests/test_clusters.py -v
# With coverage
.venv/bin/pytest tests/ --cov=databricks_mcp
Code Quality
# Format code
.venv/bin/black databricks_mcp/
# Lint
.venv/bin/pylint databricks_mcp/
Adding New Tools
- Add API function in databricks_mcp/api/
- Register tool in databricks_mcp/server/databricks_mcp_server.py:
@self.tool(
    name="your_tool_name",
    description="What your tool does with parameters: param1 (required), param2 (optional)"
)
async def your_tool(params: Dict[str, Any]) -> List[TextContent]:
    try:
        actual_params = _unwrap_params(params)
        result = await your_api_module.your_function(actual_params)
        return [{"type": "text", "text": json.dumps(result)}]
    except Exception as e:
        logger.error(f"Error: {str(e)}")
        return [{"type": "text", "text": json.dumps({"error": str(e)})}]
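The registration snippet calls `_unwrap_params`, which this README does not show. The sketch below illustrates what such a helper might do, assuming that some clients nest tool arguments under a `params` key; that behavior is a guess, and the real helper in the package may differ. `text_result` is a hypothetical convenience that reproduces the return shape used in the snippet.

```python
import json
from typing import Any


def unwrap_params(params: dict[str, Any]) -> dict[str, Any]:
    """Return the inner dict when arguments arrive nested under 'params'.

    Assumption: a client may send either {"params": {...}} or the flat
    arguments directly, so both shapes are accepted.
    """
    inner = params.get("params")
    if isinstance(inner, dict):
        return inner
    return params


def text_result(payload: Any) -> list[dict[str, str]]:
    """Wrap a JSON-serializable payload in the text content shape shown above."""
    # default=str keeps values such as datetimes from raising TypeError
    return [{"type": "text", "text": json.dumps(payload, default=str)}]
```

With helpers like these, both the success path and the error path of a tool can return through `text_result`, so every tool responds in the same shape.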
Documentation
Setup & Installation
- Cursor Setup Guide - One-click installation for Cursor (recommended for teams)
- QUICK_START.md - Getting started guide
- Deployment Summary - Package distribution overview
Development & Publishing
- PUBLISHING.md - How to publish to PyPI
- TROUBLESHOOTING.md - Common issues and solutions
- ENHANCEMENTS.md - Feature enhancements and roadmap
Requirements
- Python >=3.10
- mcp[cli] >=1.2.0
- httpx
- databricks-sdk
- pytest (dev)
- black (dev)
- pylint (dev)
License
MIT License - See LICENSE file for details
Acknowledgments
- Package: databricks-mcp-genie
- Maintainer: Sid
- Original Author: Olivier Debeuf De Rijcker (databricks-mcp)
- Repository: https://github.com/sidart10/databricks-mcp-genie
Special thanks to:
- Olivier Debeuf De Rijcker for the original databricks-mcp implementation
- Anthropic for Claude and the MCP protocol
- Databricks for their comprehensive SDK and Genie AI
- The open source community
Built with Claude Code - AI-assisted development tool by Anthropic