
Databricks Documentation MCP Server

A lightweight, stateless Model Context Protocol (MCP) server that lets AI assistants read and search docs.databricks.com in real time — no local cache, no database, no crawling required.

Features

  • Live fetch — always returns current documentation content, never stale
  • Full-text search — real-time, site-scoped search via DuckDuckGo site: operator
  • Docusaurus-aware extraction — strips navigation, sidebars, and page chrome; returns clean markdown
  • Section extraction — pull specific h2 sections from long reference pages
  • Pagination — start_index and max_length parameters for large pages
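
The Docusaurus-aware extraction can be illustrated with a minimal sketch (this is not the server's actual implementation): keep the <article> body that Docusaurus pages wrap their content in, drop chrome elements, and do a rough tag-to-markdown conversion.

```python
import re

def extract_main_content(html: str) -> str:
    """Illustrative sketch: keep the <article> body of a Docusaurus page,
    dropping navigation bars, sidebars, and footers."""
    match = re.search(r"<article[^>]*>(.*?)</article>", html, re.DOTALL)
    body = match.group(1) if match else html
    # Drop any chrome elements nested inside the article.
    body = re.sub(r"<(nav|aside|footer)[^>]*>.*?</\1>", "", body, flags=re.DOTALL)
    # Very rough tag-to-markdown conversion for h2 headings.
    body = re.sub(r"<h2[^>]*>(.*?)</h2>", r"\n\n## \1\n\n", body, flags=re.DOTALL)
    body = re.sub(r"<[^>]+>", "", body)  # strip all remaining tags
    return re.sub(r"\n{3,}", "\n\n", body).strip()
```

A real extractor would use an HTML parser rather than regular expressions, but the shape of the problem is the same: locate the content container, discard page chrome, emit markdown.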

Prerequisites

  • Python 3.10 or later
  • uv (recommended) or pip

Installation

From a release (recommended)

Releases are published to the GitLab Package Registry. Replace <gitlab-url> and <project-id> with your instance details (visible on the Deploy → Package Registry page).

# pip
pip install databricks-docs-mcp \
  --index-url https://<gitlab-url>/api/v4/projects/<project-id>/packages/pypi/simple

# uv
uv add databricks-docs-mcp \
  --index https://<gitlab-url>/api/v4/projects/<project-id>/packages/pypi/simple

Alternatively, run it with uvx without installing it permanently:

uvx databricks-docs-mcp

MCP client configuration (release install)

Claude Desktop

Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or
%APPDATA%\Claude\claude_desktop_config.json (Windows):

{
  "mcpServers": {
    "databricks-docs": {
      "command": "uvx",
      "args": ["databricks-docs-mcp"]
    }
  }
}

VS Code (GitHub Copilot)

Add to .vscode/mcp.json in your workspace:

{
  "servers": {
    "databricks-docs": {
      "type": "stdio",
      "command": "uvx",
      "args": ["databricks-docs-mcp"]
    }
  }
}

To pin a specific version, use "args": ["databricks-docs-mcp==1.2.0"].

From source (development)

git clone <repo-url>
cd databricks-mcp
uv sync --extra dev

MCP client config for a local clone:

{
  "servers": {
    "databricks-docs": {
      "type": "stdio",
      "command": "uv",
      "args": ["--directory", "/absolute/path/to/databricks-mcp", "run", "databricks-docs-mcp"]
    }
  }
}

Environment Variables

Variable           Default                                           Description
MCP_USER_AGENT     Mozilla/5.0 (compatible; DatabricksDocsMCP/1.0)   HTTP User-Agent sent with every request
FASTMCP_LOG_LEVEL  WARNING                                           Log verbosity: DEBUG, INFO, WARNING, ERROR
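
A sketch of how a server might read these variables (the variable names and defaults come from the table above; the helper itself is illustrative, not the server's actual code):

```python
import os

DEFAULT_USER_AGENT = "Mozilla/5.0 (compatible; DatabricksDocsMCP/1.0)"

def get_config(env=None) -> dict:
    """Read configuration from environment variables, falling back to
    the documented defaults. Accepts a mapping for testability."""
    env = os.environ if env is None else env
    return {
        "user_agent": env.get("MCP_USER_AGENT", DEFAULT_USER_AGENT),
        "log_level": env.get("FASTMCP_LOG_LEVEL", "WARNING"),
    }
```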

Tools

search_documentation

Search docs.databricks.com using a site-scoped real-time web search.

Parameter  Type     Default     Description
query      string   (required)  Keywords or topic to search for
limit      integer  10          Maximum results to return (max 30)

Returns a JSON array of results with URL, title, and snippet.
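
Under the hood, a site-scoped search amounts to prefixing the query with the site: operator and capping the result count. A minimal sketch (the helper name is illustrative, not part of the server's API):

```python
def build_search_query(query: str, limit: int = 10) -> tuple[str, int]:
    """Build a DuckDuckGo query scoped to docs.databricks.com and clamp
    the requested result count to the tool's documented maximum of 30."""
    capped = max(1, min(limit, 30))
    return f"site:docs.databricks.com {query.strip()}", capped
```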

read_documentation

Fetch a docs.databricks.com page as clean markdown.

Parameter    Type     Default     Description
url          string   (required)  Full docs.databricks.com URL
max_length   integer  5000        Maximum characters to return per call
start_index  integer  0           Character offset for pagination

Returns markdown-formatted page content with a continuation hint when the page is truncated.
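
The pagination contract described above can be sketched as a simple slice plus a continuation hint (the hint wording here is illustrative; the server's exact phrasing may differ):

```python
def paginate(markdown: str, start_index: int = 0, max_length: int = 5000) -> str:
    """Return one page of content and, if more remains, a hint telling
    the caller which start_index to pass on the next call."""
    chunk = markdown[start_index : start_index + max_length]
    next_index = start_index + len(chunk)
    if next_index < len(markdown):
        chunk += f"\n\n[Content truncated. Call again with start_index={next_index} to continue.]"
    return chunk
```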

read_sections

Extract specific h2 sections from a docs page by heading title.

Parameter       Type      Default     Description
url             string    (required)  Full docs.databricks.com URL
section_titles  string[]  (required)  h2 heading titles to extract (case-insensitive)

Returns markdown of the matched sections only.
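
Case-insensitive h2 extraction over already-converted markdown can be sketched like this (an illustrative re-implementation, not the server's actual code):

```python
import re

def extract_sections(markdown: str, section_titles: list[str]) -> str:
    """Keep only the '## ' sections whose titles match one of the
    requested titles, compared case-insensitively."""
    wanted = {t.strip().lower() for t in section_titles}
    out = []
    # Each match is a '## ' heading line plus every following line
    # up to (but not including) the next '## ' heading.
    for match in re.finditer(r"(?m)^## (.+)$((?:\n(?!## ).*)*)", markdown):
        title, body = match.group(1).strip(), match.group(2)
        if title.lower() in wanted:
            out.append(f"## {title}{body}".rstrip())
    return "\n\n".join(out)
```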

Basic Usage

Recommended workflow

1. Search for a topic:

search_documentation("Delta Live Tables pipeline settings")

2. Read the most relevant result:

read_documentation("https://docs.databricks.com/aws/en/dlt/settings.html")

3. Extract specific sections from large pages:

read_sections(
  "https://docs.databricks.com/aws/en/dlt/settings.html",
  ["Pipeline mode", "Compute settings"]
)

Tips

  • Databricks docs URLs follow the pattern https://docs.databricks.com/<cloud>/en/<topic>/...
    Use aws for AWS, gcp for GCP, azure for Azure.
  • Use start_index in read_documentation to page through long articles.
  • Section titles for read_sections are matched case-insensitively against <h2> headings on the page.
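
The URL pattern from the first tip can be checked with a one-line validator (a hypothetical helper for illustration, not part of the server):

```python
import re

def is_docs_url(url: str) -> bool:
    """Check that a URL matches https://docs.databricks.com/<cloud>/en/...
    where <cloud> is aws, gcp, or azure."""
    return re.match(r"https://docs\.databricks\.com/(aws|gcp|azure)/en/", url) is not None
```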

Development

uv sync --extra dev

Lint

uv run ruff check src/ tests/

Run tests

uv run --frozen pytest --cov --cov-branch --cov-report=term-missing

Project structure

src/
  databricks_docs_mcp/
    server.py   # MCP server and tool definitions
    utils.py    # HTML extraction and formatting utilities
    models.py   # Pydantic models for search results
tests/
  test_server.py
  test_utils.py

Attribution

Inspired by the AWS Documentation MCP Server by AWS Labs (Apache 2.0).

License

MIT — see LICENSE.
