Skip to main content

Minimal MCP server for BigQuery SQL validation and dry-run analysis

Project description

mcp-bigquery

MCP BigQuery Logo

Safe BigQuery exploration through Model Context Protocol

MIT License PyPI Version Python Support Downloads

Documentation | Quick Start | Tools | Examples


๐Ÿ“Œ What is this?

mcp-bigquery is an MCP (Model Context Protocol) server that enables AI assistants like Claude to safely interact with Google BigQuery.

๐ŸŽฏ Key Features

graph LR
    A[AI Assistant] -->|MCP Protocol| B[mcp-bigquery]
    B -->|Dry-run Only| C[BigQuery API]
    B -.->|โŒ Never Executes| D[Actual Query Execution]
  • ๐Ÿ›ก๏ธ 100% Safe: All operations are dry-run only (never executes queries)
  • ๐Ÿ’ฐ Cost Transparency: See costs before running any query
  • ๐Ÿ” Complete Analysis: Analyze dependencies and validate SQL syntax
  • ๐Ÿ“Š Schema Explorer: Browse datasets, tables, and columns with ease

โšก Why use mcp-bigquery?

Problem Solution with mcp-bigquery
๐Ÿ’ธ Accidentally running expensive queries Check costs before execution
๐Ÿ› Wasting time on SQL errors Detect syntax errors before running
๐Ÿ—บ๏ธ Unknown table structures Easily explore schemas
โš ๏ธ AI executing dangerous operations Everything is read-only and safe

๐Ÿš€ Quick Start (4 minutes)

Step 1: Install (1 minute)

pip install mcp-bigquery

Step 2: Authenticate with Google Cloud (2 minutes)

# For personal accounts
gcloud auth application-default login

# For service accounts
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/key.json

Step 3: Configure Claude Desktop (1 minute)

Open your Claude Desktop config:

  • Mac: ~/Library/Application Support/Claude/claude_desktop_config.json
  • Windows: %APPDATA%\Claude\claude_desktop_config.json

Add this configuration:

{
  "mcpServers": {
    "mcp-bigquery": {
      "command": "mcp-bigquery",
      "env": {
        "BQ_PROJECT": "your-gcp-project-id"  // โ† Replace with your project ID
      }
    }
  }
}

Step 4: Test It!

Restart Claude Desktop and try these questions:

"What datasets are available in my BigQuery project?"
"Can you estimate the cost of: SELECT * FROM dataset.table"
"Show me the schema for the users table"

๐Ÿ› ๏ธ Available Tools

๐Ÿ“ SQL Validation & Analysis

Tool Purpose When to Use
bq_validate_sql Check SQL syntax Before running any query
bq_dry_run_sql Get cost estimates & metadata ๐Ÿ’ฐ To check costs
bq_extract_dependencies Extract table dependencies To understand data lineage
bq_validate_query_syntax Detailed error analysis To debug SQL errors

๐Ÿ” Schema Discovery

Tool Purpose When to Use
bq_list_datasets List all datasets To explore your project
bq_list_tables List tables with partitioning info To browse a dataset
bq_describe_table Get detailed table schema To understand columns
bq_get_table_info Complete table metadata To get statistics

๐Ÿ’ก Real-World Examples

Example 1: Check Costs Before Running

# Before running an expensive query...
query = "SELECT * FROM `bigquery-public-data.github_repos.commits`"

# First, check the cost
result = bq_dry_run_sql(sql=query)
print(f"Estimated cost: ${result['usdEstimate']}")
print(f"Data processed: {result['totalBytesProcessed'] / 1e9:.2f} GB")

# Output:
# Estimated cost: $12.50
# Data processed: 2500.00 GB

Example 2: Understand Table Structure

# Check table schema
result = bq_describe_table(
    dataset_id="your_dataset",
    table_id="users"
)

# Output:
# โ”œโ”€โ”€ user_id (INTEGER, REQUIRED)
# โ”œโ”€โ”€ email (STRING, NULLABLE)
# โ”œโ”€โ”€ created_at (TIMESTAMP, REQUIRED)
# โ””โ”€โ”€ profile (RECORD, REPEATED)
#     โ”œโ”€โ”€ name (STRING)
#     โ””โ”€โ”€ age (INTEGER)

Example 3: Track Data Dependencies

# Understand query dependencies
query = """
WITH user_stats AS (
  SELECT user_id, COUNT(*) as order_count
  FROM orders
  GROUP BY user_id
)
SELECT u.name, s.order_count
FROM users u
JOIN user_stats s ON u.id = s.user_id
"""

result = bq_extract_dependencies(sql=query)

# Output:
# Tables: ['orders', 'users']
# Columns: ['user_id', 'name', 'id']
# Dependency Graph:
#   orders โ†’ user_stats โ†’ final_result
#   users โ†’ final_result

๐ŸŽจ How It Works

Your Code โ† โ†’ Claude/AI Assistant
                   โ†“
            MCP Protocol
                   โ†“
            mcp-bigquery
                   โ†“
         BigQuery API (Dry-run)
                   โ†“
             BigQuery
      (Never executes actual queries)

โš™๏ธ Configuration

Environment Variables

export BQ_PROJECT="my-project"        # GCP Project ID (required)
export BQ_LOCATION="asia-northeast1"  # Region (optional)
export SAFE_PRICE_PER_TIB="5.0"      # Price per TiB (default: $5)
export LOG_LEVEL="INFO"              # Optional log level override

Full Claude Desktop Configuration

{
  "mcpServers": {
    "mcp-bigquery": {
      "command": "mcp-bigquery",
      "env": {
        "BQ_PROJECT": "my-production-project",
        "BQ_LOCATION": "asia-northeast1",
        "SAFE_PRICE_PER_TIB": "6.0",
        "LOG_LEVEL": "WARNING"
      }
    }
  }
}

๐Ÿ”ง Troubleshooting

Common Issues & Solutions

โŒ Authentication Error

Error: Could not automatically determine credentials

Solution:

gcloud auth application-default login

โŒ Permission Error

Error: User does not have bigquery.tables.get permission

Solution: Grant BigQuery Data Viewer role

gcloud projects add-iam-policy-binding YOUR_PROJECT \
  --member="user:your-email@example.com" \
  --role="roles/bigquery.dataViewer"

โŒ Project Not Set

Error: Project ID is required

Solution: Set BQ_PROJECT in your configuration

Debug Mode

If issues persist, enable debug mode:

{
  "env": {
    "LOG_LEVEL": "INFO",
    "BQ_PROJECT": "your-project"
  }
}

๐Ÿ“š Learn More

Getting Started

For Developers

๐Ÿšฆ Project Status

Version Release Date Key Features
v0.5.0 2026-01-02 Consolidated formatters, client cache, logging controls
v0.4.2 2025-12-08 Modular schema explorer, unified client/logging controls
v0.4.1 2025-01-22 Better error handling, debug logging
v0.4.0 2025-01-22 Added 6 schema discovery tools
v0.3.0 2025-01-17 SQL analysis engine
v0.2.0 2025-01-16 Basic validation & dry-run

๐Ÿค Contributing

Pull requests are welcome! See our Contributing Guide.

# Setup development environment
git clone https://github.com/caron14/mcp-bigquery.git
cd mcp-bigquery
pip install -e ".[dev]"

# Run tests
pytest tests/

๐Ÿ“„ License

MIT License - see LICENSE for details.

๐Ÿ™ Acknowledgments

  • Google BigQuery team for the excellent API
  • Anthropic for the MCP protocol
  • All contributors and users

Built for safe BigQuery exploration ๐Ÿ›ก๏ธ

Report Bug ยท Request Feature ยท Discussions

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

mcp_bigquery-0.5.0.tar.gz (53.0 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

mcp_bigquery-0.5.0-py3-none-any.whl (28.2 kB view details)

Uploaded Python 3

File details

Details for the file mcp_bigquery-0.5.0.tar.gz.

File metadata

  • Download URL: mcp_bigquery-0.5.0.tar.gz
  • Upload date:
  • Size: 53.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.11

File hashes

Hashes for mcp_bigquery-0.5.0.tar.gz
Algorithm Hash digest
SHA256 c82647d87a9546cb72f53ca0f0153ead1ef608466cd0802a2e0f267a01bc61b4
MD5 5ed10076f9fcb43b37b920e13c26bacd
BLAKE2b-256 1199a38eacc5f117f868fac13aedd97a9adeabe9d3b8a1c035d702a353575904

See more details on using hashes here.

File details

Details for the file mcp_bigquery-0.5.0-py3-none-any.whl.

File metadata

  • Download URL: mcp_bigquery-0.5.0-py3-none-any.whl
  • Upload date:
  • Size: 28.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.11

File hashes

Hashes for mcp_bigquery-0.5.0-py3-none-any.whl
Algorithm Hash digest
SHA256 a0ca18e8c96ec1ac228351230155a3f2760854b4c7f618bd32efefe21f0c950d
MD5 0936a39c907e01c3a8d6c568d65279ce
BLAKE2b-256 f02b9f2d8f4e9bc0526e131986ac74939ab3e808e31bfc0b6650c3ed84033c9f

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page