BigQuery MCP server optimized for quick navigation of larger projects and datasets.

🗂️ BigQuery MCP Server

A practical MCP server that lets LLMs navigate BigQuery datasets and tables. It is designed for larger projects with many datasets and tables, and optimized to keep LLM context small while staying fast and safe.

  • Minimal by default: list dataset and table names; fetch details only when asked
  • Navigate larger projects: filter by name, request detailed metadata/schemas on demand
  • Quick table insight: optional schema, column descriptions and fill-rate to help an agent decide relevance fast
  • Safe to run: read-only query execution with guardrails (SELECT/WITH only, comment stripping)
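The read-only guardrail in the last bullet can be illustrated with a small sketch. This is not the server's actual implementation: the function names `strip_comments` and `is_read_only` are hypothetical, and the check is deliberately naive (it does not parse string literals, so a query containing `;`, `--` or `#` inside a quoted string may be rejected or altered).

```python
import re

# Hypothetical helpers for illustration; not taken from the bigquery-mcp source.
_BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)
_LINE_COMMENT = re.compile(r"(--|#)[^\n]*")
_ALLOWED_START = re.compile(r"(?:SELECT|WITH)\b", re.IGNORECASE)


def strip_comments(sql: str) -> str:
    """Remove /* ... */ block comments and -- or # line comments."""
    without_blocks = _BLOCK_COMMENT.sub(" ", sql)
    return _LINE_COMMENT.sub(" ", without_blocks)


def is_read_only(sql: str) -> bool:
    """Accept a single statement that starts with SELECT or WITH."""
    cleaned = strip_comments(sql).strip().rstrip(";").strip()
    if not cleaned or ";" in cleaned:
        return False  # empty input, or more than one statement
    return _ALLOWED_START.match(cleaned) is not None


if __name__ == "__main__":
    for query in ("SELECT 1", "/* SELECT */ DROP TABLE t", "SELECT 1; DROP TABLE t"):
        print(f"{query!r}: {is_read_only(query)}")
```

Stripping comments before checking the first keyword matters because a write statement could otherwise hide behind a leading comment that happens to contain the word SELECT.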

Quick Start

Prerequisites: Python 3.10+ and uv package manager

🚀 Quick Setup

Option 1: Pull direct from GitHub

# 1. Authenticate
gcloud auth application-default login

# 2. Run server
uv run --with 'bigquery-mcp@git+https://github.com/pvoo/bigquery-mcp.git' \
  bigquery-mcp --project YOUR_PROJECT --location US

Option 2: Clone locally (development setup)

# 1. Clone and setup
git clone https://github.com/pvoo/bigquery-mcp.git
cd bigquery-mcp

# 2. Configure environment
cp .env.example .env
# Edit .env with your project and location

# 3. Run or inspect
make run      # Start server
make inspect  # Open MCP inspector

🔧 MCP Client Configuration

Option 1: Basic MCP config

Should work as an mcp.json config for most tools, such as Cursor and Claude Code.

{
  "mcpServers": {
    "bigquery": {
      "command": "uv",
      "args": [
        "run", "--with", "bigquery-mcp@git+https://github.com/pvoo/bigquery-mcp.git",
        "bigquery-mcp", "--project", "your-project-id", "--location", "US"
      ]
    }
  }
}
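If you manage several projects, the same structure can be generated rather than hand-edited. The sketch below mirrors the Option 1 config exactly; the helper name `build_mcp_config` is an assumption made for illustration and is not part of the package.

```python
import json


def build_mcp_config(project_id: str, location: str = "US") -> dict:
    """Return the basic MCP config structure for one project (hypothetical helper)."""
    return {
        "mcpServers": {
            "bigquery": {
                "command": "uv",
                "args": [
                    "run", "--with",
                    "bigquery-mcp@git+https://github.com/pvoo/bigquery-mcp.git",
                    "bigquery-mcp",
                    "--project", project_id,
                    "--location", location,
                ],
            }
        }
    }


if __name__ == "__main__":
    # Print the config so it can be pasted into mcp.json.
    print(json.dumps(build_mcp_config("your-project-id"), indent=2))
```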

Option 2: Local clone config (for development)

# Clone first
git clone https://github.com/pvoo/bigquery-mcp.git

{
  "mcpServers": {
    "bigquery": {
      "command": "uv",
      "args": ["--directory", "/absolute/path/to/bigquery-mcp", "run", "bigquery-mcp"],
      "env": {
        "GCP_PROJECT_ID": "your-project-id",
        "BIGQUERY_LOCATION": "US"
      }
    }
  }
}

🧪 Test Your Setup

# Test with MCP inspector
npx @modelcontextprotocol/inspector \
  uv run --with 'bigquery-mcp @ git+https://github.com/pvoo/bigquery-mcp.git' \
  bigquery-mcp --project YOUR_PROJECT --location US

🛠️ Tools Overview

This MCP server provides 4 core BigQuery tools optimized for LLM efficiency:

📊 Smart Dataset & Table Discovery

  • list_datasets - Dual mode: basic (names only) vs detailed (full metadata)
  • list_tables - Context-aware table browsing with optional schema details
  • get_table - Complete table analysis with schema and sample data

🔍 Safe Query Execution

  • run_query - Execute SELECT/WITH queries only, with cost tracking and safety validation

Key Features:

  • Minimal by default - 70% fewer tokens in basic mode
  • Safe queries only - Blocks all write operations
  • LLM-optimized - Returns structured data perfect for AI analysis
  • Cost transparent - Shows bytes processed for each query
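A bytes-processed figure is easier to reason about once it is converted to readable units or to an approximate cost. The sketch below is illustrative only: both function names are hypothetical, the server's actual output format is not shown here, and the price per TiB is left as a caller-supplied parameter because it depends on your region and pricing model.

```python
def format_bytes(num_bytes: int) -> str:
    """Render a byte count with binary units, e.g. 1536 -> '1.50 KiB'."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
    value = float(num_bytes)
    for unit in units:
        if value < 1024 or unit == units[-1]:
            return f"{int(value)} B" if unit == "B" else f"{value:.2f} {unit}"
        value /= 1024
    return f"{value:.2f} {units[-1]}"  # unreachable; keeps type checkers happy


def estimate_cost_usd(num_bytes: int, usd_per_tib: float) -> float:
    """Estimate query cost from bytes processed and a caller-supplied rate."""
    return num_bytes / 1024**4 * usd_per_tib


if __name__ == "__main__":
    processed = 3 * 1024**3  # pretend a query scanned 3 GiB
    print(format_bytes(processed), estimate_cost_usd(processed, usd_per_tib=5.0))
```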

🏗️ Development Setup

Local Development

# Clone and setup
git clone https://github.com/pvoo/bigquery-mcp.git
cd bigquery-mcp
make install  # Setup environment + pre-commit hooks

# Development workflow
make run      # Start server
make test     # Run test suite
make check    # Lint + format + typecheck
make inspect  # Launch MCP inspector

Testing & Quality

make test                    # Full test suite
pytest tests/test_safety.py  # SQL safety validation tests
pytest tests/test_server.py  # Core server functionality tests
make check                   # Run all quality checks

Arguments available

Variable                        Required  Description
GCP_PROJECT_ID                  Yes       Google Cloud project ID
BIGQUERY_LOCATION               Yes       BigQuery region (e.g., US, EU, us-central1)
GOOGLE_APPLICATION_CREDENTIALS  No        Path to service account JSON
BIGQUERY_MAX_RESULTS            No        Default max query results (default: 20)
BIGQUERY_ALLOWED_DATASETS       No        Comma-separated allowed datasets

Authentication Methods:

  1. Application Default Credentials (via gcloud auth application-default login)
  2. Service Account Key (set GOOGLE_APPLICATION_CREDENTIALS)

Required BigQuery Permissions: bigquery.datasets.get, bigquery.datasets.list, bigquery.tables.list, bigquery.tables.get, bigquery.jobs.create, bigquery.tables.getData

🚨 Troubleshooting

Authentication Issues:

# Check current auth
gcloud auth application-default print-access-token

# Re-authenticate
gcloud auth application-default login

# Enable BigQuery API
gcloud services enable bigquery.googleapis.com

MCP Connection Issues:

  • Ensure absolute paths in MCP config
  • Test server manually: make run
  • Check that project and location environment variables or args are set correctly

Performance Issues:

  • Use {"detailed": false} for faster responses
  • Add search filters: {"search": "pattern"}
  • Reduce max_results for large datasets
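The options above are passed as tool arguments. As a rough sketch, an MCP client sends them inside a JSON-RPC `tools/call` request like the one built below. The `tool_call` helper is hypothetical, and pairing `detailed` and `search` with `list_datasets` is an assumption; check the tool schemas in the MCP inspector for the exact parameters each tool accepts.

```python
import json
from itertools import count

_request_ids = count(1)  # simple incrementing JSON-RPC request ids


def tool_call(name: str, **arguments) -> dict:
    """Build a JSON-RPC 2.0 request for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


if __name__ == "__main__":
    # Assumed argument placement: names-only listing filtered by a pattern.
    request = tool_call("list_datasets", detailed=False, search="sales")
    print(json.dumps(request, indent=2))
```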

💡 Usage Examples

📊 SQL Query Example

-- Query public datasets
SELECT
    EXTRACT(YEAR FROM pickup_datetime) as year,
    COUNT(*) as trips,
    ROUND(AVG(fare_amount), 2) as avg_fare
FROM `bigquery-public-data.new_york_taxi_trips.tlc_yellow_trips_2020`
-- Half-open range so trips on Dec 31 after midnight are included
WHERE pickup_datetime >= '2020-01-01'
  AND pickup_datetime < '2021-01-01'
GROUP BY year
ORDER BY year

🤖 Example: Usage with Claude Code subagent

Scenario: Use the specialized BigQuery Table Analyst agent in Claude Code to explore your data warehouse, analyze table relationships, and provide structured insights. Running the analysis in a subagent keeps the table-exploration context out of the main thread; only the actionable insights are returned to the main agent for writing SQL or further analysis.

Setup:

# 1. Clone and configure
git clone https://github.com/pvoo/bigquery-mcp.git
cd bigquery-mcp

# 2. Setup environment
export GCP_PROJECT_ID="your-project-id"
export BIGQUERY_LOCATION="US"
gcloud auth application-default login

# 3. Launch Claude Code
claude

Example Usage:

💬 You: "I need to understand our sales data structure and find tables related to customer orders"

🤖 Claude: I'll use the BigQuery Table Analyst agent to explore your sales datasets and identify relevant tables with their relationships.

[Agent automatically:]
- Lists all datasets to identify sales-related ones
- Explores table schemas with detailed metadata
- Shows actual sample data from key tables
- Discovers join relationships between tables
- Provides ready-to-use SQL queries

What the Agent Returns:

  • Table schemas with column descriptions and types
  • Sample data showing actual values (not placeholders)
  • Join relationships with working SQL examples
  • Data quality insights (null rates, freshness, etc.)
  • Actionable SQL queries you can immediately execute
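The null-rate and fill-rate figures mentioned above are simple ratios. The sketch below computes them from a list of sample rows; `fill_rates` is a hypothetical helper for illustration, and how the server itself derives these numbers is not shown in this document.

```python
from collections import Counter


def fill_rates(rows: list[dict]) -> dict[str, float]:
    """Return, per column, the fraction of rows holding a non-null value."""
    if not rows:
        return {}
    # Collect every column seen in any row; a missing key counts as null.
    columns = sorted({key for row in rows for key in row})
    filled: Counter = Counter()
    for row in rows:
        for column in columns:
            if row.get(column) is not None:
                filled[column] += 1
    return {column: filled[column] / len(rows) for column in columns}


if __name__ == "__main__":
    sample = [
        {"order_id": 1, "coupon": None},
        {"order_id": 2, "coupon": "SPRING"},
    ]
    print(fill_rates(sample))
```

A column with a fill rate near zero is rarely useful for joins or filters, which is why the figure helps an agent judge relevance quickly.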

🤝 Contributing

We welcome contributions and look forward to your feedback and suggestions for improvement.

Quick Start:

# Fork on GitHub, then:
git clone https://github.com/yourusername/bigquery-mcp.git
cd bigquery-mcp
make install  # Setup dev environment
make check    # Verify everything works

# Make changes, then:
make test     # Run tests
make check    # Quality checks
# Submit PR!

Development Guidelines:

  • Add tests for new features
  • Update documentation
  • Follow existing code style (enforced by pre-commit hooks)
  • Ensure all quality checks pass

Found an issue or have a feature request?


🌟 Star this repo if it helps you!
