BigQuery MCP server optimized for quick navigation of larger projects and datasets.

🗂️ BigQuery MCP Server

A practical MCP server that lets LLMs navigate BigQuery datasets and tables. It is designed for larger projects with many datasets and tables, and optimized to keep LLM context small while staying fast and safe.

Minimal by default: lists dataset and table names only; fetches details only when asked
Navigate larger projects: filter by name, request detailed metadata and schemas on demand
Quick table insight: optional schema, column descriptions, and fill rate to help an agent decide relevance fast
Safe to run: read-only query execution with guardrails (SELECT/WITH only, comment stripping)
Supports vector search: use BigQuery as your vector store. See the Vector Search section for full setup instructions.

Quick Start

Prerequisites: Python 3.10+ and uv package manager

🚀 Quick Setup

Option 1: Direct from PyPI (Recommended)

# 1. Authenticate
gcloud auth application-default login

# 2. Run server
uvx bigquery-mcp --project YOUR_PROJECT --location US

Option 2: Clone locally (development setup)

# 1. Clone and setup
git clone https://github.com/pvoo/bigquery-mcp.git
cd bigquery-mcp

# 2. Configure environment
cp .env.example .env
# Edit .env with your project and location

# 3. Run or inspect
make run      # Start server
make inspect  # Open MCP inspector

🔧 MCP Client Configuration

Option 1: PyPI package (Recommended)

Simplest setup using the published PyPI package:

{
  "mcpServers": {
    "bigquery": {
      "command": "uvx",
      "args": [
        "bigquery-mcp",
        "--project", "your-project-id",
        "--location", "US"
      ]
    }
  }
}

Option 2: Local clone (for development)

# Clone first
git clone https://github.com/pvoo/bigquery-mcp.git

{
  "mcpServers": {
    "bigquery": {
      "command": "uv",
      "args": ["--directory", "/absolute/path/to/bigquery-mcp", "run", "bigquery-mcp"],
      "env": {
        "GCP_PROJECT_ID": "your-project-id",
        "BIGQUERY_LOCATION": "US"
      }
    }
  }
}

🧪 Test Your Setup

# Test with MCP inspector
npx @modelcontextprotocol/inspector uvx bigquery-mcp --project YOUR_PROJECT --location US

🔧 Configuration Options

All configuration can be set via CLI arguments or environment variables. CLI arguments take precedence.

Required Parameters

--project YOUR_PROJECT    # Google Cloud project ID
--location US             # BigQuery location (US, EU, etc.)

Optional Parameters

# Dataset Access Control
--datasets dataset1 dataset2    # Restrict to specific datasets (default: all datasets)

# Query & Result Limits
--list-max-results 500          # Max results for basic list operations (default: 500)
--detailed-list-max 25          # Max results for detailed list operations (default: 25)

# Table Analysis
--sample-rows 3                 # Sample data rows returned in get_table (default: 3)
--stats-sample-size 500         # Rows sampled for column fill rate calculations (default: 500)

# Authentication
--key-file /path/to/key.json    # Service account key file (default: ADC)
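
The fill rate controlled by `--stats-sample-size` can be read as the share of non-null values per column within the sampled rows. The sketch below illustrates that idea only; it is an assumption about what "fill rate" means here, not the server's actual code, and `fill_rates` is a hypothetical helper name.

```python
from typing import Any


def fill_rates(rows: list[dict[str, Any]]) -> dict[str, float]:
    """Return the fraction of non-null values per column across the sampled rows."""
    if not rows:
        return {}
    # Collect every column name seen in any row, so sparse rows are handled.
    columns = {key for row in rows for key in row}
    total = len(rows)
    return {
        col: sum(1 for row in rows if row.get(col) is not None) / total
        for col in sorted(columns)
    }


# Hypothetical sample standing in for rows fetched from a table.
sample = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": None},
    {"id": 3, "email": None},
    {"id": 4, "email": "d@example.com"},
]
rates = fill_rates(sample)
```

A larger sample gives a more stable estimate at the cost of scanning more rows, which is presumably why the sample size is configurable.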

Environment Variables

All CLI options have corresponding environment variables:

export GCP_PROJECT_ID=your-project
export BIGQUERY_LOCATION=US
export BIGQUERY_ALLOWED_DATASETS=dataset1,dataset2
export BIGQUERY_LIST_MAX_RESULTS=500
export BIGQUERY_LIST_MAX_RESULTS_DETAILED=25
export BIGQUERY_SAMPLE_ROWS=3
export BIGQUERY_SAMPLE_ROWS_FOR_STATS=500
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/key.json

Vector Search Configuration

See Vector Search section for full setup instructions.

--embedding-model project.dataset.model
--embedding-tables dataset.table1 dataset.table2
--distance-type COSINE

🛠️ Tools Overview

This MCP server provides 5 BigQuery tools optimized for LLM efficiency:

📊 Smart Dataset & Table Discovery

list_datasets - Dual mode: basic (names only) vs detailed (full metadata)
list_tables - Context-aware table browsing with optional schema details
get_table - Complete table analysis with schema and sample data

🔍 Safe Query Execution

run_query - Execute SELECT/WITH queries only, with cost tracking and safety validation. Use LIMIT clause in queries to control result size.
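
To make the guardrail idea concrete, here is a minimal sketch of a SELECT/WITH-only check with comment stripping. It is an assumption about how such a check could work, not the server's actual validation logic, and the function names are hypothetical. It does not parse string literals, so it errs on the side of rejecting queries.

```python
import re

# Hypothetical guardrail for illustration; not the server's implementation.
_BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)
_LINE_COMMENT = re.compile(r"(--|#)[^\n]*")
_ALLOWED_START = re.compile(r"(select|with)\b", re.IGNORECASE)


def strip_comments(sql: str) -> str:
    """Remove /* ... */ blocks and -- or # line comments (string literals are not parsed)."""
    without_blocks = _BLOCK_COMMENT.sub(" ", sql)
    return _LINE_COMMENT.sub(" ", without_blocks)


def is_read_only(sql: str) -> bool:
    """Accept a single statement whose first keyword is SELECT or WITH."""
    cleaned = strip_comments(sql).strip().rstrip(";").strip()
    if not cleaned or ";" in cleaned:
        return False  # empty input, or more than one statement
    return _ALLOWED_START.match(cleaned) is not None
```

Stripping comments before the keyword check prevents a write statement from hiding behind a leading comment that happens to contain SELECT.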

🔮 Vector Search (Optional)

vector_search - Dual-mode tool: discover embedding tables (no query_text) or perform semantic similarity search (with query_text)

Key Features:

✅ Minimal by default - 70% fewer tokens in basic mode
✅ Safe queries only - Blocks all write operations
✅ LLM-optimized - Returns structured data perfect for AI analysis
✅ Cost transparent - Shows bytes processed for each query

🔮 Vector Search (Optional)

Enable semantic similarity search using BigQuery vector embeddings.

Prerequisites: Setting Up Embeddings in BigQuery

Before using vector search, you need an embedding model and tables with embeddings:

Step 1: Create a Vertex AI connection (one-time setup)

# Run with the bq command-line tool
# This creates a Cloud resource connection that BigQuery uses to call Vertex AI
bq mk --connection \
  --project_id=your-project \
  --location=your-region \
  --connection_type=CLOUD_RESOURCE \
  vertex-ai

# Show the connection's service account, which needs the Vertex AI User role
bq show --connection your-project.your-region.vertex-ai

Step 2: Create the embedding model

CREATE OR REPLACE MODEL `your-project.your_dataset.text_embedding_model`
REMOTE WITH CONNECTION `your-project.your-region.vertex-ai`
OPTIONS (ENDPOINT = 'text-embedding-005');

Step 3: Add embeddings to your table

-- Add embedding column to existing table
ALTER TABLE `your-project.your_dataset.products`
ADD COLUMN IF NOT EXISTS embedding ARRAY<FLOAT64>;

-- Generate embeddings for your text data
UPDATE `your-project.your_dataset.products` t
SET embedding = (
  SELECT ml_generate_embedding_result
  FROM ML.GENERATE_EMBEDDING(
    MODEL `your-project.your_dataset.text_embedding_model`,
    (SELECT t.name AS content),
    STRUCT(TRUE AS flatten_json_output)
  )
)
WHERE embedding IS NULL;

See BigQuery text embeddings documentation for detailed setup instructions and connection permissions.

MCP Configuration for Vector Search

Once you have embeddings set up, configure the MCP server:

{
  "mcpServers": {
    "bigquery": {
      "command": "uvx",
      "args": [
        "bigquery-mcp",
        "--project", "your-project",
        "--location", "US",
        "--embedding-model", "your-project.your_dataset.text_embedding_model",
        "--embedding-tables", "your_dataset.products", "your_dataset.documents"
      ]
    }
  }
}

Configuration Reference

CLI Argument	Environment Variable	Default	Description
`--embedding-model`	`BIGQUERY_EMBEDDING_MODEL`	-	Required. Full path to embedding model (`project.dataset.model`). Validated on startup.
`--embedding-tables`	`BIGQUERY_EMBEDDING_TABLES`	-	Tables with embedding columns (skips auto-discovery)
`--vector-column-contains`	`BIGQUERY_EMBEDDING_COLUMN_CONTAINS`	`embedding`	Pattern for finding embedding columns (column name must contain this)
`--distance-type`	`BIGQUERY_DISTANCE_TYPE`	`COSINE`	Distance metric: `COSINE`, `EUCLIDEAN`, `DOT_PRODUCT`
`--no-vector-search`	`BIGQUERY_VECTOR_SEARCH_ENABLED=false`	enabled	Disable vector search tools
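
For readers choosing a `--distance-type`, the sketch below computes the three underlying measures in plain Python. It only illustrates the math; the exact sign and scaling conventions BigQuery applies when reporting a distance (for example for `DOT_PRODUCT`) are not covered by this document and should be checked against the BigQuery documentation.

```python
import math
from typing import Sequence


def dot_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of element-wise products; larger means more aligned vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 minus cosine similarity; ignores vector length, 0 means same direction."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot_product(a, b) / (norm_a * norm_b)


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance; sensitive to vector length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.dist(a, b)
```

Cosine distance is a reasonable default for text embeddings because it compares direction only, which matches the `COSINE` default in the table above.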

Usage Examples

Discovery mode - find tables with embeddings:

{
  "query_text": ""
}

Search mode - semantic similarity search:

{
  "query_text": "solenoid valve for water",
  "table_path": "my_dataset.products",
  "top_k": "10",
  "select_columns": "name,description,price"
}
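
As a rough picture of what a search-mode call could translate to, the sketch below builds a BigQuery `VECTOR_SEARCH` query from the same inputs. This is an assumption for illustration: the SQL the server actually generates is not shown in this document, `build_vector_search_sql` is a hypothetical helper, and the query text is left as a `@query_text` parameter to be bound by the caller.

```python
import re

# Conservative allow-list for identifiers that are interpolated into the SQL text.
_IDENTIFIER = re.compile(r"[A-Za-z0-9_.\-]+")
_DISTANCE_TYPES = {"COSINE", "EUCLIDEAN", "DOT_PRODUCT"}


def build_vector_search_sql(
    table_path: str,
    model_path: str,
    select_columns: list[str],
    top_k: int = 10,
    embedding_column: str = "embedding",
    distance_type: str = "COSINE",
) -> str:
    """Build a VECTOR_SEARCH query; the search text is passed as @query_text."""
    for name in (table_path, model_path, embedding_column, *select_columns):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    if distance_type not in _DISTANCE_TYPES:
        raise ValueError(f"unsupported distance type: {distance_type!r}")
    columns = ", ".join(f"base.{col}" for col in select_columns)
    return (
        f"SELECT {columns}, distance\n"
        f"FROM VECTOR_SEARCH(\n"
        f"  TABLE `{table_path}`,\n"
        f"  '{embedding_column}',\n"
        f"  (\n"
        f"    SELECT ml_generate_embedding_result AS {embedding_column}\n"
        f"    FROM ML.GENERATE_EMBEDDING(\n"
        f"      MODEL `{model_path}`,\n"
        f"      (SELECT @query_text AS content),\n"
        f"      STRUCT(TRUE AS flatten_json_output)\n"
        f"    )\n"
        f"  ),\n"
        f"  top_k => {int(top_k)},\n"
        f"  distance_type => '{distance_type}'\n"
        f")\n"
        f"ORDER BY distance"
    )


sql = build_vector_search_sql(
    "my_dataset.products",
    "your-project.your_dataset.text_embedding_model",
    ["name", "description", "price"],
)
```

Keeping the user's text in a query parameter rather than in the SQL string avoids quoting problems and injection through the search text.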

Required Permissions

Role	Purpose
`roles/bigquery.dataViewer`	Read tables and models
`roles/bigquery.jobUser`	Run BigQuery jobs
`roles/bigquery.metadataViewer`	Auto-discover embedding tables (optional)

🏗️ Development Setup

Local Development

# Clone and setup
git clone https://github.com/pvoo/bigquery-mcp.git
cd bigquery-mcp
make install  # Setup environment + pre-commit hooks

# Development workflow
make run      # Start server
make test     # Run test suite
make check    # Lint + format + typecheck
make inspect  # Launch MCP inspector

Testing & Quality

make test                    # Full test suite
pytest tests/test_safety.py  # SQL safety validation tests
pytest tests/test_server.py  # Core server functionality tests
make check                   # Run all quality checks

🔐 Authentication & Permissions

Authentication Methods:

Application Default Credentials (recommended): gcloud auth application-default login
Service Account Key: Use --key-file or set GOOGLE_APPLICATION_CREDENTIALS

Required BigQuery Permissions:

bigquery.datasets.get, bigquery.datasets.list
bigquery.tables.list, bigquery.tables.get
bigquery.jobs.create, bigquery.tables.getData

🚨 Troubleshooting

Authentication Issues:

# Check current auth
gcloud auth application-default print-access-token

# Re-authenticate
gcloud auth application-default login

# Enable BigQuery API
gcloud services enable bigquery.googleapis.com

MCP Connection Issues:

Ensure absolute paths in MCP config
Test server manually: make run
Check that project and location environment variables or args are set correctly
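
The absolute-path check can be automated. The sketch below scans an MCP client config for `--directory` arguments that are not absolute. It assumes POSIX-style paths and the `mcpServers` layout shown earlier; `relative_directory_args` is a hypothetical helper, not part of this package.

```python
import json
from pathlib import PurePosixPath


def relative_directory_args(config_text: str) -> list[str]:
    """Return '<server>: <path>' for every --directory value that is not absolute."""
    config = json.loads(config_text)
    problems = []
    for name, server in config.get("mcpServers", {}).items():
        args = server.get("args", [])
        # Pair each argument with the one that follows it to find flag values.
        for flag, value in zip(args, args[1:]):
            if flag == "--directory" and not PurePosixPath(value).is_absolute():
                problems.append(f"{name}: {value}")
    return problems


bad_config = json.dumps(
    {
        "mcpServers": {
            "bigquery": {
                "command": "uv",
                "args": ["--directory", "./bigquery-mcp", "run", "bigquery-mcp"],
            }
        }
    }
)
problems = relative_directory_args(bad_config)
```

An empty result means every `--directory` value in the config is absolute.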

Performance Issues:

Use {"detailed": false} for faster responses
Add search filters: {"search": "pattern"}
Reduce max_results for large datasets
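
The effect of a search filter combined with a result cap can be sketched as follows. The case-insensitive substring match is an assumption made for illustration, since the server's exact matching rules are not described here, and `list_names` is a hypothetical helper.

```python
from typing import Optional


def list_names(
    names: list[str], search: Optional[str] = None, max_results: int = 500
) -> list[str]:
    """Filter names by a case-insensitive substring, then cap the result size."""
    if search is None:
        matches = list(names)
    else:
        needle = search.lower()
        matches = [name for name in names if needle in name.lower()]
    return matches[:max_results]


# Hypothetical table names standing in for a large dataset listing.
tables = ["orders_2023", "orders_2024", "customers", "Order_Items", "events"]
filtered = list_names(tables, search="order", max_results=2)
```

Filtering before truncating matters: a cap applied first could cut off the tables that match the pattern.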

💡 Usage Examples

📊 SQL Query Example

-- Query public datasets
SELECT
    EXTRACT(YEAR FROM pickup_datetime) AS year,
    COUNT(*) AS trips,
    ROUND(AVG(fare_amount), 2) AS avg_fare
FROM `bigquery-public-data.new_york_taxi_trips.tlc_yellow_trips_2020`
-- Half-open range so trips later in the day on Dec 31 are included
WHERE pickup_datetime >= '2020-01-01' AND pickup_datetime < '2021-01-01'
GROUP BY year
ORDER BY year
LIMIT 20

🤖 Example: Usage with Claude Code subagent

Scenario: Use the specialized BigQuery Table Analyst agent in Claude Code to automatically explore your data warehouse, analyze table relationships, and provide structured insights. Running the analysis in a subagent keeps the table-exploration context out of the main thread; only the actionable insights are returned to the main agent thread for writing SQL or further analysis.

Setup:

# 1. Clone and configure
git clone https://github.com/pvoo/bigquery-mcp.git
cd bigquery-mcp

# 2. Setup environment
export GCP_PROJECT_ID="your-project-id"
export BIGQUERY_LOCATION="US"
gcloud auth application-default login

# 3. Launch Claude Code
claude

Example Usage:

💬 You: "I need to understand our sales data structure and find tables related to customer orders"

🤖 Claude: I'll use the BigQuery Table Analyst agent to explore your sales datasets and identify relevant tables with their relationships.

[Agent automatically:]
- Lists all datasets to identify sales-related ones
- Explores table schemas with detailed metadata
- Shows actual sample data from key tables
- Discovers join relationships between tables
- Provides ready-to-use SQL queries

What the Agent Returns:

Table schemas with column descriptions and types
Sample data showing actual values (not placeholders)
Join relationships with working SQL examples
Data quality insights (null rates, freshness, etc.)
Actionable SQL queries you can immediately execute

🤝 Contributing

We welcome contributions! Looking forward to your feedback for improvements.

Quick Start:

# Fork on GitHub, then:
git clone https://github.com/yourusername/bigquery-mcp.git
cd bigquery-mcp
make install  # Setup dev environment
make check    # Verify everything works

# Make changes, then:
make test     # Run tests
make check    # Quality checks
# Submit PR!

Development Guidelines:

Add tests for new features
Update documentation
Follow existing code style (enforced by pre-commit hooks)
Ensure all quality checks pass

Found an issue or have a feature request?

🐛 Bug reports: Open an issue
🔧 Code improvements: Submit a pull request
📖 Documentation: See CONTRIBUTING.md

🌟 Star this repo if it helps you!

Project details

Maintainers

pvoo

Release history

Version	Release date
0.1.4 (this version)	Dec 11, 2025
0.1.3	Aug 15, 2025
0.1.2	Aug 14, 2025
0.1.1	Aug 14, 2025

Download files

Source Distribution

bigquery_mcp-0.1.4.tar.gz (140.6 kB), uploaded Dec 11, 2025, Source

Built Distribution

bigquery_mcp-0.1.4-py3-none-any.whl (23.9 kB), uploaded Dec 11, 2025, Python 3

File details

Details for the file bigquery_mcp-0.1.4.tar.gz.

File metadata

Download URL: bigquery_mcp-0.1.4.tar.gz
Upload date: Dec 11, 2025
Size: 140.6 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for bigquery_mcp-0.1.4.tar.gz
Algorithm	Hash digest
SHA256	`9451242dc4ce959aa6ada5280e5dfbe5b8d9a3f07dc7ee736294807f3212ace8`
MD5	`b04890a0384fe9c2cf0fba256a5bd042`
BLAKE2b-256	`7674a331293e545ba8c3dd6f1f09e3e46acf6a0a6f1c4a2eb5c6273f3020fa7e`

See more details on using hashes here.
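
A downloaded file can be checked against the SHA256 digest above with the Python standard library. This is a minimal sketch; `sha256_of` and `matches_digest` are hypothetical helper names, and the file path is whatever location you downloaded to.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path: Path, expected_hex: str) -> bool:
    """Compare the file's digest with a published one, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches_digest(Path("bigquery_mcp-0.1.4.tar.gz"), "9451242dc4ce959aa6ada5280e5dfbe5b8d9a3f07dc7ee736294807f3212ace8")` should return `True` for an unmodified copy of the source distribution.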

Provenance

The following attestation bundles were made for bigquery_mcp-0.1.4.tar.gz:

Publisher: release.yml on pvoo/bigquery-mcp

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: bigquery_mcp-0.1.4.tar.gz
- Subject digest: 9451242dc4ce959aa6ada5280e5dfbe5b8d9a3f07dc7ee736294807f3212ace8
- Sigstore transparency entry: 760316194
- Sigstore integration time: Dec 11, 2025
Source repository:
- Permalink: pvoo/bigquery-mcp@37009651f06d3dba7256e8ae1499533560091c9a
- Branch / Tag: refs/tags/v0.1.4
- Owner: https://github.com/pvoo
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@37009651f06d3dba7256e8ae1499533560091c9a
- Trigger Event: release

File details

Details for the file bigquery_mcp-0.1.4-py3-none-any.whl.

File metadata

Download URL: bigquery_mcp-0.1.4-py3-none-any.whl
Upload date: Dec 11, 2025
Size: 23.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for bigquery_mcp-0.1.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`74093e01a62675c1336b633a6f5d81a2414a6ca042273fb346ef34e5b514c78c`
MD5	`08d96b33e10c6cf60c85f053c2801938`
BLAKE2b-256	`8d11d7c7e498884f8e9ddf2496e7978098db12cfacde4b2dc7d44bd1b2e5cd7e`

See more details on using hashes here.

Provenance

The following attestation bundles were made for bigquery_mcp-0.1.4-py3-none-any.whl:

Publisher: release.yml on pvoo/bigquery-mcp

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: bigquery_mcp-0.1.4-py3-none-any.whl
- Subject digest: 74093e01a62675c1336b633a6f5d81a2414a6ca042273fb346ef34e5b514c78c
- Sigstore transparency entry: 760316198
- Sigstore integration time: Dec 11, 2025
Source repository:
- Permalink: pvoo/bigquery-mcp@37009651f06d3dba7256e8ae1499533560091c9a
- Branch / Tag: refs/tags/v0.1.4
- Owner: https://github.com/pvoo
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@37009651f06d3dba7256e8ae1499533560091c9a
- Trigger Event: release
