MCP Server for interacting with Statistics Canada Web Data Services API

📊 Statistics Canada API MCP Server

Python 3.10+ License: MIT MCP GitHub

📝 Description

This project implements a Model Context Protocol (MCP) server that provides tools for interacting with Statistics Canada (StatCan) data APIs. It allows LLMs or other MCP clients to access and retrieve Canadian statistical data in a structured way.

Please note: LLMs may fabricate data. The current MCP integration is limited to basic data discovery and to data exploration using a SQLite database. Always verify results against the original tables, and keep track of the actions you allow an LLM to perform (downloading, updating, or deleting data, or checking your PC's memory).

The server is built using the FastMCP library and interacts with the StatCan Web Data Service via httpx.

💬 Claude Chat Examples

| Dataset | Query Example | Demo | Data Source |
| --- | --- | --- | --- |
| Canada's Greenhouse Gas Emissions (2018-2022) | "Hey Claude! Can you please create a simple visualization for greenhouse emissions for Canada as a whole over the last 4 years?" | View Demo | StatCan Table |
| Canada's International Trade in Services | "Hey Claude, can you create a quick analysis for international trade in services for the last 6 months. Create a visualization with key figures please!" | View Demo | StatCan Table |
| Ontario Building Construction Price Index | "Hey Claude! Can you please generate a visualization for Ontario's Building Price index from Q4 2023 to Q4 2024. Thanks!" | View Demo | StatCan Table |

Effective Querying Tips

To get the most accurate results from Claude when using this Statistics Canada MCP server:

  • Be Specific: Use precise, well-formed requests with exact details about the data you need
  • Provide Context: Clearly specify tables, vectors, time periods, and geographical areas
  • Avoid Typos: Double-check spelling of statistical terms and place names
  • Structured Questions: Break complex queries into clear, logical steps
  • Verify Results: Always cross-check important data against official Statistics Canada sources

⚠️ Warning: LLMs like Claude may occasionally create mock visualizations or fabricate data when unable to retrieve actual information. They might also generate responses with data not available in Statistics Canada to satisfy queries. Always verify results against official sources.

✨ Features

This server exposes StatCan API functionalities as MCP tools, including:

API Functionality

Cube Operations:

  • Listing all available data cubes/tables (full and lite versions)
  • Searching cubes by title
  • Retrieving detailed cube metadata
  • Getting data for the latest N periods based on ProductId and Coordinate
  • Getting series info based on ProductId and Coordinate
  • Getting changed series data based on ProductId and Coordinate
  • Listing cubes changed on a specific date
  • Providing download links for full cubes (CSV/SDMX) (Discouraged)

Vector Operations:

  • Retrieving series metadata by Vector ID
  • Getting data for the latest N periods by Vector ID
  • Getting data for multiple vectors by reference period range
  • Getting bulk data for multiple vectors by release date range
  • Getting changed series data by Vector ID
  • Listing series changed on a specific date
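
As a rough illustration of what a tool such as "latest N periods by Vector ID" does underneath, the sketch below builds (but does not send) the kind of POST request the StatCan Web Data Service accepts. The base URL, the method name getDataFromVectorsAndLatestNPeriods, and the payload field names are assumptions taken from the public WDS conventions, not from this project's source, and the helper build_latest_n_request is hypothetical. It uses only the standard library so it runs without dependencies; the server itself uses httpx.

```python
import json
import urllib.request

# Assumed WDS base URL; confirm it against the official WDS user guide.
WDS_BASE_URL = "https://www150.statcan.gc.ca/t1/wds/rest"


def build_latest_n_request(vector_ids: list[int], latest_n: int) -> urllib.request.Request:
    """Build (without sending) a POST request for the latest N periods of each vector.

    Hypothetical helper for illustration; it is not part of this project.
    """
    if latest_n < 1:
        raise ValueError("latest_n must be at least 1")
    # Assumed payload shape: a JSON array with one object per requested series.
    payload = [{"vectorId": vid, "latestN": latest_n} for vid in vector_ids]
    return urllib.request.Request(
        url=f"{WDS_BASE_URL}/getDataFromVectorsAndLatestNPeriods",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# The vector IDs are placeholders, not real series.
request = build_latest_n_request([12345, 67890], latest_n=5)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Putting several vectors in one payload is what lets a single call replace a loop of one-series requests.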

Database Functionality

The server automatically creates a SQLite database (temp_statcan_data.db) for:

  • Creating tables from API data
  • Inserting data into tables
  • Querying the database with SQL
  • Viewing table schemas and listing available tables

This allows for persistent storage of retrieved data and more complex data manipulation through SQL.
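
The create, insert, and query steps listed above can be sketched with Python's built-in sqlite3 module. This is a minimal sketch under stated assumptions, not the project's implementation: the helpers create_table and insert_rows, the column names, and the sample values are all hypothetical, and an in-memory database stands in for temp_statcan_data.db.

```python
import sqlite3

# Hypothetical rows shaped like flattened data points; the values are made up.
rows = [
    {"vector_id": 12345, "ref_period": "2020-01-01", "value": 1.5},
    {"vector_id": 12345, "ref_period": "2020-02-01", "value": 2.5},
]

# Map Python value types to SQLite column types; anything else is stored as TEXT.
_SQL_TYPES = {int: "INTEGER", float: "REAL", str: "TEXT"}


def create_table(conn: sqlite3.Connection, table: str, sample: dict) -> None:
    """Create an empty table whose columns are inferred from one sample row."""
    columns = ", ".join(
        f'"{name}" {_SQL_TYPES.get(type(value), "TEXT")}' for name, value in sample.items()
    )
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({columns})')


def insert_rows(conn: sqlite3.Connection, table: str, data: list[dict]) -> None:
    """Insert rows with named placeholders so values are never spliced into the SQL."""
    names = list(data[0])
    column_list = ", ".join(f'"{name}"' for name in names)
    placeholders = ", ".join(f":{name}" for name in names)
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            f'INSERT INTO "{table}" ({column_list}) VALUES ({placeholders})', data
        )


conn = sqlite3.connect(":memory:")
create_table(conn, "employment_data", rows[0])
# Creating the table does not populate it.
empty_count = conn.execute("SELECT COUNT(*) FROM employment_data").fetchone()[0]
insert_rows(conn, "employment_data", rows)
row_count = conn.execute("SELECT COUNT(*) FROM employment_data").fetchone()[0]
total = conn.execute("SELECT SUM(value) FROM employment_data").fetchone()[0]
print(empty_count, row_count, total)
```

Table and column names are quoted here but not validated, so a real tool would still need to check them before building SQL.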

(Refer to the specific tool functions within src/api/ for detailed parameters and return types.)

🏗️ Project Structure

  • src/: Contains the main source code for the MCP server.
      • api/: Defines the MCP tools wrapping the StatCan API calls (cube_tools.py, vector_tools.py, metadata_tools.py).
      • db/: Handles database interactions, including connection, schema, and queries.
      • models/: Contains Pydantic models for API request/response validation and database representation.
      • util/: Utility functions (e.g., coordinate padding).
      • config.py: Configuration loading (e.g., database credentials, API base URL).
      • server.py: Main FastMCP server definition and tool registration.
      • __init__.py: Package initialization for src.
  • pyproject.toml: Project dependency and build configuration.
  • .env: (Assumed) Used for storing sensitive configuration like database credentials, loaded by src/config.py.
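
The "coordinate padding" utility mentioned under util/ is not shown in this README. The sketch below illustrates the idea under the assumption that the StatCan API expects coordinates as ten dot-separated positions, with unused trailing dimensions set to 0. The function name pad_coordinate and its exact validation rules are hypothetical.

```python
def pad_coordinate(coordinate: str, dimensions: int = 10) -> str:
    """Pad a dot-separated coordinate with trailing zeros up to `dimensions` positions.

    Hypothetical helper: the ten-position default is an assumption about the
    StatCan API, not something taken from this project's source.
    """
    parts = coordinate.split(".")
    if not all(part.isdigit() for part in parts):
        raise ValueError(f"coordinate must contain only digits and dots: {coordinate!r}")
    if len(parts) > dimensions:
        raise ValueError(f"coordinate has more than {dimensions} positions: {coordinate!r}")
    # Fill the unused trailing dimensions with zeros.
    return ".".join(parts + ["0"] * (dimensions - len(parts)))


padded = pad_coordinate("1.12")
print(padded)
```

Normalizing coordinates in one place means an LLM can pass a short form such as "1.12" without each tool re-implementing the padding.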

📥 Installation Guide for Beginners

If you're new to Python or programming in general, follow these simple steps to get started:

  1. Install Python (version 3.10 or higher):
      • Download from python.org
      • On Windows, make sure to check "Add Python to PATH" during installation
  2. Install uv (a fast Python package installer):
# Open your Terminal (Mac/Linux) or Command Prompt (Windows) and run:
curl -fsSL https://astral.sh/uv/install.sh | bash
# Or on Windows:
# curl.exe -fsSL https://astral.sh/uv/install.ps1 -o install.ps1; powershell -ExecutionPolicy Bypass -File install.ps1
  3. Install the dependencies (fastmcp, httpx, and pydantic):
uv pip install fastmcp httpx pydantic
  4. Download this project:
git clone https://github.com/Aryan-Jhaveri/mcp-statcan.git
cd mcp-statcan

Tip: If you encounter any "module not found" errors, install the missing package with:

uv pip install package_name

🔧 Setting Up Claude Desktop Configuration

To integrate with Claude Desktop:

  1. Manually edit the generated config in your claude_desktop_config.json:

Navigate to: Claude Desktop App → Settings (⌘ + ,) → Developer → Edit Config

{
  "mcpServers": {
    "StatCanAPI_DB_Server": {
      "command": "uv",
      "args": [
        "run",
        "--with", "fastmcp",
        "--with", "httpx",
        "sh",
        "-c",
        "cd /path/to/mcp-statcan && python -m src.server"
      ]
    }
  }
}

Replace /path/to/mcp-statcan with the absolute path to your project directory. The manual edit is necessary to ensure the server runs with the correct working directory context for proper module resolution.
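
If you would rather generate the entry than type it by hand, a small script along these lines can print it for pasting into claude_desktop_config.json. It is only a sketch: the helper build_server_entry is hypothetical, and the path passed to it is the same placeholder used above. Shell-quoting the path is an addition that guards against directories with spaces in their names.

```python
import json
import shlex
from pathlib import Path


def build_server_entry(project_dir: str) -> dict:
    """Return an mcpServers entry that starts the server from `project_dir`.

    Hypothetical helper mirroring the configuration shown above.
    """
    project_path = Path(project_dir).resolve()  # make the path absolute
    # Quote the path so directories containing spaces still work inside `sh -c`.
    command = f"cd {shlex.quote(str(project_path))} && python -m src.server"
    return {
        "command": "uv",
        "args": ["run", "--with", "fastmcp", "--with", "httpx", "sh", "-c", command],
    }


# Placeholder path; replace it with your own project directory.
config = {"mcpServers": {"StatCanAPI_DB_Server": build_server_entry("/path/to/mcp-statcan")}}
print(json.dumps(config, indent=2))
```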

⚠️ Known Issues and Limitations

  • SSL Verification: Currently disabled for development. Should be enabled for production use.
  • LLM Defaults to One-by-One Fetching: LLMs tend to default to get_data_from_cube_pid_coord_and_latest_n_periods in a loop (one API call per data point) instead of using bulk vector tools like get_data_from_vector_by_reference_period_range which accept arrays of vector IDs. This is slower, wastes API calls, and increases the risk of the LLM fabricating numbers when it loses patience mid-loop. Best practice: Use bulk vector fetch → DB storage → SQL query.
  • create_table_from_data Does Not Insert Data: This tool only scaffolds the SQLite schema — it does not populate rows. You must follow up with a separate insert_data_into_table call. LLMs frequently assume the table is populated after creation, leading to empty query results and confusion.
  • Data Validation: Always cross-check your data with official Statistics Canada sources.
  • Security Concerns: Query validation is basic; avoid using with untrusted input.
  • Performance: Some endpoints may timeout with large data requests.
  • API Rate Limits: The StatCan API may impose rate limits that affect usage during high-demand periods.
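
The "bulk vector fetch" practice recommended above amounts to grouping vector IDs into a few batched requests instead of issuing one call per series. The sketch below shows only that grouping step. The function names, the batch size of 50, and the request field names are hypothetical, so check the actual input model of the bulk vector tools before relying on them.

```python
from typing import Iterator, Sequence


def chunked(items: Sequence[int], size: int) -> Iterator[list[int]]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def plan_bulk_requests(
    vector_ids: Sequence[int], start_date: str, end_date: str, batch_size: int = 50
) -> list[dict]:
    """Group vector IDs into batched request descriptions (field names are assumed)."""
    unique_ids = list(dict.fromkeys(vector_ids))  # drop duplicates, keep order
    return [
        {"vectorIds": batch, "startDate": start_date, "endDate": end_date}
        for batch in chunked(unique_ids, batch_size)
    ]


# Placeholder IDs: five unique vectors become three requests instead of five.
plan = plan_bulk_requests([1, 2, 3, 2, 4, 5], "2020-01-01", "2020-12-31", batch_size=2)
for request in plan:
    print(request)
```

Each batch result can then be stored in SQLite once and queried with SQL, rather than being re-fetched point by point.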

🚀 Usage Examples

API Operations

# Search for data tables about employment
tables = await search_cubes_by_title("employment")

# Get recent data points for a specific vector ID
data = await get_data_from_vectors_and_latest_n_periods(VectorLatestNInput(vectorId=12345, latestN=5))

# Get data for a specific range of periods
range_data = await get_data_from_vector_by_reference_period_range(
    VectorPeriodRangeInput(vectorId=12345, startDate="2020-01-01", endDate="2020-12-31")
)

Database Operations

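# Create the table schema from the API results (this does not insert any rows)
create_table_from_data(TableDataInput(table_name="employment_data", data=data))
# Populate the table with a separate insert_data_into_table call before querying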

# Query the database
result = query_database(QueryInput(sql_query="SELECT * FROM employment_data LIMIT 10"))

# List available tables
tables = list_tables()

# Get schema for a table
schema = get_table_schema(TableSchemaInput(table_name="employment_data"))
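
Because the server's query validation is basic, one defensive option when running SQL like the query above is to put the SQLite connection into read-only mode before executing anything an LLM wrote. The sketch below uses SQLite's query_only pragma for that. It is an illustration, not a description of how query_database is implemented. The table, the sample row, and the helper run_untrusted_query are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employment_data (ref_period TEXT, value REAL)")
conn.execute("INSERT INTO employment_data VALUES ('2020-01-01', 1.5)")  # made-up row
conn.commit()

# From here on, SQLite itself refuses statements that would modify the database.
conn.execute("PRAGMA query_only = ON")


def run_untrusted_query(sql: str):
    """Run one SQL statement; return its rows, or a message if SQLite rejects it."""
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.Error as exc:
        return f"rejected: {exc}"


read_result = run_untrusted_query("SELECT * FROM employment_data LIMIT 10")
write_result = run_untrusted_query("DROP TABLE employment_data")
print(read_result)
print(write_result)
```

Letting the database engine enforce read-only access is sturdier than inspecting the SQL text, although it does not limit how expensive a SELECT can be.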

Made with ❤️❤️❤️ for Statistics Canada
