MCP server for safe, validated, cost-aware access to Copernicus Earth observation data.

copernicus-mcp

copernicus-mcp is a Model Context Protocol (MCP) server that gives LLM agents and CLI users safe, validated, cost-aware, reproducible access to Copernicus Earth observation data. It exposes discovery, estimation and subset-download workflows as MCP tools, returns large scientific data as file descriptors (filepath + metadata + provenance) rather than inline bytes, and produces a deterministic provenance record for every retrieval.

Status

Iteration 1: Marine-first walking skeleton. This iteration ships the Copernicus Marine (CMEMS) backend through the official copernicusmarine toolbox. The Climate Data Store (CDS), Atmosphere Data Store (ADS), Early Warning Data Store (EWDS), Copernicus Data Space Ecosystem (CDSE), Sentinel Hub and WEkEO are planned for later iterations and are explicitly out of scope for Iteration 1.

Quick start

# 1. Create and activate a virtual environment.
python -m venv .venv && source .venv/bin/activate

# 2. Install the package with the CMEMS backend.
pip install "copernicus-mcp[cmems]"

# 3. Configure CMEMS credentials (free account at
#    https://data.marine.copernicus.eu/register).
#    Recommended — the toolbox writes the credentials file once:
copernicusmarine login
#    Alternative — environment variables in your shell profile:
# export COPERNICUSMARINE_SERVICE_USERNAME=your_user
# export COPERNICUSMARINE_SERVICE_PASSWORD=your_pass

# 4. Try a search from the terminal.
copernicus-mcp marine search-datasets --keyword temperature --limit 3

# 5. Run the MCP server (used by Claude Desktop / Claude Code / any
#    MCP-compatible client over stdio). See "Claude Desktop integration"
#    below.
copernicus-mcp serve

Features

Iteration 1 implements the full MCP-core infrastructure so that subsequent iterations can add backends as small, additive changes. Today the package provides:

  • Tools (CMEMS): marine_search_datasets, marine_describe_dataset, marine_estimate_subset, marine_subset_dataset, plus a copernicus_mcp_status diagnostic.
  • Resources: copernicus://datasets/cmems/{id}, copernicus://files/{cache_key}, copernicus://provenance/{record_id}.
  • CLI (Typer + Rich): copernicus-mcp {serve, version, status, marine ...} with a global --json flag for scripting.
  • Confirmation flow: large or approximate-estimate subsets gate on a structured confirmation prompt before any download.
  • Cache + provenance: each retrieval produces a sidecar JSON record with file MD5, software versions, request envelope, and a deterministic cache key.
  • Sanitisation: defence-in-depth redaction of credential-shaped strings on every outbound payload.
  • Structured errors: eleven canonical error classes with recovery hints (e.g. recovery_action="configure_credentials").
  • Cancellation discipline: asyncio.CancelledError propagates without being wrapped, per project invariant.
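The "deterministic cache key" in the list above can be illustrated with a minimal sketch. It assumes the key is a SHA-256 digest of a canonical JSON form of the request envelope; the package's actual derivation is not described here, and the function name and request fields below are hypothetical.

```python
import hashlib
import json


def cache_key(request: dict) -> str:
    """Return a stable hex digest for a request envelope (illustrative only)."""
    # Canonical JSON: sorted keys and fixed separators, so dict ordering
    # and whitespace cannot change the digest.
    canonical = json.dumps(request, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical request envelopes; these field names are not the package's schema.
a = {"dataset_id": "example-dataset", "bbox": [-10.0, 40.0, 0.0, 50.0]}
b = {"bbox": [-10.0, 40.0, 0.0, 50.0], "dataset_id": "example-dataset"}

# The same request expressed in a different key order maps to the same key.
print(cache_key(a) == cache_key(b))  # True
```

Canonicalising before hashing is what makes the key usable for reproducibility: two agents that build the same request in different ways still land on the same cache entry.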

Why this exists

LLM agents can already call APIs, but for scientific data three properties matter and are easy to lose:

  1. Reproducibility: a colleague who is handed the exact request gets the exact same file back tomorrow.
  2. Cost-awareness: multi-gigabyte downloads should be confirmed, not silently triggered by a fuzzy prompt.
  3. Credential isolation: credentials must never leak into tool output, logs, or provenance, regardless of the prompt or the upstream library's exception messages.

copernicus-mcp enforces all three at the protocol layer, so the agent does not need to.
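As a rough illustration of the credential-isolation property, the sketch below redacts credential-shaped strings from text before it leaves the process. The patterns and the redact function are assumptions made for this example and are deliberately simple; they are not the package's actual sanitisation rules.

```python
import re

# Illustrative patterns only; real rules would need to be broader.
# 1. NAME=value or NAME: value pairs whose name contains a secret-like word.
_KEY_VALUE = re.compile(
    r"(\w*(?:password|passwd|secret|token|api[_-]?key)\w*)(\s*[=:]\s*)(\S+)",
    re.IGNORECASE,
)
# 2. user:password embedded in a URL, e.g. https://user:pass@host/path.
_URL_USERINFO = re.compile(r"(://)([^/\s:@]+):([^/\s@]+)@")


def redact(text: str) -> str:
    """Replace credential-shaped substrings with a fixed marker."""
    text = _KEY_VALUE.sub(r"\1\2[REDACTED]", text)
    text = _URL_USERINFO.sub(r"\1[REDACTED]@", text)
    return text


print(redact("login failed: password=hunter2"))
# login failed: password=[REDACTED]
```

Applying a filter like this to every outbound payload, including wrapped exception messages from upstream libraries, is one way to keep the guarantee independent of what the prompt or the library does.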

Tool reference, in brief

  • marine_search_datasets (MCP tool) / copernicus-mcp marine search-datasets (CLI) — discover dataset ids by keyword, bbox, time range, or service type. Returns {datasets, total_count}.
  • marine_describe_dataset / marine describe DATASET_ID — full metadata for a single dataset: variables, axes, services, terms.
  • marine_estimate_subset / marine estimate ... — preview byte size and confirmation status for a subset request without downloading. Use this before large requests.
  • marine_subset_dataset / marine subset ... — download a spatio-temporal subset. Returns {filepath, uri, metadata, provenance} — never inline bytes. Large requests gate on a structured confirmation prompt.
  • copernicus_mcp_status / status — server diagnostics: backends, credential sources (without values), cache metrics, configuration snapshot.

For complete schemas, options and exit codes, run copernicus-mcp marine subset --help or read the inline tool descriptions surfaced by your MCP client (each tool's docstring is its protocol description).
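The estimate-then-confirm behaviour described for marine_estimate_subset and marine_subset_dataset can be sketched as a simple gate. The threshold value, the Estimate type and the function name are assumptions for illustration; the package's real limit and data model are not stated in this document.

```python
from dataclasses import dataclass

# Assumed threshold for illustration; the package's actual limit may differ.
CONFIRM_ABOVE_BYTES = 500 * 1024 * 1024


@dataclass(frozen=True)
class Estimate:
    size_bytes: int
    approximate: bool  # True when the size could not be computed exactly


def requires_confirmation(est: Estimate, threshold: int = CONFIRM_ABOVE_BYTES) -> bool:
    """Gate large or approximate estimates behind an explicit confirmation."""
    return est.approximate or est.size_bytes > threshold


print(requires_confirmation(Estimate(size_bytes=10_000, approximate=False)))  # False
print(requires_confirmation(Estimate(size_bytes=10_000, approximate=True)))   # True
```

Treating an approximate estimate as needing confirmation regardless of size errs on the side of asking: an unreliable number should not be allowed to wave a large download through.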

Claude Desktop integration

Add to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or the equivalent on your platform:

{
  "mcpServers": {
    "copernicus": {
      "command": "copernicus-mcp",
      "args": ["serve"]
    }
  }
}

Restart Claude Desktop. The five tools listed above become available to the assistant. Tool results that wrap large data return a filepath plus metadata and provenance — never inline bytes.
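If you prefer to script the edit rather than change the JSON by hand, a minimal sketch follows. It merges the copernicus entry shown above into an existing configuration without touching other servers or settings. The helper names are made up for this example, and the path you pass in depends on your platform.

```python
import json
from pathlib import Path


def add_copernicus_server(config: dict) -> dict:
    """Return a copy of the config with the copernicus entry added."""
    merged = dict(config)
    # Copy the servers mapping so the caller's dict is left unchanged.
    servers = dict(merged.get("mcpServers", {}))
    servers["copernicus"] = {"command": "copernicus-mcp", "args": ["serve"]}
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path) -> None:
    """Read, merge and rewrite the config file at `path`."""
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    updated = add_copernicus_server(config)
    path.write_text(json.dumps(updated, indent=2) + "\n", encoding="utf-8")


# Example call for macOS (not executed here):
# update_config_file(
#     Path.home() / "Library/Application Support/Claude/claude_desktop_config.json"
# )
```

Back up the file before rewriting it, since the script replaces its formatting.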

Credentials

copernicus-mcp resolves CMEMS credentials in the following order of precedence:

  1. Toolbox credentials file (recommended): ~/.copernicusmarine/.copernicusmarine-credentials. Created by running copernicusmarine login once. The same file is used by the official CLI and by copernicus-mcp, so you set it once and share it across tools.
  2. Environment variables in your shell profile: COPERNICUSMARINE_SERVICE_USERNAME and COPERNICUSMARINE_SERVICE_PASSWORD. Convenient on CI or in a project-local direnv setup.
  3. An env: {...} block inside claude_desktop_config.json (possible, but not recommended for the desktop client). The file is stored in plain text and is backed up by macOS / cloud sync, so credentials embedded there leave a wider trace than necessary.

Verify resolution: copernicus-mcp status --json | jq '.backends.cmems'. The output reports credential_source as config_file, env, or missing — the actual values are never printed.
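The precedence above can be sketched as follows. This is an illustration of the documented order, not the package's code: the function name is hypothetical, and the sketch only checks that the file exists rather than parsing it. An env block in claude_desktop_config.json reaches the process as ordinary environment variables, so it falls under the env case.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

CREDENTIALS_FILE = Path.home() / ".copernicusmarine" / ".copernicusmarine-credentials"


def credential_source(
    env: Optional[Mapping[str, str]] = None,
    credentials_file: Path = CREDENTIALS_FILE,
) -> str:
    """Report where credentials would come from, never their values."""
    env = os.environ if env is None else env
    # 1. The toolbox credentials file takes precedence.
    if credentials_file.is_file():
        return "config_file"
    # 2. Otherwise both environment variables must be set and non-empty.
    if env.get("COPERNICUSMARINE_SERVICE_USERNAME") and env.get(
        "COPERNICUSMARINE_SERVICE_PASSWORD"
    ):
        return "env"
    return "missing"


print(credential_source(env={}, credentials_file=Path("/nonexistent/credentials")))
# missing
```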

Configuration

The system is usable with no configuration file at all — every Pydantic field has a sensible default. Override via environment variables (COPERNICUS_MCP_LOG_LEVEL, COPERNICUS_MCP_CACHE_DIR, COPERNICUS_MCP_STATE_DB, plus COPERNICUS_MCP_<SECTION>__<FIELD> for nested fields), or with a YAML file at ~/.config/copernicus-mcp/config.yaml or ~/.copernicus-mcp.yaml.
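To make the COPERNICUS_MCP_<SECTION>__<FIELD> convention concrete, here is a minimal sketch of how such variables could map onto nested settings. It resembles the nested-delimiter style used by pydantic-settings, but it is a stand-alone illustration: lower-casing the names is an assumption, and the CACHE / MAX_SIZE_GB section and field are invented for the example.

```python
from typing import Mapping

PREFIX = "COPERNICUS_MCP_"


def env_overrides(env: Mapping[str, str]) -> dict:
    """Collect COPERNICUS_MCP_* variables into a nested dict of overrides."""
    overrides: dict = {}
    for name, value in env.items():
        if not name.startswith(PREFIX):
            continue
        # "__" separates a section from its field; names are lower-cased.
        parts = name[len(PREFIX):].lower().split("__")
        node = overrides
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return overrides


example_env = {
    "COPERNICUS_MCP_LOG_LEVEL": "DEBUG",
    "COPERNICUS_MCP_CACHE__MAX_SIZE_GB": "20",  # hypothetical nested field
    "HOME": "/home/user",  # ignored: no COPERNICUS_MCP_ prefix
}
print(env_overrides(example_env))
# {'log_level': 'DEBUG', 'cache': {'max_size_gb': '20'}}
```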

State directories: ~/.cache/copernicus-mcp/ (downloaded files + .provenance.json sidecars), ~/.local/state/copernicus-mcp/state.db (SQLite cache index, workflow rows, persisted provenance).
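Because each download has a provenance sidecar that records the file's MD5, a cached file can be re-checked later. The sketch below assumes the sidecar is a JSON object with the digest under a top-level "md5" key; the real field name and layout are not documented here, so adjust md5_field to match an actual record.

```python
import hashlib
import json
from pathlib import Path


def file_md5(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    # MD5 is used as an integrity checksum here, not for security.
    digest = hashlib.md5(usedforsecurity=False)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sidecar(data_file: Path, sidecar: Path, md5_field: str = "md5") -> bool:
    """Return True when the sidecar's recorded MD5 equals the file's MD5."""
    record = json.loads(sidecar.read_text(encoding="utf-8"))
    return record.get(md5_field) == file_md5(data_file)
```

A mismatch means the cached file changed after retrieval (or the sidecar belongs to a different file), in which case re-running the original request is the safer option.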

Troubleshooting

  • AuthError on tool call → run copernicus-mcp status and check backends.cmems.configured. If false, your env vars are not visible to the running process (common Claude Desktop pitfall — restart the client after editing config) or the credentials file is missing/unreadable.
  • CoverageUnavailableError → bbox or time range is outside the dataset's actual extent. Use marine_describe_dataset to inspect coverage and narrow the request.
  • ValidationError with recovery_action="modify_request_parameters" → request was structurally invalid (e.g. inverted bbox, antimeridian-crossing bbox, naive datetime). The next_action_hint field tells you exactly how to fix it.
  • Subset hangs → set COPERNICUS_MCP_LOG_LEVEL=DEBUG and watch for retry messages. Reduce the bbox or time range if the request is genuinely large.
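The structural problems named under ValidationError (inverted bbox, antimeridian-crossing bbox, naive datetime) can be caught client-side before a tool call. The following is a sketch under assumptions: the (west, south, east, north) ordering, the function name and the error messages are illustrative and are not the server's own validation.

```python
from datetime import datetime, timezone


def validate_request(bbox, start: datetime, end: datetime) -> None:
    """Raise ValueError for structurally invalid requests.

    bbox is (west, south, east, north) in degrees; this ordering is an assumption.
    """
    west, south, east, north = bbox
    if not (-90 <= south <= 90 and -90 <= north <= 90):
        raise ValueError("latitudes must lie within [-90, 90]")
    if south >= north:
        raise ValueError("inverted bbox: south must be less than north")
    if west >= east:
        # Also catches antimeridian-crossing boxes such as west=170, east=-170.
        raise ValueError("west must be less than east (antimeridian-crossing boxes are rejected)")
    for name, value in (("start", start), ("end", end)):
        if value.tzinfo is None or value.utcoffset() is None:
            raise ValueError(f"{name} is a naive datetime; attach a timezone such as UTC")
    if start > end:
        raise ValueError("start must not be after end")


# A well-formed request passes silently.
validate_request(
    (-10.0, 40.0, 0.0, 50.0),
    datetime(2024, 1, 1, tzinfo=timezone.utc),
    datetime(2024, 1, 31, tzinfo=timezone.utc),
)
```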

License

BSD 3-Clause. See LICENSE. Dependencies are licensed under EUPL-1.2 (copernicusmarine), Apache-2.0, MIT or BSD. Iteration 1 does not depend on sentinelhub-py; when the Sentinel Hub backend lands in a later iteration, this section will document the relevant CC BY-NC restriction on its SDK.

Acknowledgements
