
MCP server for safe, validated, cost-aware access to Copernicus Earth observation data.

copernicus-mcp

copernicus-mcp is a Model Context Protocol (MCP) server that gives LLM agents and CLI users safe, validated, cost-aware, reproducible access to Copernicus Earth observation data. It exposes discovery, estimation and subset-download workflows as MCP tools, returns large scientific data as file descriptors (filepath + metadata + provenance) rather than inline bytes, and produces a deterministic provenance record for every retrieval.

Status

Two backends shipped:

  • Copernicus Marine (CMEMS) through the official copernicusmarine toolbox — discovery, estimation, synchronous and async subset retrieval.
  • Climate Data Store family (CDS / ADS / EWDS) through cdsapi (single PAT works across all three stores) — discovery via a bundled catalogue snapshot, heuristic estimation, full async lifecycle (submit / poll / download / cancel), and T&C-not-accepted elicitation.

Copernicus Data Space Ecosystem (CDSE), Sentinel Hub and WEkEO are planned for subsequent iterations and are out of scope today.

Quick start

# 1. Create and activate a virtual environment.
python -m venv .venv && source .venv/bin/activate

# 2. Install the package with the backends you need.
pip install "copernicus-mcp[cmems,cds]"      # both backends
# pip install "copernicus-mcp[cmems]"        # CMEMS only
# pip install "copernicus-mcp[cds]"          # CDS / ADS / EWDS only

# 3. Configure credentials for the backend(s) you installed.
#    CMEMS (free account at https://data.marine.copernicus.eu/register):
copernicusmarine login
#    CDS / ADS / EWDS: a single PAT works across all three stores (free
#    account at https://cds.climate.copernicus.eu/):
# export CDSAPI_KEY=<your-uuid-pat>
#    or populate ~/.cdsapirc as the cdsapi client expects.

# 4. Try a search from the terminal.
copernicus-mcp marine search-datasets --keyword temperature --limit 3
# The `cds` subcommands require opting the backend in — see "Credentials"
# below; otherwise the call exits with `backend_not_configured`.
COPERNICUS_MCP_ENABLED_BACKENDS=cmems,cds \
  copernicus-mcp cds search --keyword reanalysis --limit 3

# 5. Run the MCP server (used by Claude Desktop / Claude Code / any
#    MCP-compatible client over stdio). See "Claude Desktop integration"
#    below.
copernicus-mcp serve

Features

The package provides:

  • Tools (CMEMS): marine_search_datasets, marine_describe_dataset, marine_estimate_subset, marine_subset_dataset, marine_check_status, marine_cancel_subset.
  • Tools (CDS / ADS / EWDS): cds_search_datasets, cds_describe_dataset, cds_estimate_request, cds_submit_request, cds_check_request_status, cds_download_request_result, cds_cancel_request.
  • Diagnostic: copernicus_mcp_status reports configured backends, credential sources, cache metrics, and config snapshot (credential values never appear).
  • Resources: copernicus://datasets/cmems/{id}, copernicus://files/{cache_key}, copernicus://jobs/{request_id}, copernicus://provenance/{record_id}.
  • CLI (Typer + Rich) — copernicus-mcp {serve, version, status, marine ..., cds ...} with a global --json flag for scripting.
  • Confirmation flow: large or approximate-estimate subsets gate on a structured confirmation prompt before any download.
  • Cache + provenance: each retrieval produces a sidecar JSON record with file MD5, software versions, request envelope, and a deterministic cache key.
  • Sanitisation: defence-in-depth redaction of credential-shaped strings on every outbound payload.
  • Structured errors: eleven canonical error classes with recovery hints (e.g. recovery_action="configure_credentials").
  • Cancellation discipline: asyncio.CancelledError propagates without being wrapped, per project invariant.
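
As a rough illustration of the cache and provenance bullet above, the sketch below derives a deterministic key from a request envelope and records a file's MD5 in a sidecar-style dictionary. The canonical-JSON-plus-SHA-256 scheme, the function names and the record fields are assumptions made for illustration; they are not taken from the package's source.

```python
import hashlib
import json
from pathlib import Path


def cache_key(request: dict) -> str:
    """Hash a request envelope so that equal requests give equal keys."""
    # Sorting keys and fixing separators makes the JSON text canonical,
    # so dict insertion order does not change the digest.
    canonical = json.dumps(request, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def file_md5(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 of a file in chunks to avoid loading it whole."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def provenance_record(request: dict, path: Path) -> dict:
    """Assemble a hypothetical sidecar record for one retrieval."""
    return {
        "cache_key": cache_key(request),
        "file_md5": file_md5(path),
        "request": request,
    }
```

Because the key depends only on the request content, two agents that send the same envelope would land on the same cache entry, which is the property the reproducibility claim relies on.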

Why this exists

LLM agents can already call APIs, but for scientific data three properties matter and are easy to lose:

  1. Reproducibility — the agent can hand a colleague the exact request and get the exact same file back tomorrow.
  2. Cost-awareness — multi-gigabyte downloads should be confirmed, not silently triggered by a fuzzy prompt.
  3. Credential isolation — credentials must never leak into tool output, logs, or provenance, regardless of the prompt or the upstream library's exception messages.

copernicus-mcp enforces all three at the protocol layer, so the agent does not need to.

Tool reference, in brief

CMEMS (synchronous + async download)

  • marine_search_datasets / marine search-datasets — discover dataset ids by keyword, bbox, time range, or service type. Returns {datasets, total_count}.
  • marine_describe_dataset / marine describe DATASET_ID — full metadata: variables, axes, services, derived spatial_extent (from variable bboxes), DOI.
  • marine_estimate_subset / marine estimate ... — preview byte size and confirmation status without downloading. Returns a coverage_advisory when the user bbox does not align with the dataset's extent.
  • marine_subset_dataset / marine subset ... — download a spatio-temporal subset. Returns {filepath, uri, metadata, provenance} — never inline bytes. Large requests gate on a structured confirmation; pass confirmed=true (MCP) or --yes (CLI) on the second call. async_mode=true returns a request_id immediately and the download runs in the background.
  • marine_check_status / marine wait REQUEST_ID — poll the workflow row for an in-flight or completed async submit.
  • marine_cancel_subset / (no CLI subcommand — use the MCP tool) — cancel an in-flight async submit; best-effort, the underlying toolbox thread may run to completion.
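
The two-call confirmation pattern described for marine_subset_dataset could be driven from client code roughly as follows. This is a sketch under stated assumptions: call_tool stands in for whatever function your MCP client uses to invoke a tool, and the confirmation_required field is a hypothetical name for the structured confirmation response, whose real shape is given by the tool's schema.

```python
from typing import Any, Callable

# Hypothetical signature: (tool_name, arguments) -> tool result as a dict.
ToolCall = Callable[[str, dict[str, Any]], dict[str, Any]]


def subset_with_confirmation(
    call_tool: ToolCall,
    arguments: dict[str, Any],
    approve: Callable[[dict[str, Any]], bool],
) -> dict[str, Any]:
    """Call the subset tool, asking for approval if the server gates it."""
    first = call_tool("marine_subset_dataset", dict(arguments))
    # Assumed field name; small requests are expected to return a
    # file descriptor straight away with no confirmation step.
    if not first.get("confirmation_required"):
        return first
    if not approve(first):
        raise RuntimeError("subset not confirmed by the user")
    # Second call repeats the same request with confirmed=true.
    return call_tool("marine_subset_dataset", {**arguments, "confirmed": True})
```

Keeping the approval decision in a separate callback means a human, not the agent, can be the one who accepts a multi-gigabyte download.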

CDS / ADS / EWDS (async-by-design, queue-backed)

  • cds_search_datasets / cds search — discover dataset ids via a bundled catalogue snapshot covering all three stores. Slim records (~30k tokens for the full catalogue) suitable for LLM context.
  • cds_describe_dataset / cds describe DATASET_ID — full STAC item for one dataset (description, extent, keywords, license, store, variables).
  • cds_estimate_request / cds estimate ... — heuristic byte-size estimator + queue-tier classification (light / medium / heavy). epistemic_status is always approximate.
  • cds_submit_request / cds submit ... — queue a retrieve. Returns {status: "queued", request_id, cache_key} immediately. Confirmation gate fires for large bytes OR queue tier medium/heavy; second call with confirmed=true proceeds. area ordering: CDS uses [north, west, south, east], opposite of common GIS [w, s, e, n] — see the inputs field description; sending the wrong order silently retrieves the wrong region.
  • cds_check_request_status / cds check-status REQUEST_ID (or cds wait REQUEST_ID) — poll the workflow row; wait blocks until terminal up to a configurable timeout.
  • cds_download_request_result / cds download REQUEST_ID — fetch the result file from the canonical cache once status is successful.
  • cds_cancel_request / cds cancel REQUEST_ID — cancel a queued or running request; idempotent on already-terminal rows.
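
Since the area ordering noted under cds_submit_request is easy to get wrong, a small helper can convert a GIS-style bounding box before the request is built. The function name is hypothetical and not part of the package; only the two orderings come from the description above.

```python
def gis_bbox_to_cds_area(
    west: float, south: float, east: float, north: float
) -> list[float]:
    """Reorder a GIS [w, s, e, n] box into the CDS [n, w, s, e] area."""
    if south > north:
        raise ValueError("south must not exceed north")
    return [north, west, south, east]


# Western Mediterranean, given in the common GIS order.
area = gis_bbox_to_cds_area(west=-10.0, south=35.0, east=5.0, north=45.0)
print(area)  # [45.0, -10.0, 35.0, 5.0]
```

Using keyword arguments at the call site makes the intended corner of each number explicit, which a bare four-element list does not.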

A T&C-not-accepted server response is surfaced as the canonical TermsNotAcceptedError with recovery_url pointing at the licence page — open the URL, accept the licence, and re-submit.

For complete schemas, options and exit codes, run copernicus-mcp <subcommand> --help or read the inline tool descriptions surfaced by your MCP client.

Claude Desktop integration

Add to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or the equivalent on your platform:

{
  "mcpServers": {
    "copernicus": {
      "command": "copernicus-mcp",
      "args": ["serve"]
    }
  }
}

Restart Claude Desktop. Every tool whose backend is configured becomes available. Tool results that wrap large data return a filepath plus metadata and provenance — never inline bytes.

Credentials

CMEMS

Resolution precedence:

  1. Toolbox credentials file (recommended): ~/.copernicusmarine/.copernicusmarine-credentials. Created by running copernicusmarine login once. The same file is used by the official CLI and by us — set it once, share it across tools.
  2. Environment variables: COPERNICUSMARINE_SERVICE_USERNAME and COPERNICUSMARINE_SERVICE_PASSWORD. Convenient on CI or in a project-local direnv setup.
  3. (Possible but not recommended for the desktop client) env: {...} block inside claude_desktop_config.json. The file lives in plain text and gets backed up by macOS / cloud sync, so credentials embedded there leave a wider trace than necessary.

CDS / ADS / EWDS

A single Personal Access Token (PAT) — a canonical UUID — works across all three stores per ECMWF policy. Resolution precedence:

  1. Environment variable: CDSAPI_KEY=<your-uuid-pat> in your shell profile.
  2. ~/.cdsapirc (the location the official cdsapi client uses). YAML-ish two-line file:
    url: https://cds.climate.copernicus.eu/api
    key: <your-uuid-pat>
    

Get a PAT: log in at https://cds.climate.copernicus.eu/, open the user-profile page, and copy the "Personal Access Token" UUID. Each new dataset requires accepting its licence once (the page is linked in the recovery URL of TermsNotAcceptedError).

By default only CMEMS is enabled; opt in to CDS with enabled_backends: [cmems, cds] in ~/.config/copernicus-mcp/config.yaml, or COPERNICUS_MCP_ENABLED_BACKENDS=cmems,cds in your env.

Verify resolution: copernicus-mcp status --json | jq '.backends'. The output reports credential_source as config_file, env, or missing for each backend — the actual values are never printed.
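
The same check can be scripted. The sketch below assumes the status JSON has a top-level backends mapping keyed by backend name, each entry carrying a credential_source field; that shape is inferred from the jq expression and field names above and may differ in detail. The helper names are hypothetical.

```python
import json
import subprocess


def backends_missing_credentials(status: dict) -> list[str]:
    """Return the names of backends whose credential_source is 'missing'."""
    backends = status.get("backends", {})
    return sorted(
        name
        for name, info in backends.items()
        if info.get("credential_source") == "missing"
    )


def read_status() -> dict:
    """Run the status subcommand and parse its JSON output."""
    result = subprocess.run(
        ["copernicus-mcp", "status", "--json"],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)
```

In a CI job, calling backends_missing_credentials(read_status()) and failing when the list is non-empty would catch an unset secret before any tool call is attempted.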

Configuration

The system is usable with no configuration file at all — every Pydantic field has a sensible default. Override via environment variables (COPERNICUS_MCP_LOG_LEVEL, COPERNICUS_MCP_CACHE_DIR, COPERNICUS_MCP_STATE_DB, plus COPERNICUS_MCP_<SECTION>__<FIELD> for nested fields), or with a YAML file at ~/.config/copernicus-mcp/config.yaml or ~/.copernicus-mcp.yaml.
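
To make the COPERNICUS_MCP_<SECTION>__<FIELD> naming concrete, the sketch below shows how such variables could map onto a nested configuration dictionary. It illustrates the double-underscore convention only and is not the package's loader; the CACHE and MAX_BYTES names in the example are hypothetical.

```python
def nested_overrides(environ: dict, prefix: str = "COPERNICUS_MCP_") -> dict:
    """Turn prefixed environment variables into a nested dict of overrides."""
    overrides: dict = {}
    for name, value in environ.items():
        if not name.startswith(prefix):
            continue
        # A double underscore separates a section from its field.
        path = name[len(prefix):].lower().split("__")
        node = overrides
        for part in path[:-1]:
            node = node.setdefault(part, {})
        node[path[-1]] = value
    return overrides


example = {
    "COPERNICUS_MCP_LOG_LEVEL": "DEBUG",
    "COPERNICUS_MCP_CACHE__MAX_BYTES": "1000000",  # hypothetical field
    "HOME": "/home/user",
}
print(nested_overrides(example))
# {'log_level': 'DEBUG', 'cache': {'max_bytes': '1000000'}}
```

Values stay strings here; in a Pydantic-based configuration the type conversion would happen when the model is validated.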

State directories: ~/.cache/copernicus-mcp/ (downloaded files + .provenance.json sidecars), ~/.local/state/copernicus-mcp/state.db (SQLite cache index, workflow rows, persisted provenance).

Troubleshooting

  • AuthError on tool call → run copernicus-mcp status and check backends.cmems.configured. If false, your env vars are not visible to the running process (common Claude Desktop pitfall — restart the client after editing config) or the credentials file is missing/unreadable.
  • CoverageUnavailableError → bbox or time range is outside the dataset's actual extent. Use marine_describe_dataset to inspect coverage and narrow the request.
  • ValidationError with recovery_action="modify_request_parameters" → request was structurally invalid (e.g. inverted bbox, antimeridian-crossing bbox, naive datetime). The next_action_hint field tells you exactly how to fix it.
  • Subset hangs → set COPERNICUS_MCP_LOG_LEVEL=DEBUG and watch for retry messages. Reduce the bbox or time range if the request is genuinely large.
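
The structural problems named under ValidationError can be caught before a request is sent. The following is a minimal pre-flight sketch; the function name and messages are hypothetical, and the server's own validation remains authoritative.

```python
from datetime import datetime, timezone


def check_request(
    west: float,
    south: float,
    east: float,
    north: float,
    start: datetime,
    end: datetime,
) -> list[str]:
    """Return a list of human-readable problems; empty means none found."""
    problems = []
    if south >= north:
        problems.append("inverted bbox: south must be less than north")
    if west >= east:
        problems.append(
            "west must be less than east "
            "(inverted or antimeridian-crossing bbox)"
        )
    for label, moment in (("start", start), ("end", end)):
        # A datetime without a usable UTC offset is naive.
        if moment.tzinfo is None or moment.utcoffset() is None:
            problems.append(f"{label} is a naive datetime; attach a timezone")
    return problems
```

For example, passing datetime(2024, 1, 1) without tzinfo as start yields one problem, while datetime(2024, 1, 1, tzinfo=timezone.utc) passes the check.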

License

BSD 3-Clause. See LICENSE. Dependencies are licensed under EUPL-1.2 (copernicusmarine), Apache-2.0 (cdsapi and most of the others), MIT, or BSD. The package does not depend on sentinelhub-py; when the Sentinel Hub backend lands in a later iteration, this section will document the relevant CC BY-NC restriction on its SDK.
