xhelio-pds

NASA PDS Planetary Plasma Interactions data access — browse missions, inspect parameters, fetch PDS data.

Works as a standalone Python library or as an MCP server for any MCP-compatible LLM client (Claude Desktop, Cursor, custom agents).

What's included

  • 17 mission catalogs with 1200+ datasets — Juno, Cassini, Voyager 1/2, MAVEN, Galileo, New Horizons, and more
  • PDS3 + PDS4 support — fixed-width ASCII tables with ODL (regex) and XML label parsing
  • Automatic schema validation — labels are compared across files within each dataset to detect schema drift (field changes, unit changes, missing columns)
  • Structured system prompts per mission — give an LLM full context about available instruments, datasets, and time coverage
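As a rough illustration of the PDS3 path mentioned above (regex parsing of an ODL label, then slicing a fixed-width ASCII row), here is a minimal sketch. It is not xhelio-pds's parser: the trimmed label, the sample row, and the helpers `parse_columns` and `slice_row` are made up for this example. Only the ODL keywords (`OBJECT = COLUMN`, `NAME`, `START_BYTE`, `BYTES`, `UNIT`) come from the PDS3 standard.

```python
import re

# Hypothetical, heavily trimmed PDS3 label; real labels carry many more keywords.
LABEL = """\
OBJECT      = TABLE
  OBJECT      = COLUMN
    NAME        = "SAMPLE UTC"
    START_BYTE  = 1
    BYTES       = 19
  END_OBJECT  = COLUMN
  OBJECT      = COLUMN
    NAME        = "BX PLANETOCENTRIC"
    START_BYTE  = 21
    BYTES       = 8
    UNIT        = "NT"
  END_OBJECT  = COLUMN
END_OBJECT  = TABLE
"""

# Captures the body of each COLUMN object (non-nested, as in this example).
COLUMN_RE = re.compile(
    r"^\s*OBJECT\s*=\s*COLUMN\b(.*?)^\s*END_OBJECT\s*=\s*COLUMN\b",
    re.DOTALL | re.MULTILINE,
)
# Captures KEYWORD = VALUE pairs, one per line.
KEYVAL_RE = re.compile(r"^\s*(\w+)\s*=\s*(.+?)\s*$", re.MULTILINE)


def parse_columns(label: str) -> list[dict]:
    """Return one dict per COLUMN object with a 0-based start offset."""
    columns = []
    for body in COLUMN_RE.findall(label):
        fields = {key: value.strip('"') for key, value in KEYVAL_RE.findall(body)}
        columns.append({
            "name": fields["NAME"],
            "start": int(fields["START_BYTE"]) - 1,  # START_BYTE is 1-based
            "width": int(fields["BYTES"]),
            "unit": fields.get("UNIT"),
        })
    return columns


def slice_row(row: str, columns: list[dict]) -> dict:
    """Cut one fixed-width row into raw string values keyed by column name."""
    return {
        col["name"]: row[col["start"]:col["start"] + col["width"]].strip()
        for col in columns
    }


# 19-byte timestamp, 1-byte separator, 8-byte right-aligned value.
ROW = "2024-01-01T00:00:00" + " " + "   -12.5"
```

Calling `slice_row(ROW, parse_columns(LABEL))` yields the two values as strings; converting them to numbers or timestamps would be a separate step driven by each column's declared data type.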

Installation

# Library only
pip install xhelio-pds

# With MCP server
pip install "xhelio-pds[mcp]"

MCP Server

Configuration (Claude Desktop, Cursor, etc.)

{
  "mcpServers": {
    "pds": {
      "command": "xhelio-pds-mcp"
    }
  }
}

With custom cache directory:

{
  "mcpServers": {
    "pds": {
      "command": "xhelio-pds-mcp",
      "args": ["--cache-dir", "/path/to/cache"]
    }
  }
}
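If you would rather script this than edit the JSON by hand, the sketch below merges the same "pds" entry into an existing client config file while leaving other servers untouched. The helper name `add_pds_server` is hypothetical, and the location of the config file depends on your MCP client, so it is passed in rather than assumed.

```python
import json
from pathlib import Path
from typing import Optional


def add_pds_server(config_path: Path, cache_dir: Optional[str] = None) -> dict:
    """Merge a "pds" server entry into an MCP client config and return the result."""
    # Start from the existing config if there is one, otherwise from an empty one.
    config = json.loads(config_path.read_text()) if config_path.exists() else {}

    entry = {"command": "xhelio-pds-mcp"}
    if cache_dir is not None:
        entry["args"] = ["--cache-dir", cache_dir]

    # Only the "pds" key is replaced; other configured servers are kept.
    config.setdefault("mcpServers", {})["pds"] = entry
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Most clients read this file at startup, so a restart is usually needed before the new server appears.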

Or run directly:

xhelio-pds-mcp
xhelio-pds-mcp --cache-dir /path/to/cache
python -m pdsmcp

Cache directory

All runtime data is stored under a single root directory. Defaults to ~/.pdsmcp/.

Configure via --cache-dir (MCP server) or pdsmcp.configure() (library):

import pdsmcp
pdsmcp.configure(cache_dir="/path/to/cache")

~/.pdsmcp/                     # or custom path via configure()
├── metadata/                  # PDS label-derived parameter metadata
├── data_cache/                # Downloaded PDS data + label files (permanent, reused across fetches)
│   └── jno/fgm/               #   organized by mission/instrument path
│       ├── FGM_JNO_L3_2024001SE_V01.STS
│       └── FGM_JNO_L3_2024001SE_V01.LBL
└── validation/                # Schema consistency records (append-only)
    └── pds3_JNO-J-3-FGM-CAL-V1.0_DATA.json

  • metadata/: Parameter metadata parsed from PDS labels. Built lazily on first access per dataset.
  • data_cache/: Permanent cache of downloaded PDS data and label files. A file that is already present is never downloaded again. Use manage_cache(action="clean", category="data_cache") to free disk space.
  • validation/: Schema drift records from comparing labels across files within a dataset. Append-only, one JSON file per dataset.
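To inspect this layout from your own scripts, something like the sketch below could work. Both helpers are hypothetical and not part of pdsmcp. The colon-to-underscore filename mapping is an assumption inferred from the single example in the tree above, so treat it as a guess rather than a documented rule.

```python
from pathlib import Path

# The three top-level folders described above.
CATEGORIES = ("metadata", "data_cache", "validation")


def validation_record_path(dataset_id: str, cache_root: Path) -> Path:
    """Guess where the validation record for a dataset lives.

    Assumption: ':' in the dataset ID becomes '_' in the filename, as in
    pds3:JNO-J-3-FGM-CAL-V1.0:DATA -> pds3_JNO-J-3-FGM-CAL-V1.0_DATA.json
    """
    return cache_root / "validation" / (dataset_id.replace(":", "_") + ".json")


def category_sizes(cache_root: Path) -> dict[str, int]:
    """Total size in bytes of the files under each cache category."""
    sizes = {}
    for name in CATEGORIES:
        folder = cache_root / name
        if folder.is_dir():
            sizes[name] = sum(p.stat().st_size for p in folder.rglob("*") if p.is_file())
        else:
            sizes[name] = 0  # category not created yet
    return sizes
```

For example, `category_sizes(Path.home() / ".pdsmcp")` gives a quick view of how much space the default cache takes before deciding whether to clean data_cache/.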

Tools

Tool | Description
browse_missions() | List all 17 PDS PPI missions with descriptions, dataset counts, and instruments
load_mission(mission_id) | Get the complete system prompt for a mission (role instructions + full dataset catalog)
browse_parameters(dataset_id) | Browse all variables in a dataset: name, type, units, description, plus schema validation summary
fetch_data(dataset_id, parameters, start, stop, output_dir) | Download PDS data, write to file, return metadata + per-column stats (min, max, mean, std, nan_ratio)
manage_cache(action, ...) | Cache management: status, clean, refresh metadata, refresh time ranges, rebuild catalog

Typical workflow

browse_missions  →  load_mission("juno")  →  browse_parameters("pds3:JNO-J-3-FGM-CAL-V1.0:DATA")  →  fetch_data(...)

  1. Discover available missions
  2. Load a mission's full catalog and instructions
  3. Inspect dataset parameters to choose what to fetch
  4. Fetch data for a time range — returns file path + statistics

Python Library

from pdsmcp.catalog import browse_missions
from pdsmcp.prompts import build_mission_prompt
from pdsmcp.metadata import browse_parameters
from pdsmcp.fetch import fetch_data

# List all 17 PDS PPI missions
missions = browse_missions()

# Get mission-specific system prompt
prompt = build_mission_prompt("juno")

# Browse dataset parameters (fetches label on first access, cached after)
params = browse_parameters(dataset_id="pds3:JNO-J-3-FGM-CAL-V1.0:DATA")

# Fetch data: returns a dict keyed by parameter name; each entry holds a DataFrame plus units and stats
result = fetch_data(
    "pds3:JNO-J-3-FGM-CAL-V1.0:DATA",
    ["BX PLANETOCENTRIC", "BY PLANETOCENTRIC"],
    "2024-01-01", "2024-01-02",
)
bx = result["BX PLANETOCENTRIC"]
print(bx["data"])       # pandas DataFrame
print(bx["units"])      # "NT"
print(bx["stats"])      # per-column {min, max, mean, std, nan_ratio}
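The five statistics reported per column can be reproduced by hand when you want to cross-check a result. The sketch below is one plausible definition, not the library's implementation: it skips NaN values for min, max, mean, and std, and it assumes the sample standard deviation (n - 1 in the denominator), which may differ from what fetch_data actually uses. The helper name `column_stats` is hypothetical.

```python
import math
import statistics


def column_stats(values: list[float]) -> dict[str, float]:
    """Compute min, max, mean, std, and nan_ratio for one column of floats."""
    valid = [v for v in values if not math.isnan(v)]
    nan_count = len(values) - len(valid)

    stats = {
        # Fraction of entries that are NaN; undefined for an empty column.
        "nan_ratio": nan_count / len(values) if values else math.nan,
        "min": math.nan,
        "max": math.nan,
        "mean": math.nan,
        "std": math.nan,
    }
    if valid:
        stats["min"] = min(valid)
        stats["max"] = max(valid)
        stats["mean"] = statistics.fmean(valid)
        # Assumption: sample standard deviation; needs at least two points.
        stats["std"] = statistics.stdev(valid) if len(valid) > 1 else 0.0
    return stats
```

A high nan_ratio is usually the first thing to check, since the other four numbers describe only the values that remain.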

Schema validation

When fetch_data downloads PDS files, it automatically compares each file's label against the reference schema (captured from the first file seen). Discrepancies are recorded in ~/.pdsmcp/validation/ and surfaced through browse_parameters:

  • Missing fields — present in the reference label but absent from a later file
  • New fields — present in a later file but not in the reference label
  • Metadata drift — same field name but different units, type, or size across files

This validation runs on every file during fetch (deduplicated by URL) and builds an append-only archive with full provenance.
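The three drift categories above can be expressed as a small comparison routine. The sketch below is illustrative only: it assumes each label has already been reduced to a dict of field name to {units, type, size}, and the function names, record layout, and sample URLs used with it are invented for this example rather than taken from pdsmcp.

```python
def compare_schema(reference: dict, candidate: dict) -> dict:
    """Diff one file's schema against the reference schema."""
    missing = sorted(reference.keys() - candidate.keys())
    new = sorted(candidate.keys() - reference.keys())

    drift = {}
    for name in sorted(reference.keys() & candidate.keys()):
        # Record (reference value, candidate value) for each attribute that differs.
        changed = {
            attr: (reference[name].get(attr), candidate[name].get(attr))
            for attr in ("units", "type", "size")
            if reference[name].get(attr) != candidate[name].get(attr)
        }
        if changed:
            drift[name] = changed

    return {"missing_fields": missing, "new_fields": new, "metadata_drift": drift}


def validate_files(labels) -> list[dict]:
    """Check (url, schema) pairs in fetch order and collect discrepancy records.

    The first schema seen becomes the reference. Each URL is checked once.
    Records are only ever appended, never rewritten.
    """
    reference = None
    seen_urls = set()
    records = []
    for url, schema in labels:
        if url in seen_urls:
            continue  # deduplicate by URL
        seen_urls.add(url)
        if reference is None:
            reference = schema
            continue
        diff = compare_schema(reference, schema)
        if any(diff.values()):
            records.append({"url": url, **diff})
    return records
```

Keeping the URL in every record is what lets a discrepancy be traced back to the exact file that introduced it.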

Batch validation without fetching full data:

python -m pdsmcp.scripts.validate_schema --mission juno
python -m pdsmcp.scripts.validate_schema --dataset-id "pds3:JNO-J-3-FGM-CAL-V1.0:DATA" --sample 20

Bundled data

Data | Count | Description
Mission catalogs | 17 | Instruments, datasets, time coverage
Prompt templates | 2 | Generic role + PDS-specific workflow instructions

All bundled data ships with the package. No network access needed for browsing — only fetch_data and browse_parameters (first access) require a connection to PDS.

Catalog updates

Rebuild the bundled mission catalogs from the PDS PPI Metadex API:

# Rebuild mission catalogs
python -m pdsmcp.scripts.build_catalog
python -m pdsmcp.scripts.build_catalog --mission juno
python -m pdsmcp.scripts.build_catalog --list

Development

pip install -e ".[dev]"
pytest tests/ -v

License

MIT

Source Distribution

xhelio_pds-0.2.1.tar.gz (106.2 kB)

Uploaded Source

Built Distribution

xhelio_pds-0.2.1-py3-none-any.whl (99.1 kB)

Uploaded Python 3

File details

Details for the file xhelio_pds-0.2.1.tar.gz.

File metadata

  • Download URL: xhelio_pds-0.2.1.tar.gz
  • Size: 106.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for xhelio_pds-0.2.1.tar.gz
Algorithm | Hash digest
SHA256 | a997edf25e4842dc760c5b171fe2e47ad7823b73c6c7cc059a20314debe20bd3
MD5 | 819c6b6fc2121ee77f6be5c13140c20e
BLAKE2b-256 | 70af62376d511a97271009ec35821f64da915e318847a16ddeb3cdc1bd3e92a4

File details

Details for the file xhelio_pds-0.2.1-py3-none-any.whl.

File metadata

  • Download URL: xhelio_pds-0.2.1-py3-none-any.whl
  • Size: 99.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for xhelio_pds-0.2.1-py3-none-any.whl
Algorithm | Hash digest
SHA256 | 0ca74ea7038f04b2f34634ca7a3948fc116ab625f685a1fc0246406f6ef532ec
MD5 | a2c79d4c6ca7dfb7482643eafaa7b331
BLAKE2b-256 | 998dd2f5e9db1a2658d6b6c8c6746c4d260f56c2f6a54c0a6b5ff18a16cf54ef
