Downloader and data management tools for climate and ocean datasets.

H2MARE - Geospatial Processing for Climate and Ocean Data

A Python pipeline for downloading and preprocessing multi-source oceanographic and atmospheric data into analysis-ready formats. H2MARE streamlines the acquisition and harmonization of data from major climate and ocean observation services, optimized for large-scale spatiotemporal analysis.

Features

  • Multi-source data integration: Download and process data from CMEMS, AVISO, and ERA5.
  • Variable grouping: Organize related variables using configurable keys.
  • Format conversion: Automated conversion from NetCDF/GRIB to optimized Zarr and Parquet formats.
  • Data compilation: Regrid and interpolate multi-resolution datasets to a common grid.
  • Point and geometry extraction: Extract time series for specific locations or spatial features.

Data Sources

H2MARE supports the following data providers. API keys and authentication are required for each:

  • CMEMS - Copernicus Marine Service: Satellite and in-situ ocean observations
  • AVISO - Archiving, Validation and Interpretation of Satellite Oceanographic data
  • CDS-ERA5 - ERA5 hourly atmospheric reanalysis (1940-present)
    Hersbach, H., et al. (2023). DOI: 10.24381/cds.adbb2d47

Note: Refer to each provider's documentation for authentication setup before use.

Installation

Prerequisites

  • Python >= 3.9
  • uv — fast Python package and project manager
  • Sufficient disk space for downloaded datasets (varies by region and time range)

Install from source

git clone https://github.com/h2ugoparra/h2mare.git
cd h2mare
uv sync          # installs all dependencies into .venv

For development (includes pytest, black, isort):

uv sync --extra dev

Configuration

Create a .env file with the external storage path:

STORE_DIR=/path/to/your/storage
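
How H2MARE loads this file internally is not documented here. As a minimal, dependency-free sketch of what the setting amounts to, the hypothetical helper `read_store_dir` below parses `KEY=VALUE` lines and returns the `STORE_DIR` value as a path. The helper name and its error handling are assumptions for illustration, not part of the package.

```python
from pathlib import Path


def read_store_dir(env_text: str) -> Path:
    """Return STORE_DIR from the text of a .env file (hypothetical helper)."""
    for raw in env_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "STORE_DIR":
            # Strip optional surrounding quotes around the value.
            return Path(value.strip().strip("'\""))
    raise KeyError("STORE_DIR is not set in the .env file")


store = read_store_dir("# external storage\nSTORE_DIR=/path/to/your/storage\n")
print(store)
```

A quick check like this before a long download can confirm that the storage path is actually set.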

Key variable groups

Edit config.yaml to define variable groups and processing parameters.

Data Flow

  • Download - Raw NetCDF/GRIB files are fetched from the configured sources and saved as native-resolution Zarr files at the specified time resolution (monthly or yearly).
  • Compilation (h2mare/processing/compiler.py) - Preprocessed data is regridded to a defined spatial/temporal resolution and geographic extent (configured via the 'h2ds' key in config.yaml).
  • Extraction (h2mare/processing/extractor.py) - Point (CSV files) or geometry (SHP files) data extraction from xarray datasets.
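
To make the compilation step above more concrete, here is a one-dimensional, pure-Python sketch of putting two variables with different resolutions onto a common grid by linear interpolation. It is an illustration of the idea only: the function `interp_to_grid` and the sample values are hypothetical, and the real compiler presumably operates on multi-dimensional xarray datasets rather than plain lists.

```python
from bisect import bisect_right


def interp_to_grid(src_x, src_y, target_x):
    """Linearly interpolate (src_x, src_y) samples onto target_x.

    src_x must be strictly increasing and hold at least two points.
    Targets outside the source range get None instead of an
    extrapolated value.
    """
    out = []
    for x in target_x:
        if x < src_x[0] or x > src_x[-1]:
            out.append(None)
            continue
        # Index of the right-hand neighbour, clamped so x == src_x[-1] works.
        hi = min(bisect_right(src_x, x), len(src_x) - 1)
        lo = hi - 1
        frac = (x - src_x[lo]) / (src_x[hi] - src_x[lo])
        out.append(src_y[lo] + frac * (src_y[hi] - src_y[lo]))
    return out


# Two variables sampled at different latitude resolutions (made-up values).
sst_lat, sst = [30.0, 31.0, 32.0], [18.0, 17.0, 15.0]
chl_lat, chl = [30.0, 30.5, 31.0, 31.5, 32.0], [0.2, 0.3, 0.4, 0.3, 0.2]

# Common target grid at the finer of the two resolutions.
common_lat = [30.0, 30.5, 31.0, 31.5, 32.0]
sst_common = interp_to_grid(sst_lat, sst, common_lat)
chl_common = interp_to_grid(chl_lat, chl, common_lat)
print(sst_common)  # [18.0, 17.5, 17.0, 16.0, 15.0]
```

After this step both variables share the same coordinates, which is what makes them directly comparable cell by cell.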

Quick Start

# Download and process a single variable for a specific date range
uv run h2mare run sst --start-date 2021-01-01 --end-date 2021-12-31

# Multiple variables at once (space-separated)
uv run h2mare run seapodym mld o2 chl

# Infer missing dates from the existing store and download what's new
uv run h2mare run sst

# Download only (skip Zarr conversion)
uv run h2mare run sst --no-process

# Validate configuration without downloading
uv run h2mare run sst --dry-run

# Process all configured variables
uv run h2mare run
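
The "infer missing dates" behaviour above can be pictured as follows. This is a sketch under assumptions, not the package's implementation: `missing_range` and `monthly_chunks` are hypothetical helpers that take the last date already in the store, compute the gap up to today, and split it into calendar-month pieces to match a monthly time resolution.

```python
from datetime import date, timedelta


def missing_range(last_stored: date, today: date):
    """Return (start, end) of dates not yet stored, or None if up to date."""
    start = last_stored + timedelta(days=1)
    return (start, today) if start <= today else None


def monthly_chunks(start: date, end: date):
    """Split an inclusive date range into calendar-month pieces."""
    chunks = []
    cur = start
    while cur <= end:
        # First day of the following month.
        if cur.month == 12:
            next_month = date(cur.year + 1, 1, 1)
        else:
            next_month = date(cur.year, cur.month + 1, 1)
        chunk_end = min(next_month - timedelta(days=1), end)
        chunks.append((cur, chunk_end))
        cur = next_month
    return chunks


gap = missing_range(date(2021, 11, 15), today=date(2022, 1, 10))
for first, last in monthly_chunks(*gap):
    print(first, last)
# 2021-11-16 2021-11-30
# 2021-12-01 2021-12-31
# 2022-01-01 2022-01-10
```

Splitting the gap by month keeps each request small and lets an interrupted run resume from the last complete chunk.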

Development

# Run the full test suite
uv run pytest tests/

# Run a single test file
uv run pytest tests/test_zarr_catalog.py -v

# Format code
uv run black h2mare/
uv run isort h2mare/

Built with

Library | Role
xarray | N-dimensional labelled arrays and NetCDF/Zarr I/O
zarr | Chunked, compressed array storage
dask | Parallel and out-of-core computation
polars | Fast DataFrame engine for extracted time series
geopandas | Geometry-based spatial extraction
copernicusmarine | CMEMS dataset access
cdsapi | ERA5 / CDS dataset access

Contributing

Contributions are welcome! Please feel free to submit issues or pull requests on GitHub.

License

This project is licensed under the MIT License - see the LICENSE file for details.

AI Assistance

Parts of this codebase were developed with the help of Claude (Anthropic).

Acknowledgments

This project was developed under the framework of the COSTA project. It relies on data from Copernicus Marine Service, AVISO, Copernicus Climate Data Store, and NOAA NCEI. We gratefully acknowledge these organizations for providing open access to their datasets.


