Downloader and data management tools for climate and ocean datasets.

H2MARE - Geospatial Processing for Climate and Ocean Data

A Python pipeline for downloading and preprocessing multi-source oceanographic and atmospheric data into analysis-ready formats. H2MARE streamlines the acquisition and harmonization of data from major climate and ocean observation services, optimized for large-scale spatiotemporal analysis.

Features

  • Multi-source data integration: Download and process data from CMEMS, AVISO, and ERA5.
  • Variable grouping: Organize related variables using configurable keys.
  • Format conversion: Automated conversion from NetCDF/GRIB to optimized Zarr and Parquet formats.
  • Data compilation: Regrid and interpolate multi-resolution datasets to a common grid.
  • Point and geometry extraction: Extract time series for specific locations or spatial features.

Data Sources

H2MARE supports the following data providers. Each one requires its own API keys or account credentials:

  • CMEMS - Copernicus Marine Service: Satellite and in-situ ocean observations
  • AVISO - Archiving, Validation and Interpretation of Satellite Oceanographic data
  • CDS-ERA5 - ERA5 hourly atmospheric reanalysis (1940-present)
    Hersbach, H., et al. (2023). DOI: 10.24381/cds.adbb2d47

Note: Refer to each provider's documentation for authentication setup before use.

Installation

Prerequisites

  • Python >= 3.11
  • uv — fast Python package and project manager
  • Sufficient disk space for downloaded datasets (varies by region and time range)
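
The prerequisites above can be checked with a short script before installing. This is only a sketch using the Python standard library, not part of H2MARE: the function name `check_prerequisites` and the `min_free_gb` threshold are hypothetical, and the amount of disk space you need depends on your region and time range.

```python
import shutil
import sys


def check_prerequisites(store_dir: str = ".", min_free_gb: float = 50.0) -> dict:
    """Report whether the documented prerequisites appear to be met.

    `min_free_gb` is an arbitrary placeholder, not an H2MARE requirement.
    """
    free_gb = shutil.disk_usage(store_dir).free / 1e9
    return {
        "python_ok": sys.version_info >= (3, 11),   # Python >= 3.11
        "uv_found": shutil.which("uv") is not None,  # uv available on PATH
        "free_gb": round(free_gb, 1),
        "disk_ok": free_gb >= min_free_gb,
    }


if __name__ == "__main__":
    for name, value in check_prerequisites().items():
        print(f"{name}: {value}")
```

Pass the directory you plan to use for storage as `store_dir` so the free-space figure refers to the right volume.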

Install from PyPI

pip install h2mare
# or
uv add h2mare

Install from source

git clone https://github.com/h2ugoparra/h2mare.git
cd h2mare
uv sync

Configuration

H2MARE requires two configuration files in your working directory before first use.

1. config.yaml

Defines variables, dataset IDs, bounding boxes, and processing parameters. Copy the template from the repository as a starting point and edit it to match your needs.

2. .env

# Path to external or large-capacity storage for processed Zarr files
STORE_DIR=/path/to/your/storage

# CMEMS credentials (required for SST, SSH, MLD, CHL, O2, SEAPODYM)
CMEMS_USERNAME=your_username
CMEMS_PASSWORD=your_password

# AVISO credentials (required for FSLE, Eddies)
AVISO_USERNAME=your_username
AVISO_PASSWORD=your_password
AVISO_FTP_SERVER=ftp-access.aviso.altimetry.fr

ERA5 / CDS credentials are configured separately via the cdsapi client — see the CDS documentation for setup.

Note: Both files must be present in the directory where you run h2mare. You can also set the H2MARE_ROOT environment variable to point to a different directory containing them.
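
Before a first run it can help to confirm that the .env file defines every key listed above. The following is a minimal sketch and not H2MARE's own loader: the helper names `parse_env_text` and `missing_keys` are hypothetical, the parser only handles plain KEY=VALUE lines, and the grouping of keys by provider simply mirrors the example file.

```python
import os
from pathlib import Path

# Keys copied from the .env example, grouped by what they unlock.
REQUIRED_KEYS = {
    "storage": ["STORE_DIR"],
    "cmems": ["CMEMS_USERNAME", "CMEMS_PASSWORD"],
    "aviso": ["AVISO_USERNAME", "AVISO_PASSWORD", "AVISO_FTP_SERVER"],
}


def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(values: dict[str, str]) -> dict[str, list[str]]:
    """Return, per group, the keys that are absent or empty."""
    report = {}
    for group, keys in REQUIRED_KEYS.items():
        absent = [key for key in keys if not values.get(key)]
        if absent:
            report[group] = absent
    return report


if __name__ == "__main__":
    # H2MARE_ROOT may point at the directory holding the config files.
    root = Path(os.environ.get("H2MARE_ROOT", "."))
    env_path = root / ".env"
    text = env_path.read_text() if env_path.exists() else ""
    for group, keys in missing_keys(parse_env_text(text)).items():
        print(f"{group}: missing {', '.join(keys)}")
```

A missing AVISO group only matters if you plan to download AVISO products, so treat the report as a checklist rather than a hard failure.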

Key variable groups

Edit config.yaml to define variable groups and processing parameters.

Data Flow

  • Download - Raw NetCDF/GRIB files are fetched from the configured sources and saved as native-resolution Zarr files at the specified time resolution (monthly or yearly).
  • Compilation (h2mare/processing/compiler.py) - Preprocessed data is regridded to a defined spatial/temporal resolution and geographic extent (configured via the 'h2ds' key in config.yaml).
  • Extraction (h2mare/processing/extractor.py) - Point (CSV files) or geometry (SHP files) data extraction from xarray datasets.
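
To make the compilation and extraction steps concrete, here is a pure-Python sketch of bilinear interpolation on a regular latitude/longitude grid. It is an illustration only: the README does not say which interpolation method compiler.py or extractor.py use, and the function names `bilinear` and `regrid` are hypothetical. Regridding amounts to evaluating the interpolant at every node of the target grid, and point extraction to evaluating it at a single location.

```python
from bisect import bisect_right


def _bracket(axis, x):
    """Return the lower cell index and fractional weight of x on an ascending axis."""
    if not axis[0] <= x <= axis[-1]:
        raise ValueError(f"{x} lies outside the grid [{axis[0]}, {axis[-1]}]")
    i = min(bisect_right(axis, x) - 1, len(axis) - 2)
    return i, (x - axis[i]) / (axis[i + 1] - axis[i])


def bilinear(field, lats, lons, lat, lon):
    """Bilinearly interpolate field[lat_index][lon_index] at (lat, lon)."""
    i, wy = _bracket(lats, lat)
    j, wx = _bracket(lons, lon)
    south = field[i][j] * (1 - wx) + field[i][j + 1] * wx
    north = field[i + 1][j] * (1 - wx) + field[i + 1][j + 1] * wx
    return south * (1 - wy) + north * wy


def regrid(field, lats, lons, new_lats, new_lons):
    """Evaluate the interpolant on every node of a target grid."""
    return [
        [bilinear(field, lats, lons, la, lo) for lo in new_lons]
        for la in new_lats
    ]


if __name__ == "__main__":
    # Toy 1-degree field regridded to a 0.5-degree common grid.
    lats, lons = [0.0, 1.0], [0.0, 1.0]
    field = [[10.0, 20.0], [30.0, 40.0]]
    for row in regrid(field, lats, lons, [0.0, 0.5, 1.0], [0.0, 0.5, 1.0]):
        print(row)
    # Point extraction at a single station location.
    print(bilinear(field, lats, lons, 0.25, 0.75))
```

Real ocean grids also need handling for land masks, missing values, and longitude wrap-around, none of which this sketch attempts.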

Quick Start

# Download and process a single variable for a specific date range
uv run h2mare run sst --start-date 2021-01-01 --end-date 2021-12-31

# Multiple variables at once (space-separated)
uv run h2mare run seapodym mld o2 chl

# Infer missing dates from the existing store and download what's new
uv run h2mare run sst

# Download only (skip Zarr conversion)
uv run h2mare run sst --no-process

# Validate configuration without downloading
uv run h2mare run sst --dry-run

# Process all configured variables
uv run h2mare run
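
When these commands are driven from a script, for example to loop over years, it can be convenient to assemble the argument list in Python. The sketch below uses only the subcommand and flags shown above. The helper `build_run_command` is hypothetical, and it assumes the flags can be combined freely, which the examples do not confirm.

```python
import shlex
from datetime import date


def build_run_command(variables=(), start=None, end=None,
                      process=True, dry_run=False):
    """Assemble an `h2mare run` invocation as an argument list."""
    cmd = ["uv", "run", "h2mare", "run", *variables]
    if start is not None:
        cmd += ["--start-date", start.isoformat()]
    if end is not None:
        cmd += ["--end-date", end.isoformat()]
    if not process:
        cmd.append("--no-process")  # download only, skip Zarr conversion
    if dry_run:
        cmd.append("--dry-run")     # validate configuration only
    return cmd


if __name__ == "__main__":
    cmd = build_run_command(["sst"], date(2021, 1, 1), date(2021, 12, 31))
    # Print the command; to execute it, pass the list to
    # subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```

Building a list rather than a single string avoids shell quoting issues when the command is later handed to `subprocess.run`.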

Development

# Run the full test suite
uv run pytest tests/

# Run a single test file
uv run pytest tests/test_zarr_catalog.py -v

# Format code
uv run black h2mare/
uv run isort h2mare/

Built with

| Library | Role |
| --- | --- |
| xarray | N-dimensional labelled arrays and NetCDF/Zarr I/O |
| zarr | Chunked, compressed array storage |
| dask | Parallel and out-of-core computation |
| polars | Fast DataFrame engine for extracted time series |
| geopandas | Geometry-based spatial extraction |
| copernicusmarine | CMEMS dataset access |
| cdsapi | ERA5 / CDS dataset access |

Contributing

Contributions are welcome! Please feel free to submit issues or pull requests on GitHub.

License

This project is licensed under the MIT License - see the LICENSE file for details.

AI Assistance

Parts of this codebase were developed with the help of Claude (Anthropic).

Acknowledgments

This project was developed within the framework of the COSTA project. It relies on data from Copernicus Marine Service, AVISO, Copernicus Climate Data Store, and NOAA NCEI. We gratefully acknowledge these organizations for providing open access to their datasets.
