DepthDif public inference helpers for sparse ocean temperature diffusion.

DepthDif

DepthDif is a conditional diffusion project for densifying sparse ocean temperature observations. See the Documentation for details on the models, datasets, and auxiliary data, or follow along with the Experiments.

Installation

This project uses Python 3.12.3.

python -m pip install -r requirements.txt

For public inference usage, the package can also be installed in editable mode:

python -m pip install -e .

PyPI releases are published by GitHub Actions when a version tag such as v0.1.0 is pushed on main. The tag must match project.version in pyproject.toml, and the PyPI project must be configured for trusted publishing from the repository's pypi environment.

Model Overview

  • Model: PixelDiffusionConditional (conditional pixel-space diffusion with ConvNeXt U-Net denoiser).
  • Active dataset: data/dataset_argo_netcdf_gridded.py (ArgoNetCDFGriddedPatchDataset) lazily builds model-ready patches from ARGO/EN4, GLORYS, OSTIA, and sea-level NetCDF files without writing patch exports.
  • Optional dataset ablation: dataset.synthetic.enabled=true builds sparse x from random GLORYS y pixels, controlled by dataset.synthetic.pixel_count.
  • Config layout:
    • configs/px_space/: active pixel-space diffusion configs
    • configs/lat_space/: latent-space model/training/autoencoder configs
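The synthetic ablation in the list above can be pictured as sampling a fixed number of valid pixels from a dense field. This is a hedged, pure-Python sketch of that idea only: sparse_from_dense and its seed argument are hypothetical names, and the real dataset class may sample differently.

```python
import math
import random


def sparse_from_dense(y, pixel_count, seed=0):
    """Keep `pixel_count` randomly chosen valid pixels of y; set the rest to NaN."""
    rng = random.Random(seed)
    # Only non-NaN pixels of the dense field are candidates.
    valid = [(r, c) for r, row in enumerate(y)
             for c, v in enumerate(row) if not math.isnan(v)]
    keep = set(rng.sample(valid, min(pixel_count, len(valid))))
    return [[v if (r, c) in keep else math.nan for c, v in enumerate(row)]
            for r, row in enumerate(y)]


# Toy 2x3 "dense" field with one invalid pixel.
y = [[10.0, 11.0, math.nan], [12.0, 13.0, 14.0]]
x = sparse_from_dense(y, pixel_count=2)
observed = sum(not math.isnan(v) for row in x for v in row)
print(observed)  # 2
```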

DepthDif is a conditional diffusion model: it reconstructs dense GLORYS depth fields from sparse ARGO profile observations, conditioned on OSTIA surface SST plus coordinate/date context.

Ambient-occlusion training is available via model.ambient_occlusion.*. During training the model receives a further-corrupted sparse Argo input, while the loss is evaluated on the original x support intersected with the valid y support (x_valid_mask ∩ y_valid_mask). With the current x0 training preset, the model predicts the clean target on that masked support rather than on the old missing-pixel region. At inference time, both standard and ambient outputs are masked back to NaN wherever y_valid_mask==0; ambient mode does not overwrite predictions post hoc with observed x values when clamp_known_pixels=false. See docs/ambient-occlusion-objective.md for the full mathematical objective, figure walkthrough, and citation.
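The mask handling described above can be illustrated on flattened toy data. This is a sketch under stated assumptions, not the project's training code: a mean squared error is assumed as the loss, and masked_mse and mask_output are hypothetical helper names. See docs/ambient-occlusion-objective.md for the actual objective.

```python
import math


def masked_mse(pred, target, x_valid_mask, y_valid_mask):
    """Mean squared error over pixels where both masks are 1."""
    errors = [(p - t) ** 2
              for p, t, mx, my in zip(pred, target, x_valid_mask, y_valid_mask)
              if mx and my]  # x_valid_mask ∩ y_valid_mask
    return sum(errors) / len(errors) if errors else 0.0


def mask_output(pred, y_valid_mask):
    """Set predictions to NaN wherever y_valid_mask == 0."""
    return [p if my else math.nan for p, my in zip(pred, y_valid_mask)]


pred = [1.0, 2.0, 3.0, 4.0]
target = [1.0, 4.0, 0.0, 4.0]
x_valid_mask = [1, 1, 1, 0]
y_valid_mask = [1, 1, 0, 1]

loss = masked_mse(pred, target, x_valid_mask, y_valid_mask)
masked = mask_output(pred, y_valid_mask)
print(loss)    # 2.0 (only the first two pixels count)
print(masked)  # third entry is NaN
```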

Training

OSTIA + Argo NetCDF training:

/work/envs/depth/bin/python train.py \
  --data-config configs/px_space/data_ostia_argo_netcdf.yaml \
  --train-config configs/px_space/training_config.yaml \
  --model-config configs/px_space/model_config.yaml

Ambient-occlusion objective example:

/work/envs/depth/bin/python train.py \
  --data-config configs/px_space/data_ostia_argo_netcdf.yaml \
  --train-config configs/px_space/training_config.yaml \
  --model-config configs/px_space/model_config_ambient.yaml \
  --set training.wandb.run_name=ambient_ostia_argo_netcdf_v1
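The --set flag above takes a dotted key and a value. As an illustration of what such an override means for a nested config, here is a minimal sketch; apply_override is a hypothetical helper and is not how train.py necessarily parses the flag.

```python
def apply_override(config: dict, assignment: str) -> dict:
    """Apply 'a.b.c=value' to a nested dict in place and return it."""
    dotted_key, _, value = assignment.partition("=")
    *parents, leaf = dotted_key.split(".")
    node = config
    for key in parents:
        # Create intermediate sections when they do not exist yet.
        node = node.setdefault(key, {})
    node[leaf] = value  # kept as a string; a real parser may cast types
    return config


config = {"training": {"wandb": {"run_name": "default"}}}
apply_override(config, "training.wandb.run_name=ambient_ostia_argo_netcdf_v1")
print(config["training"]["wandb"]["run_name"])
```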

Notes:

  • --train-config and --training-config are equivalent.
  • Training outputs are written under logs/<timestamp>/ with best.ckpt and last.ckpt.
  • model.resume_checkpoint resumes full Lightning state; model.load_checkpoint warm-starts by loading only model weights.
  • Latent diffusion workflow configs live in configs/lat_space/; see docs/autoencoder.md for AE + latent setup and launch commands.
  • Latent launcher scripts: scripts/train_autoencoder.sh, scripts/train_latent_diffusion.sh.
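Because each run writes best.ckpt under its own logs/ subdirectory, a small helper can locate the newest one. This sketch assumes only the directory layout noted above; latest_checkpoint is a hypothetical name, and it orders runs by file modification time rather than by parsing the timestamp folder name, whose format is not specified here.

```python
import os
import tempfile
from pathlib import Path


def latest_checkpoint(logs_dir="logs", name="best.ckpt"):
    """Return the most recently modified <logs_dir>/<run>/<name>, or None."""
    candidates = list(Path(logs_dir).glob(f"*/{name}"))
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)


# Demonstration on a throwaway directory with two fake runs.
with tempfile.TemporaryDirectory() as tmp:
    empty_result = latest_checkpoint(tmp)
    for i, run in enumerate(["run_a", "run_b"]):
        ckpt = Path(tmp) / run / "best.ckpt"
        ckpt.parent.mkdir()
        ckpt.touch()
        os.utime(ckpt, (1_000 + i, 1_000 + i))  # run_b is newer
    found_run = latest_checkpoint(tmp).parent.name

print(empty_result, found_run)
```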

Inference

Public ISO-week inference API:

from depth_recon import run_week_inference

run_dir = run_week_inference(
    year=2015,
    iso_week=25,
    rectangle=(-20.0, 30.0, 10.0, 50.0),
    device="cuda",
    config_repo="simon-donike/DepthDif",
)

The public API downloads configs/checkpoints and the land mask from Hugging Face, downloads EN4/ARGO and, by default, OSTIA for the selected ISO week, and returns the GeoTIFF run directory. Pass auto_download_ostia=False without ostia_dir to run ARGO-only inference. GLORYS is not required for the standard public inference path; it is only needed for training or optional ground-truth comparison exports. OSTIA downloads use the Copernicus Marine CLI credentials configured in the environment, or credentials passed to run_week_inference via copernicus_username plus copernicus_token. The Copernicus Marine toolbox accepts that token through its password field, so copernicus_password remains supported as a backwards-compatible alias.
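The options described above can be combined as follows. This is a hedged sketch: the keyword names (auto_download_ostia, copernicus_username, copernicus_token) come from the text above, while build_inference_kwargs and the environment variable names DEPTHDIF_CMEMS_USERNAME and DEPTHDIF_CMEMS_TOKEN are hypothetical and exist only for this illustration.

```python
import os


def build_inference_kwargs(use_ostia=True, env=os.environ):
    """Assemble keyword arguments for run_week_inference."""
    kwargs = {
        "year": 2015,
        "iso_week": 25,
        "rectangle": (-20.0, 30.0, 10.0, 50.0),
        "device": "cuda",
        "config_repo": "simon-donike/DepthDif",
    }
    if not use_ostia:
        # ARGO-only inference: skip the OSTIA download and pass no ostia_dir.
        kwargs["auto_download_ostia"] = False
        return kwargs
    # Hypothetical variable names; otherwise the configured CLI credentials apply.
    username = env.get("DEPTHDIF_CMEMS_USERNAME")
    token = env.get("DEPTHDIF_CMEMS_TOKEN")
    if username and token:
        kwargs["copernicus_username"] = username
        kwargs["copernicus_token"] = token
    return kwargs


def main():
    # Imported lazily so the helper above works without the package installed.
    from depth_recon import run_week_inference

    run_dir = run_week_inference(**build_inference_kwargs(use_ostia=False))
    print(run_dir)
```

Calling main() triggers the downloads and inference described above, so it is left uncalled here.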

By default, the package uses simon-donike/DepthDif at revision main, model_config.yaml, and depthdif_v1.ckpt.

To fetch source files separately:

depth-recon-download-argo --year 2015 --iso-week 25 --output-dir ./en4_profiles
depth-recon-download-ostia --year 2015 --iso-week 25 --output-dir ./ostia
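For orientation, the calendar days covered by a given ISO year and week can be listed with the standard library. This sketch only shows the ISO 8601 calendar arithmetic; whether the download tools fetch exactly this Monday-to-Sunday window is an assumption, and iso_week_days is a hypothetical helper.

```python
from datetime import date, timedelta


def iso_week_days(year, iso_week):
    """Return the seven dates (Monday to Sunday) of an ISO week."""
    monday = date.fromisocalendar(year, iso_week, 1)
    return [monday + timedelta(days=offset) for offset in range(7)]


days = iso_week_days(2015, 25)
print(days[0], days[-1])  # 2015-06-15 2015-06-21
```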

Use inference/run_single.py:

  1. Set the config/checkpoint constants at the top of inference/run_single.py (MODEL_CONFIG_PATH, DATA_CONFIG_PATH, TRAIN_CONFIG_PATH, CHECKPOINT_PATH). For the active EO setup in this repository, use configs/px_space/model_config.yaml, configs/px_space/data_ostia_argo_netcdf.yaml, and configs/px_space/training_config.yaml.
  2. Choose MODE ("dataloader" or "random").
  3. Run:
/work/envs/depth/bin/python inference/run_single.py

For a full spatial export, use inference/export_global.py. It selects one exact daily snapshot from the configured patch dataset (directly or via ISO week/year), runs inference on every patch for that day, streams the accumulation to disk, and writes stitched prediction and GLORYS GeoTIFFs for Surface, 10m, 50m, 100m, 250m, 500m, 1000m, 2000m, 2500m, and 5000m under inference/outputs/global_top_band_<YYYYMMDD>/. Requested depths are mapped to the nearest GLORYS channel, and each TIFF records both the requested and actual source depth in metadata.

By default it also writes GeoJSON exports for observed Argo point locations, sampled full-profile locations with per-point graphs, and train/val patch squares. Pass --prediction-ensemble-runs 5 to average five stochastic predictions per patch before writing the GeoTIFFs consumed by the globe packager; the default 1 keeps the existing single-run behavior.
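The mapping from a requested depth to the nearest available channel, as described above, amounts to a minimum-distance lookup. The sketch below uses made-up channel depths purely for illustration (they are not the actual GLORYS levels), and nearest_channel is a hypothetical helper rather than the exporter's code.

```python
def nearest_channel(requested_m, channel_depths_m):
    """Return (index, depth) of the channel closest to the requested depth."""
    index = min(range(len(channel_depths_m)),
                key=lambda i: abs(channel_depths_m[i] - requested_m))
    return index, channel_depths_m[index]


# Illustrative values only; the real channel depths come from the dataset.
illustrative_depths_m = [0.5, 9.8, 47.2, 98.0, 245.0, 510.0, 1050.0, 1950.0, 2530.0, 4830.0]

for requested in (0.0, 100.0, 5000.0):
    index, actual = nearest_channel(requested, illustrative_depths_m)
    print(f"requested {requested} m -> channel {index} at {actual} m")
```

Recording both the requested and the actual depth, as the exporter does in TIFF metadata, keeps this substitution visible downstream.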

For a pooled validation-set depth summary, use inference/export_validation_error_summary.py. It loads the configured dataset val split, runs inference across the whole split, computes per-depth median absolute error against both GLORYS and the observed ARGO values, writes validation_error_by_depth.csv, and saves both a single-panel error graph and a two-panel median-profile/error figure under inference/outputs/validation_error_summary/ by default.

/work/envs/depth/bin/python inference/export_validation_error_summary.py \
  --data-config configs/px_space/data_ostia_argo_netcdf.yaml \
  --checkpoint logs/<run>/best.ckpt \
  --split val \
  --year 2015 \
  --iso-week 25 \
  --device cuda
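The per-depth median absolute error reported by this script can be understood through a small example. This is a sketch on toy numbers, not the script's implementation: median_abs_error_by_depth is a hypothetical helper, and skipping pairs that contain NaN is an assumption about how missing values are handled.

```python
import math
from statistics import median


def median_abs_error_by_depth(pred, ref):
    """Median |pred - ref| per depth label, ignoring pairs that contain NaN."""
    result = {}
    for depth, pred_values in pred.items():
        errors = [abs(p - r) for p, r in zip(pred_values, ref[depth])
                  if not (math.isnan(p) or math.isnan(r))]
        result[depth] = median(errors) if errors else math.nan
    return result


# Toy temperatures; keys mirror two of the exported depth labels.
pred = {"Surface": [20.0, 21.0, 19.0], "100m": [15.0, math.nan, 14.0]}
ref = {"Surface": [20.5, 20.0, 19.0], "100m": [14.0, 13.0, 14.5]}
summary = median_abs_error_by_depth(pred, ref)
print(summary)  # {'Surface': 0.5, '100m': 0.75}
```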

To package one exported run for the Cesium globe viewer in the docs, use:

/work/envs/depth/bin/python inference/export_cesium_globe_assets.py \
  --run-dir inference/outputs/global_top_band_<YYYYMMDD> \
  --public-base-url https://<bucket-or-site>/inference_production/globe/ \
  --rclone-remote r2:<bucket>/inference_production/globe \
  --rclone-sync-scope globe

The globe packager tiles every exported depth level into Cesium-ready folders and uploads those tiled assets, GeoJSON, graph PNGs, and globe-config.json when --rclone-sync-scope globe is used. Raw GeoTIFFs remain local in the run directory. The standalone viewer page lives at docs/globe/index.html and can load a hosted globe-config.json.

Experiment Script

Use experiments.py for quick qualitative ablations on a single dataloader sample. It loads the configured model and checkpoint, runs a few fixed conditioning cases (eo_plus_x, x_only_no_eo, coords_date_only_no_eo_no_x), saves comparison plots under temp/images/, and prints compact tensor statistics for each case.

Typical run:

/work/envs/depth/bin/python experiments.py

Before running, check the config and checkpoint constants at the top of experiments.py if you want a different model, dataset split, or checkpoint.

Documentation

  • Full documentation: docs/ (or build/serve with MkDocs).
  • Autoencoder + latent workflow guide: docs/autoencoder.md.
  • Experiments page: docs/experiments.md.
