RIC-TSC: Causal Time Series Modeling of Supraglacial Lake Evolution in Greenland under Distribution Shift

This repository provides the implementation for "Causal Time Series Modeling of Supraglacial Lake Evolution in Greenland under Distribution Shift", a paper accepted to ICMLA 2025. We introduce a regionally-informed causal framework that discovers lagged environmental drivers of supraglacial lake (SGL) evolution across Greenland and uses these causal signals for robust sequence modeling under spatial distribution shift.


Introduction

Supraglacial lakes (SGLs) exhibit complex spatiotemporal behaviors such as rapid drainage, slow drainage, refreezing, and burial. Accurate classification of lake evolution is critical to understanding meltwater runoff and ice sheet stability.

This repository presents a causally-informed modeling framework that identifies invariant environmental drivers across Greenland using Joint PCMCI+ (J-PCMCI+), and also captures region-specific causal mechanisms in individual basins. These causal predictors are then used in downstream sequence modeling to improve robustness and generalization under distribution shifts. We assess performance in global, in-distribution (ID), and out-of-distribution (OOD) settings.


Methodology

We construct daily multivariate time series from satellite and reanalysis sources:

  • Sentinel-1 SAR (HV backscatter anomaly)
  • Sentinel-2 and Landsat-8 optical imagery (NDWI-based water fraction, solar zenith)
  • CARRA-West reanalysis (temperature, humidity, pressure, SST, etc.)
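
As an illustration of how sources with different revisit intervals could be placed on a common daily grid, here is a minimal pandas sketch. The helper name `to_daily`, the column names (`HV`, `t2m`), the toy values, and the gap-filling rule are all assumptions made for this example; the repository's actual preprocessing (`src/rictsc/preprocessing.py`) may differ.

```python
import pandas as pd

def to_daily(frames, start, end, max_gap_days=3):
    """Join per-source DataFrames on a shared daily index.

    frames: dict mapping a source name to a DataFrame indexed by timestamp.
    At most `max_gap_days` consecutive missing days between observations are
    filled by linear interpolation (an assumption made for this sketch).
    """
    daily_index = pd.date_range(start, end, freq="D")
    columns = []
    for name, frame in frames.items():
        # Average same-day acquisitions, then place them on the daily grid.
        daily = frame.resample("D").mean().reindex(daily_index)
        columns.append(daily.add_prefix(f"{name}_"))
    merged = pd.concat(columns, axis=1)
    return merged.interpolate(method="linear", limit=max_gap_days, limit_area="inside")

# Toy inputs: SAR every few days, reanalysis daily (values are made up).
sar = pd.DataFrame(
    {"HV": [-20.0, -18.0]},
    index=pd.to_datetime(["2019-07-01", "2019-07-05"]),
)
carra = pd.DataFrame(
    {"t2m": [1.0, 2.0, 3.0, 2.0, 1.0]},
    index=pd.date_range("2019-07-01", periods=5, freq="D"),
)

daily = to_daily({"s1": sar, "carra": carra}, "2019-07-01", "2019-07-05")
```

The result has one row per day and one prefixed column per source variable, which is the shape a lagged causal analysis expects.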

J-PCMCI+ is applied globally and per region to identify lagged causal parents of HV_anom (horizontally transmitted, vertically received backscatter anomaly), a proxy for lake water presence. These causal features are then used for lake evolution classification.
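
To make the step from discovered causal parents to classifier inputs concrete, the sketch below builds lagged feature columns with pandas. The helper name `build_lagged_features`, the parent list, and the toy data are hypothetical; only the `<variable>_lag<k>` naming follows the `HV_anom_lag1` column used in the Quickstart. The sketch does not run J-PCMCI+ itself.

```python
import pandas as pd

def build_lagged_features(series, parents):
    """Return one column per (variable, lag) pair, so row t holds the value at t - lag.

    series: DataFrame of daily observations for a single lake, ordered by date.
    parents: list of (variable, lag_in_days) tuples, e.g. from a causal discovery run.
    """
    features = {}
    for variable, lag in parents:
        name = variable if lag == 0 else f"{variable}_lag{lag}"
        features[name] = series[variable].shift(lag)
    # Drop rows with any missing value, which includes the leading rows that lack history.
    return pd.DataFrame(features, index=series.index).dropna()

# Toy daily series for one lake (values are made up).
lake = pd.DataFrame(
    {"HV_anom": [0.1, 0.3, 0.2, 0.5], "t2m": [1.0, 2.0, 3.0, 4.0]},
    index=pd.date_range("2019-07-01", periods=4, freq="D"),
)

# Hypothetical causal parents of HV_anom: itself at lag 1 and temperature at lag 2.
X = build_lagged_features(lake, [("HV_anom", 1), ("t2m", 2)])
```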

Figure: RIC-TSC methodology.


Installation

Install the package in editable mode for development:

git clone https://github.com/ehfahad/RIC-TSC.git
cd RIC-TSC
pip install -e .

Directory Structure

RIC-TSC/
├── src/rictsc/                        # Core package logic
│   ├── utils/                         # Refactored helper functions
│   ├── preprocessing.py               # Preprocessing module
│   ├── causality.py                   # Causal feature module
│   └── classification.py              # RICTSCClassifier API
├── causality/                         # J-PCMCI+ causal discovery notebooks
├── data/                              # Raw, processed, and causal datasets
├── figures/                           # Methodology diagrams and experiment visualizations
├── results/                           # Output metrics, confusion matrices, GMM plots
├── tests/                             # Package sanity tests
├── pyproject.toml                     # Package metadata and dependencies
├── run_global_classification.py       # Global pooled classification script
└── run_regionwise_classification.py   # Region-wise ID and OOD classification script

Quickstart

1. Command Line Interface

Run the pipeline directly from the terminal using the installed entry points:

# Step 1: Preprocess time series for all lakes
rictsc-preprocess

# Step 2: Extract region-specific causal datasets
rictsc-causal

2. Python API

Integrate the RIC-TSC classifier into your own scripts:

import pandas as pd
from rictsc import RICTSCClassifier

# Causal features used for both fitting and prediction
feature_cols = ["HV_anom_lag1", "S2_water", "r2"]

# Initialize the classifier
model = RICTSCClassifier(seed=42)

# Load data and fit model on causal features
df = pd.read_csv("data/region_causal_datasets/CW_causal_timeseries.csv")
model.fit(df, feature_cols=feature_cols, label_col="label")

# Predict on new sequences
# (test_df is not defined here: supply a DataFrame of held-out sequences with the same feature columns)
predictions = model.predict(test_df, feature_cols=feature_cols)

Output Structure

results/
├── global_classification/
│   └── global_classification_results.csv  # Metrics for global experiment comparing causal vs. baseline models
│
├── region_specific_classification/
│   ├── id_results.csv                     # Region-wise ID results comparing causal vs. baseline models
│   └── ood_results.csv                    # OOD results where models are trained on one region and tested on the other five

Experiments

We evaluate RIC-TSC under three experimental settings:

  • Global: Train/test on pooled lake data from all six regions using an 80/20 split stratified by region.
  • In-Distribution (ID): For each region, an 80/20 train/test split is applied to that region’s lakes.
  • Out-of-Distribution (OOD): Train on a single region and test on the remaining five, assessing generalization beyond the training domain.
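
The three settings differ only in how lakes are assigned to train and test sets. The following sketch shows one way to produce such splits from a list of `(lake_id, region)` pairs using only the standard library. The function names and the region codes other than `CW` are placeholders, and the repository's scripts (`run_global_classification.py`, `run_regionwise_classification.py`) may implement the splits differently.

```python
import random
from collections import defaultdict

def split_80_20(lake_ids, seed=42):
    """Shuffle lake ids and return (train, test) with roughly 80% in train."""
    ids = sorted(lake_ids)
    random.Random(seed).shuffle(ids)
    cut = int(round(0.8 * len(ids)))
    return ids[:cut], ids[cut:]

def group_by_region(lakes):
    """Map each region code to the list of its lake ids."""
    by_region = defaultdict(list)
    for lake_id, region in lakes:
        by_region[region].append(lake_id)
    return by_region

def global_split(lakes, seed=42):
    """Pooled split stratified by region: every region contributes 80/20."""
    train, test = [], []
    for _, ids in sorted(group_by_region(lakes).items()):
        region_train, region_test = split_80_20(ids, seed)
        train += region_train
        test += region_test
    return train, test

def id_splits(lakes, seed=42):
    """One independent 80/20 split per region."""
    return {r: split_80_20(ids, seed) for r, ids in group_by_region(lakes).items()}

def ood_splits(lakes):
    """Train on all lakes of one region, test on the lakes of every other region."""
    by_region = group_by_region(lakes)
    splits = {}
    for region, ids in by_region.items():
        held_out = [i for other, other_ids in by_region.items()
                    if other != region for i in other_ids]
        splits[region] = (list(ids), held_out)
    return splits

# Toy catalogue: 10 lakes in each of three regions (R2 and R3 are placeholder codes).
lakes = [(f"{r}-{i}", r) for r in ("CW", "R2", "R3") for i in range(10)]
g_train, g_test = global_split(lakes)
id_parts = id_splits(lakes)
ood_parts = ood_splits(lakes)
```

Splitting at the lake level rather than the time-step level keeps all observations of one lake on the same side of the split, which avoids leaking a lake's history into the test set.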

Each setting compares two models:

  • Causal Model: Trained only on the lagged causal parents discovered by J-PCMCI+ for each region.
  • Baseline Model: Trained using all available features, with no causal feature selection or temporal lag filtering.

Performance is reported using overall accuracy, macro-averaged F1, precision, and recall.
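
For reference, macro averaging computes each metric per class and then takes the unweighted mean, so rare evolution classes count as much as common ones. A minimal pure-Python sketch follows; the function name and the toy labels are illustrative, and the repository may instead rely on a library implementation.

```python
def macro_scores(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1."""
    classes = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        # A ratio with a zero denominator is set to 0 (a common convention).
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(classes)
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }

# Toy labels for four lake-evolution classes (values are made up).
y_true = ["rapid", "rapid", "slow", "refreeze", "buried", "buried"]
y_pred = ["rapid", "slow", "slow", "refreeze", "buried", "rapid"]
scores = macro_scores(y_true, y_pred)
```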


Citation

This work has been accepted to ICMLA 2025. Please cite as:

@misc{hossain2025rictsc,
  title={Causal Time Series Modeling of Supraglacial Lake Evolution in Greenland under Distribution Shift},
  author={Emam Hossain and Muhammad Hasan Ferdous and Devon Dunmire and Aneesh Subramanian and Md Osman Gani},
  year={2025},
  note={Accepted for publication in 2025 International Conference on Machine Learning and Applications (ICMLA)}
}
