
PyTorch TIEOF: higher-order tensor (PARAFAC/HOOI/HOSVD) DINEOF reconstruction for missing satellite data


torch-ieof: torch-based implementation of tieof

This is a slimmed-down, PyTorch-backed rewrite of the original TIEOF package. All three of the paper's higher-order tensor decompositions — PARAFAC (CP-ALS), HOOI (iterative Tucker), and TruncHOSVD (closed-form Tucker) — are re-implemented in pure PyTorch with no tensorly, no oct2py, and no ray. A classical 2-D DINEOF estimator (iterative truncated SVD) is also included for comparison. Runs on numpy>=2 / scikit-learn>=1.5 / torch>=2.2 under Python 3.12+. The legacy GHER Fortran bridge, CLI scripts, and interpolator/ package were dropped (they live on the main_backup branch).

📖 Richer docs: open README.html for a tabbed reference that summarises the three decomposition methods and the DINEOF reconstruction loop as described in the paper.

Install

pip install -e .            # or:  uv pip install -e .
pip install -e ".[dev]"     # with pytest

Or with uv

uv add torch-ieof

Dependencies: numpy>=2, scipy>=1.13, scikit-learn>=1.5, torch>=2.2, tqdm, loguru.

Usage

Raw numpy tensor

import numpy as np
from torch_ieof import DINEOF3

shape = (n_lat, n_lon, n_time)
tensor = ...                # np.ndarray, NaN for missing values
model = DINEOF3(R=3, tensor_shape=shape)   # decomp_type="parafac" (default)
model._fit(tensor)
filled = model.reconstructed_tensor
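
To try the snippet above without real satellite data, a synthetic input can be built with NumPy alone. The following is only a sketch: the grid sizes, the seed, and the 30% gap fraction are arbitrary assumptions, and it illustrates nothing beyond the NaN-for-missing convention.

```python
import numpy as np

rng = np.random.default_rng(0)
n_lat, n_lon, n_time = 20, 30, 40          # illustrative sizes (assumption)

# Smooth rank-1 "truth": a spatial pattern modulated by a periodic signal.
lat = np.linspace(-1.0, 1.0, n_lat)
lon = np.linspace(-1.0, 1.0, n_lon)
t = np.linspace(0.0, 2.0 * np.pi, n_time)
truth = (
    np.exp(-lat**2)[:, None, None]
    * np.cos(lon)[None, :, None]
    * np.sin(t)[None, None, :]
)

# Knock out roughly 30% of the cells at random and mark them as NaN.
missing = rng.random(truth.shape) < 0.3
tensor = truth.astype(np.float32)          # astype returns a copy
tensor[missing] = np.nan

shape = tensor.shape                       # (n_lat, n_lon, n_time)
```

The resulting `tensor` and `shape` can then be passed to `DINEOF3` exactly as in the usage example, and `truth[missing]` gives the values to compare the filled cells against.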

Choosing a decomposition (decomp_type)

DINEOF3 exposes all three engines from the paper. Pick with decomp_type:

DINEOF3(R=3, tensor_shape=shape, decomp_type="parafac")      # CP-ALS (default)
DINEOF3(R=(4, 4, 6), tensor_shape=shape, decomp_type="hooi") # iterative Tucker
DINEOF3(R=4, tensor_shape=shape, decomp_type="trunchosvd")   # closed-form Tucker
  • parafac — CP/PARAFAC via alternating least squares; R is a single integer rank shared across all modes. Most parsimonious and interpretable.
  • hooi — Higher-Order Orthogonal Iteration (Tucker-ALS). R may be an int (broadcast to every mode) or a per-mode tuple (R_lat, R_lon, R_time).
  • trunchosvd — truncated HOSVD: the closed-form Tucker initialiser. Cheapest, but lower quality (no ALS refinement). R as for hooi.
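
To make the "closed-form Tucker" idea concrete, here is a minimal NumPy sketch of a truncated HOSVD on a complete (gap-free) 3-D array. It is not the package's PyTorch code; the helper names `unfold`, `trunc_hosvd`, and `tucker_to_tensor` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def unfold(x, mode):
    """Mode-`mode` unfolding: the chosen axis becomes the rows."""
    return np.moveaxis(x, mode, 0).reshape(x.shape[mode], -1)

def trunc_hosvd(x, ranks):
    """Truncated HOSVD of a 3-D array: one SVD per mode, no iteration."""
    if isinstance(ranks, int):                 # an int is broadcast to every mode
        ranks = (ranks,) * x.ndim
    factors = []
    for mode, r in enumerate(ranks):
        # Leading left singular vectors of the mode unfolding.
        u, _, _ = np.linalg.svd(unfold(x, mode), full_matrices=False)
        factors.append(u[:, :r])
    # Core tensor: x projected onto each factor basis.
    core = np.einsum("ijk,ia,jb,kc->abc", x, *factors)
    return core, factors

def tucker_to_tensor(core, factors):
    """Rebuild the full array from a Tucker core and its three factors."""
    return np.einsum("abc,ia,jb,kc->ijk", core, *factors)
```

HOOI starts from factors like these and refines them by alternating updates, which is why the truncated HOSVD is cheaper but usually a slightly worse fit.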

In the paper the three variants perform within each other's confidence intervals — the gain over classical DINEOF comes from working in the full 3-D feature space, not from the specific decomposition.

After fitting, model.predict_rank(k) reconstructs the tensor using only the first k components (an int for PARAFAC, an int or per-mode tuple for Tucker).
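
What "the first k components" means can be sketched in NumPy as follows. This assumes the factors are stored as matrices with one column per component; how the package stores and orders them is not stated here, and the function names are hypothetical.

```python
import numpy as np

def cp_first_k(a, b, c, k):
    """Sum of the first k rank-1 terms built from columns of a, b, c."""
    return np.einsum("ir,jr,kr->ijk", a[:, :k], b[:, :k], c[:, :k])

def tucker_first_k(core, factors, ranks):
    """Tucker reconstruction from the leading block of the core tensor."""
    if isinstance(ranks, int):                 # an int applies to every mode
        ranks = (ranks,) * core.ndim
    sub_core = core[tuple(slice(0, r) for r in ranks)]
    cols = [f[:, :r] for f, r in zip(factors, ranks)]
    return np.einsum("abc,ia,jb,kc->ijk", sub_core, *cols)
```

For CP a single integer is enough because every mode shares the same rank; for Tucker each mode can be truncated independently, hence the per-mode tuple.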

xarray DataArray (single variable)

from torch_ieof import reconstruct_dataarray

rec, model = reconstruct_dataarray(
    sst_da, R=3,
    lat_dim="lat", lon_dim="lon", time_dim="time",
    mask=land_mask,            # optional xr.DataArray or ndarray (lat, lon) bool
    to_center=True, nitemax=80, toliter=1e-4,
)
# rec has the same dim order, coords, and attrs as sst_da.

xarray Dataset — multivariate joint reconstruction

When one variable has heavy cloud cover but a related variable (sharing temporal dynamics) is more complete, jointly reconstructing them couples the variables through a shared temporal factor (Alvera-Azcárate-style multivariate DINEOF). Cloud gaps in the sparse variable are constrained by simultaneous observations in the others.

from torch_ieof import reconstruct_dataset

ds_recon, model = reconstruct_dataset(
    ds,                        # xr.Dataset of vars sharing (lat, lon, time)
    R=2 * 3,                   # joint rank (e.g. n_vars * per-var rank); typically larger than the per-var rank
    variables=["sst", "chl"],  # which vars to couple (default: all)
    masks={"sst": land_mask},  # optional per-var masks
    nitemax=120, toliter=1e-6,
)

Each variable is z-scored before stacking along the latitude axis, fit with a single CP decomposition, then split back and denormalised. to_center=False is the default for the joint path (z-scoring handles centering).
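
The normalise, stack, split, and denormalise steps can be sketched in NumPy. This is an illustration under assumptions: it uses one global mean and standard deviation per variable (ignoring NaNs), which may differ from the statistics the package actually computes, and the function names are hypothetical.

```python
import numpy as np

def zscore_and_stack(fields):
    """fields: dict of name -> (n_lat, n_lon, n_time) array with NaN gaps."""
    stats, blocks = {}, []
    for name, x in fields.items():
        mu, sigma = np.nanmean(x), np.nanstd(x)   # NaN-aware statistics
        stats[name] = (mu, sigma, x.shape[0])
        blocks.append((x - mu) / sigma)
    # Stack along latitude so all variables share the lon and time factors.
    return np.concatenate(blocks, axis=0), stats

def split_and_denormalise(stacked, stats):
    """Undo zscore_and_stack: slice each variable out and restore its units."""
    out, start = {}, 0
    for name, (mu, sigma, n_lat) in stats.items():
        out[name] = stacked[start:start + n_lat] * sigma + mu
        start += n_lat
    return out
```

Z-scoring keeps a variable with large numeric values (for example SST in kelvin) from dominating the least-squares fit over one with small values (for example chlorophyll concentration).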

Classical 2-D DINEOF (truncated-SVD baseline)

For comparison, DINEOF implements the original 2-D method (Beckers & Rixen 2003): restrict to valid ocean pixels, reshape to a (space, time) matrix, and alternate truncated-SVD reconstruction with re-imputation of the gaps.

from torch_ieof import DINEOF

model = DINEOF(R=5, tensor_shape=shape, mask=ocean_mask)
model._fit(tensor)               # same NaN-for-missing convention as DINEOF3
filled = model.reconstructed_tensor

DINEOF exposes the same sklearn-style fit/predict/score API and predict_rank(k) helper as DINEOF3.
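
The alternation between truncated SVD and gap re-imputation can be sketched in a few lines of NumPy. This is a simplified illustration, not the package's implementation: it starts from zeros in the gaps, uses a fixed rank, and omits mean removal and the ocean-pixel masking; the function name `dineof_2d` is hypothetical.

```python
import numpy as np

def dineof_2d(matrix, rank, n_iter=200, tol=1e-8):
    """Fill NaNs in a (space, time) matrix by iterated truncated SVD."""
    gaps = np.isnan(matrix)
    if not gaps.any():
        return matrix.copy()
    filled = np.where(gaps, 0.0, matrix)        # first guess: zeros in the gaps
    for _ in range(n_iter):
        u, s, vt = np.linalg.svd(filled, full_matrices=False)
        low_rank = (u[:, :rank] * s[:rank]) @ vt[:rank]
        # RMS change of the imputed values between two iterations.
        change = np.sqrt(np.mean((low_rank[gaps] - filled[gaps]) ** 2))
        filled[gaps] = low_rank[gaps]           # observed cells are never touched
        if change < tol:
            break
    return filled
```

Only the gap cells are overwritten on each pass, so the observed data act as a fixed constraint while the low-rank estimate of the missing cells is refined.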

sklearn-style fit/predict

# X: (N, 3) integer coords (lat_idx, lon_idx, t_idx)
# y: (N,) values (NaN allowed for missing points)
model = DINEOF3(R=3, tensor_shape=(n_lat, n_lon, n_time))
model.fit(X, y)
y_pred = model.predict(X_test)
score = model.score(X_test, y_test)   # negative NRMSE
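
If the data start out as a dense array, the (X, y) coordinate form described in the comments above can be produced with NumPy. The two helper names below are hypothetical and not part of the package.

```python
import numpy as np

def tensor_to_coords(tensor):
    """Dense (n_lat, n_lon, n_time) array -> (X, y) for the observed cells."""
    observed = ~np.isnan(tensor)
    X = np.argwhere(observed)          # (N, 3) integer (lat_idx, lon_idx, t_idx)
    y = tensor[observed]               # (N,) values, same row-major order as X
    return X, y

def coords_to_tensor(X, y, shape):
    """Scatter (X, y) back into a dense array, NaN where nothing was given."""
    tensor = np.full(shape, np.nan)
    tensor[X[:, 0], X[:, 1], X[:, 2]] = y
    return tensor
```

Holding out a random subset of the rows of `X` and `y` before fitting gives a simple validation set for `model.score`.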

Key options:

  • R — rank. Int (CP rank, or broadcast Tucker rank) or per-mode tuple for Tucker variants.
  • decomp_type: "parafac" (default), "hooi", or "trunchosvd".
  • mask: (n_lat, n_lon) boolean array (or .npy path). True = inside the investigated area. Cells outside are zeroed during fitting and set to NaN in the output.
  • to_center / lat_lon_sep_centering — centre the tensor before fitting.
  • keep_non_negative_only — clamp negative reconstructed values to 0.
  • early_stopping — stop on absolute or gradient convergence (the paper's early-stopping mode); else stop on absolute error only.
  • nitemax, toliter — outer reconstruction loop budget / tolerance.
  • td_iter_max, tol — inner decomposition (CP-ALS / HOOI) budget / tolerance.
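
The mask and keep_non_negative_only semantics described above can be pictured with a small NumPy sketch. It only mirrors the stated behaviour (zero outside the area while fitting, NaN outside in the output, optional clamp at 0); the function names are hypothetical and the package may order these steps differently.

```python
import numpy as np

def apply_mask_for_fit(tensor, mask):
    """Zero every cell outside the (n_lat, n_lon) area of interest."""
    return np.where(mask[:, :, None], tensor, 0.0)

def finalise_output(reconstruction, mask, keep_non_negative_only=False):
    """Clamp negatives if requested, then blank out cells outside the mask."""
    out = np.array(reconstruction, dtype=float)   # work on a float copy
    if keep_non_negative_only:
        out = np.clip(out, 0.0, None)
    out[~mask] = np.nan      # a 2-D boolean index applies to every time step
    return out
```

Clamping at zero is useful for physically non-negative quantities such as chlorophyll concentration, where a low-rank fit can otherwise dip slightly below zero.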

Performance, logging & progress

The hot loop runs in PyTorch. By default the device is cuda if available, otherwise the CPU. MPS is skipped automatically because per-iteration CPU fallbacks for SVD / pinv make it slower than the CPU at typical CP ranks. Override with device="mps" for very large tensors, or pass dtype=torch.float64 for tighter numerics; the default dtype is float32.

Each fit logs a single INFO line at the start (shape, R, device, dtype, missing fraction) and one at the end (iters, final error). Iteration progress is shown via a live tqdm bar with err and Δerr in the postfix. Disable the bar with progress=False if you're piping output to a file or running in CI:

DINEOF3(R=4, tensor_shape=shape, progress=False)
reconstruct_dataarray(da, R=4, progress=False)

To suppress the INFO logs too:

from loguru import logger
logger.disable("torch_ieof")

Tests

pytest -q

Covers Kolda-convention unfolding, Khatri-Rao, CP reconstruction, end-to-end recovery of a noisy synthetic rank-R tensor, and a cloud-like patchy-mask case with a moving spatial feature.
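
For readers unfamiliar with the identities these tests rely on, the following NumPy sketch shows a Kolda-convention unfolding and a Khatri-Rao product, and how they relate to a CP reconstruction. It is an independent illustration, not a copy of the package's test suite, and the helper names are hypothetical.

```python
import numpy as np

def unfold(x, mode):
    """Kolda-convention unfolding: remaining indices run earliest-fastest."""
    return np.reshape(np.moveaxis(x, mode, 0), (x.shape[mode], -1), order="F")

def khatri_rao(c, b):
    """Column-wise Kronecker product; row k * J + j holds c[k, :] * b[j, :]."""
    return (c[:, None, :] * b[None, :, :]).reshape(-1, b.shape[1])

rng = np.random.default_rng(0)
a = rng.normal(size=(4, 3))
b = rng.normal(size=(5, 3))
c = rng.normal(size=(6, 3))

# CP reconstruction: x[i, j, k] = sum_r a[i, r] * b[j, r] * c[k, r]
x = np.einsum("ir,jr,kr->ijk", a, b, c)

# Each unfolding factorises as (own factor) @ (Khatri-Rao of the others).T
ok_mode0 = np.allclose(unfold(x, 0), a @ khatri_rao(c, b).T)
ok_mode1 = np.allclose(unfold(x, 1), b @ khatri_rao(c, a).T)
ok_mode2 = np.allclose(unfold(x, 2), c @ khatri_rao(b, a).T)
```

These three identities are what turn each CP-ALS factor update into an ordinary linear least-squares problem.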

Examples

Install the extra dependencies (matplotlib, xarray) and run:

pip install -e ".[examples]"
python examples/timeseries_demo.py   # SST-like field, cloud-blob gaps, saves PNG
python examples/xarray_workflow.py   # round-trip via xarray.DataArray

examples/xarray_workflow.py includes a small reconstruct_dataarray() helper showing how to wrap DINEOF3 for an xarray.DataArray with arbitrary dim order — useful as a template for plugging into an existing oceanographic workflow.

Credit, citing & license

The TIEOF method and its original implementation are the work of Kulikov, Inkova, Cherniuk, Teslyuk & Namsaraev. This package is a PyTorch repackaging; all scientific credit belongs to them. If you use this in academic work, please cite the original paper:

Kulikov, L.; Inkova, N.; Cherniuk, D.; Teslyuk, A.; Namsaraev, Z. TIEOF: Algorithm for Recovery of Missing Multidimensional Satellite Data on Water Bodies Based on Higher-Order Tensor Decompositions. Water 2021, 13(18), 2578. https://doi.org/10.3390/w13182578

Original code: https://github.com/theleokul/tieof.

Licensed under CC BY 4.0 — free to use, share and adapt with attribution to the authors above. See LICENSE.
