PyTorch TIEOF: higher-order tensor (PARAFAC/HOOI/HOSVD) DINEOF reconstruction for missing satellite data

torch-ieof: torch-based implementation of tieof

This is a slimmed-down, PyTorch-backed rewrite of the original TIEOF package. All three of the paper's higher-order tensor decompositions, PARAFAC (CP-ALS), HOOI (iterative Tucker), and TruncHOSVD (closed-form Tucker), are re-implemented in pure PyTorch, with no dependency on tensorly, oct2py, or ray. A classical 2-D DINEOF estimator (iterative truncated SVD) is also included for comparison. The package requires Python 3.12+ with numpy>=2, scikit-learn>=1.5, and torch>=2.2. The legacy GHER Fortran bridge, CLI scripts, and interpolator/ package were dropped; they remain available on the main_backup branch.

📖 Richer docs: open README.html for a tabbed reference that summarises the three decomposition methods and the DINEOF reconstruction loop as described in the paper.

Install

pip install -e .            # or:  uv pip install -e .
pip install -e ".[dev]"     # with pytest

Or add the published package to a uv-managed project:

uv add torch-ieof

Dependencies: numpy>=2, scipy>=1.13, scikit-learn>=1.5, torch>=2.2, tqdm, loguru.

Usage

Raw numpy tensor

import numpy as np
from torch_ieof import DINEOF3

shape = (n_lat, n_lon, n_time)
tensor = ...                # np.ndarray, NaN for missing values
model = DINEOF3(R=3, tensor_shape=shape)   # decomp_type="parafac" (default)
model._fit(tensor)
filled = model.reconstructed_tensor
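
To try the snippet above without real satellite data, a synthetic input can be built with NumPy alone. The sketch below is not part of the package: the helper name make_gappy_tensor, the tensor sizes, and the 30% gap fraction are assumptions chosen for illustration. It only demonstrates the NaN-for-missing convention on a (lat, lon, time) array.

```python
import numpy as np

def make_gappy_tensor(shape, rank=3, missing_frac=0.3, seed=0):
    """Hypothetical helper: low-rank (lat, lon, time) tensor with NaN gaps."""
    rng = np.random.default_rng(seed)
    n_lat, n_lon, n_time = shape
    # Rank-`rank` CP structure: sum over r of a_r (outer) b_r (outer) c_r.
    A = rng.standard_normal((n_lat, rank))
    B = rng.standard_normal((n_lon, rank))
    C = rng.standard_normal((n_time, rank))
    truth = np.einsum("ir,jr,kr->ijk", A, B, C)
    # Mark a random subset of cells as missing by writing NaN into them.
    gaps = rng.random(shape) < missing_frac
    tensor = truth.copy()
    tensor[gaps] = np.nan
    return tensor, truth, gaps

shape = (20, 30, 40)
tensor, truth, gaps = make_gappy_tensor(shape)
print(f"missing fraction: {np.isnan(tensor).mean():.2f}")
```

The returned tensor can be passed to the model exactly as in the snippet above, and truth[gaps] gives a reference for judging the filled values.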

Choosing a decomposition (decomp_type)

DINEOF3 exposes all three engines from the paper; select one with decomp_type:

DINEOF3(R=3, tensor_shape=shape, decomp_type="parafac")      # CP-ALS (default)
DINEOF3(R=(4, 4, 6), tensor_shape=shape, decomp_type="hooi") # iterative Tucker
DINEOF3(R=4, tensor_shape=shape, decomp_type="trunchosvd")   # closed-form Tucker
  • parafac — CP/PARAFAC via alternating least squares; R is a single integer rank shared across all modes. Most parsimonious and interpretable.
  • hooi — Higher-Order Orthogonal Iteration (Tucker-ALS). R may be an int (broadcast to every mode) or a per-mode tuple (R_lat, R_lon, R_time).
  • trunchosvd — truncated HOSVD: the closed-form Tucker initialiser. Cheapest, but lower quality (no ALS refinement). R as for hooi.
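
What R means for each family can be made concrete with a small NumPy sketch. It is independent of the package's PyTorch code, and every variable name in it is illustrative. A CP model stores one factor matrix per mode with a shared rank, whereas a Tucker model stores per-mode factor matrices plus a core tensor that couples them.

```python
import numpy as np

rng = np.random.default_rng(0)
dims = (8, 9, 10)                      # (n_lat, n_lon, n_time)

# CP / PARAFAC: a single integer rank R shared by all three modes.
R = 3
A, B, C = (rng.standard_normal((n, R)) for n in dims)
cp_tensor = np.einsum("ir,jr,kr->ijk", A, B, C)

# Tucker (HOOI / truncated HOSVD): per-mode ranks and a dense core.
ranks = (4, 4, 6)                      # (R_lat, R_lon, R_time)
U_lat, U_lon, U_time = (
    rng.standard_normal((n, r)) for n, r in zip(dims, ranks)
)
core = rng.standard_normal(ranks)
tucker_tensor = np.einsum("abc,ia,jb,kc->ijk", core, U_lat, U_lon, U_time)

# Number of stored parameters in each model.
cp_params = R * sum(dims)
tucker_params = core.size + sum(n * r for n, r in zip(dims, ranks))
print(cp_params, tucker_params)
```

The parameter counts illustrate why CP is described as the most parsimonious option: it has no core tensor, so its size grows only linearly in R.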

In the paper the three variants perform within each other's confidence intervals — the gain over classical DINEOF comes from working in the full 3-D feature space, not from the specific decomposition.

After fitting, model.predict_rank(k) reconstructs the tensor using only the first k components (an int for PARAFAC, an int or per-mode tuple for Tucker).
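
One plausible reading of "first k components" is sketched below in NumPy. It assumes components are stored as columns of the factor matrices, which is a common layout but is not confirmed here for this package; cp_first_k and tucker_first_k are hypothetical names, not package functions.

```python
import numpy as np

def cp_first_k(factors, k):
    """Rebuild a 3-D tensor from the first k CP components."""
    A, B, C = (F[:, :k] for F in factors)
    return np.einsum("ir,jr,kr->ijk", A, B, C)

def tucker_first_k(core, factors, k):
    """Rebuild from the leading columns per mode; k is an int or a 3-tuple."""
    k = (k,) * 3 if isinstance(k, int) else tuple(k)
    sub_core = core[: k[0], : k[1], : k[2]]
    U0, U1, U2 = (U[:, :kn] for U, kn in zip(factors, k))
    return np.einsum("abc,ia,jb,kc->ijk", sub_core, U0, U1, U2)

rng = np.random.default_rng(1)
factors = [rng.standard_normal((n, 3)) for n in (5, 6, 7)]
core = rng.standard_normal((3, 3, 3))

full = cp_first_k(factors, 3)
# CP components are additive: component 1 plus components 2..3 gives the full model.
first = cp_first_k(factors, 1)
rest = cp_first_k([F[:, 1:] for F in factors], 2)
partial_tucker = tucker_first_k(core, factors, (2, 3, 1))
```

Because CP components simply add up, truncating to k components drops whole rank-one terms; in Tucker the truncation also discards the corresponding slices of the core.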

xarray DataArray (single variable)

from torch_ieof import reconstruct_dataarray

rec, model = reconstruct_dataarray(
    sst_da, R=3,
    lat_dim="lat", lon_dim="lon", time_dim="time",
    mask=land_mask,            # optional xr.DataArray or ndarray (lat, lon) bool
    to_center=True, nitemax=80, toliter=1e-4,
)
# rec has the same dim order, coords, and attrs as sst_da.

xarray Dataset — multivariate joint reconstruction

When one variable has heavy cloud cover but a related variable (sharing temporal dynamics) is more complete, jointly reconstructing them couples the variables through a shared temporal factor (Alvera-Azcárate-style multivariate DINEOF). Cloud gaps in the sparse variable are constrained by simultaneous observations in the others.

from torch_ieof import reconstruct_dataset

ds_recon, model = reconstruct_dataset(
    ds,                        # xr.Dataset of vars sharing (lat, lon, time)
    R=2 * 3,                   # rank — typically larger than per-var rank
    variables=["sst", "chl"],  # which vars to couple (default: all)
    masks={"sst": land_mask},  # optional per-var masks
    nitemax=120, toliter=1e-6,
)

Each variable is z-scored before stacking along the latitude axis, fit with a single CP decomposition, then split back and denormalised. to_center=False is the default for the joint path (z-scoring handles centering).
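
The normalise, stack, split, and denormalise steps can be sketched in NumPy as follows. This is an illustration rather than the package's code: the function names are hypothetical, and using one global mean and standard deviation per variable (computed over observed cells only) is an assumption.

```python
import numpy as np

def stack_zscored(fields):
    """Z-score each (lat, lon, time) array over observed cells, stack along lat."""
    stats, scaled = [], []
    for f in fields:
        mu, sigma = np.nanmean(f), np.nanstd(f)
        stats.append((mu, sigma, f.shape[0]))
        scaled.append((f - mu) / sigma)          # NaN gaps stay NaN
    return np.concatenate(scaled, axis=0), stats

def split_denormalised(stacked, stats):
    """Undo stack_zscored: cut along lat and restore the original units."""
    out, start = [], 0
    for mu, sigma, n_lat in stats:
        block = stacked[start:start + n_lat]
        out.append(block * sigma + mu)
        start += n_lat
    return out

rng = np.random.default_rng(0)
sst = 15 + 3 * rng.standard_normal((4, 5, 6))    # made-up SST-like values
chl = np.exp(rng.standard_normal((4, 5, 6)))     # made-up chlorophyll-like values
sst[0, 0, 0] = np.nan                            # one missing observation

stacked, stats = stack_zscored([sst, chl])
sst_back, chl_back = split_denormalised(stacked, stats)
```

In the joint path the decomposition would run on the stacked array between the two calls, so that the lon and time factors are shared by all variables.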

Classical 2-D DINEOF (truncated-SVD baseline)

For comparison, DINEOF implements the original 2-D method (Beckers & Rixen 2003): restrict to valid ocean pixels, reshape to a (space, time) matrix, and alternate truncated-SVD reconstruction with re-imputation of the gaps.

from torch_ieof import DINEOF

model = DINEOF(R=5, tensor_shape=shape, mask=ocean_mask)
model._fit(tensor)               # same NaN-for-missing convention as DINEOF3
filled = model.reconstructed_tensor

DINEOF exposes the same sklearn-style fit/predict/score API and predict_rank(k) helper as DINEOF3.
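
The alternation between truncated SVD and re-imputation can be sketched on a plain (space, time) matrix. This is a minimal NumPy illustration of the general idea, not the package's implementation: dineof_2d is a hypothetical name, centering and masking are omitted, and the stopping rule (RMS change of the imputed values) is an assumption.

```python
import numpy as np

def dineof_2d(matrix, rank, n_iter=200, tol=1e-8):
    """Sketch of iterative truncated-SVD gap filling on a (space, time) matrix."""
    missing = np.isnan(matrix)
    if not missing.any():
        return matrix.copy()
    filled = np.where(missing, 0.0, matrix)      # initial guess for the gaps
    prev = filled[missing].copy()
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        filled[missing] = low_rank[missing]      # re-impute the gaps only
        change = np.sqrt(np.mean((filled[missing] - prev) ** 2))
        prev = filled[missing].copy()
        if change < tol:
            break
    return filled

rng = np.random.default_rng(0)
truth = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 40))
gaps = rng.random(truth.shape) < 0.2
gappy = truth.copy()
gappy[gaps] = np.nan
recon = dineof_2d(gappy, rank=2)
```

Observed cells are never overwritten; only the gap values move from one iteration to the next, which is what distinguishes this scheme from a single truncated SVD of a zero-filled matrix.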

sklearn-style fit/predict

from torch_ieof import DINEOF3

# X: (N, 3) integer coords (lat_idx, lon_idx, t_idx)
# y: (N,) values (NaN allowed for missing points)
model = DINEOF3(R=3, tensor_shape=(n_lat, n_lon, n_time))
model.fit(X, y)
y_pred = model.predict(X_test)
score = model.score(X_test, y_test)   # negative NRMSE
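
The coordinate form maps onto the dense tensor form in a straightforward way, sketched below with NumPy. Both helpers are hypothetical and not taken from the package; in particular, normalising the RMSE by the standard deviation of the true values is an assumption, since the exact NRMSE definition is not stated here.

```python
import numpy as np

def coords_to_tensor(X, y, shape):
    """Scatter (N, 3) integer coordinates and (N,) values into a dense tensor."""
    tensor = np.full(shape, np.nan)              # unlisted cells count as missing
    tensor[X[:, 0], X[:, 1], X[:, 2]] = y
    return tensor

def neg_nrmse(y_true, y_pred):
    """Negative RMSE divided by std of y_true (normalisation is an assumption)."""
    ok = ~np.isnan(y_true)
    rmse = np.sqrt(np.mean((y_true[ok] - y_pred[ok]) ** 2))
    return -rmse / np.std(y_true[ok])

X = np.array([[0, 0, 0], [1, 2, 3], [1, 0, 2]])
y = np.array([1.0, 2.0, 4.0])
tensor = coords_to_tensor(X, y, shape=(2, 3, 4))
```

Returning the negated error follows the scikit-learn convention that a larger score is better, so the estimator can be used with model-selection tools that maximise score.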

Key options:

  • R: rank. Int (CP rank, or broadcast Tucker rank) or per-mode tuple for Tucker variants.
  • decomp_type: "parafac" (default), "hooi", or "trunchosvd".
  • mask: (n_lat, n_lon) boolean array (or .npy path). True = inside the investigated area. Cells outside are zeroed during fitting and set to NaN in the output.
  • to_center / lat_lon_sep_centering: centre the tensor before fitting.
  • keep_non_negative_only: clamp negative reconstructed values to 0.
  • early_stopping: stop on absolute or gradient convergence (the paper's early-stopping mode); else stop on absolute error only.
  • nitemax, toliter: outer reconstruction loop budget / tolerance.
  • td_iter_max, tol: inner decomposition (CP-ALS / HOOI) budget / tolerance.

Performance, logging & progress

The hot loop runs in PyTorch. By default the device is cuda if available and cpu otherwise. MPS is never selected automatically, because per-iteration CPU fallbacks for SVD / pinv make it slower than the CPU for typical CP ranks. Pass device="mps" to opt in for very large tensors, or dtype=torch.float64 for tighter numerics; the default dtype is float32.
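
The selection order described above amounts to a few lines of logic. The sketch below is a hypothetical pure-Python rendering, not the package's code: pick_device is an invented name, and the availability flags are passed in as arguments so the example runs without PyTorch. In real code they would typically come from torch.cuda.is_available() and torch.backends.mps.is_available().

```python
def pick_device(cuda_available, mps_available, override=None):
    """Return a device string following the default order described above."""
    if override is not None:
        return override                # e.g. device="mps" requested explicitly
    if cuda_available:
        return "cuda"
    # MPS is deliberately ignored here, even when mps_available is True.
    return "cpu"

default_on_mac = pick_device(cuda_available=False, mps_available=True)
forced_mps = pick_device(cuda_available=False, mps_available=True, override="mps")
```

Keeping MPS opt-in means Apple-silicon users get the CPU path unless they ask for the GPU explicitly.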

Each fit logs a single INFO line at the start (shape, R, device, dtype, missing fraction) and one at the end (iters, final error). Iteration progress is shown via a live tqdm bar with err and Δerr in the postfix. Disable the bar with progress=False if you're piping output to a file or running in CI:

DINEOF3(R=4, tensor_shape=shape, progress=False)
reconstruct_dataarray(da, R=4, progress=False)

To suppress the INFO logs too:

from loguru import logger
logger.disable("torch_ieof")

Tests

pytest -q

Covers Kolda-convention unfolding, Khatri-Rao, CP reconstruction, end-to-end recovery of a noisy synthetic rank-R tensor, and a cloud-like patchy-mask case with a moving spatial feature.
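
For readers unfamiliar with the first two items, here is a NumPy sketch of Kolda-convention unfolding and the Khatri-Rao product, together with the identity that links them to a CP model. It is written independently of the package's test suite; unfold and khatri_rao are illustrative names.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding in Kolda & Bader ordering (earlier remaining modes vary fastest)."""
    moved = np.moveaxis(tensor, mode, 0)
    return np.reshape(moved, (tensor.shape[mode], -1), order="F")

def khatri_rao(P, Q):
    """Column-wise Kronecker product: column r equals kron(P[:, r], Q[:, r])."""
    return np.einsum("pr,qr->pqr", P, Q).reshape(-1, P.shape[1])

rng = np.random.default_rng(0)
A, B, C = (rng.standard_normal((n, 3)) for n in (4, 5, 6))
X = np.einsum("ir,jr,kr->ijk", A, B, C)          # rank-3 CP tensor

# For a CP tensor, each unfolding factorises through a Khatri-Rao product.
ok_mode0 = np.allclose(unfold(X, 0), A @ khatri_rao(C, B).T)
ok_mode1 = np.allclose(unfold(X, 1), B @ khatri_rao(C, A).T)
ok_mode2 = np.allclose(unfold(X, 2), C @ khatri_rao(B, A).T)
```

These three identities are what make alternating least squares possible: with two factors fixed, each unfolding turns the update of the third into an ordinary linear least-squares problem.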

Examples

Install the extra dependencies (matplotlib, xarray) and run:

pip install -e ".[examples]"
python examples/timeseries_demo.py   # SST-like field, cloud-blob gaps, saves PNG
python examples/xarray_workflow.py   # round-trip via xarray.DataArray

examples/xarray_workflow.py includes a small reconstruct_dataarray() helper showing how to wrap DINEOF3 for an xarray.DataArray with arbitrary dim order — useful as a template for plugging into an existing oceanographic workflow.

Credit, citing & license

The TIEOF method and its original implementation are the work of Kulikov, Inkova, Cherniuk, Teslyuk & Namsaraev. This package is a PyTorch repackaging; all scientific credit belongs to them. If you use this in academic work, please cite the original paper:

Kulikov, L.; Inkova, N.; Cherniuk, D.; Teslyuk, A.; Namsaraev, Z. TIEOF: Algorithm for Recovery of Missing Multidimensional Satellite Data on Water Bodies Based on Higher-Order Tensor Decompositions. Water 2021, 13(18), 2578. https://doi.org/10.3390/w13182578

Original code: https://github.com/theleokul/tieof.

Licensed under CC BY 4.0 — free to use, share and adapt with attribution to the authors above. See LICENSE.
