

DenoGrad: A Model-Agnostic Framework for Gradient-Based Data Refinement


DenoGrad is a novel, model-agnostic framework for gradient-based data refinement that leverages the representational knowledge and spectral bias of deep neural networks to correct corrupted observations. It operates within the Data-Centric AI paradigm, where the focus shifts from improving models to improving data.

Unlike supervised denoising approaches that require clean ground truth, DenoGrad performs input optimization: it freezes the weights of a pre-trained backbone model and iteratively backpropagates error corrections directly into the input space, guiding noisy samples toward regions consistent with the learned data manifold.

Paper: "DenoGrad: A Model-Agnostic Framework for Gradient-Based Data Refinement", by J. Javier Alonso-Ramos, Ignacio Aguilera-Martos, Andrés Herrera-Poyatos, and Francisco Herrera (University of Granada & DaSCI Institute).

Key Features

  • Model-Agnostic: Works with any differentiable PyTorch backbone (MLP, LSTM, xLSTM, CNN, CNN-LSTM, Transformers, TabPFN, DLinear, etc.).
  • No Clean Ground Truth Required: Self-supervised input optimization on the noisy dataset itself.
  • Dual Domain Support: Specialized handling for both Static Tabular data and Time-Series forecasting (via a Consensus Strategy).
  • Joint Feature-Target Optimization: Simultaneously refines input features $X$ and continuous targets $Y$ using jointly normalized gradients.
  • Manifold Preservation: Achieves state-of-the-art error reduction while maintaining the highest structural fidelity, evidenced by minimal Sliced Wasserstein Distance (SWD) and maximal feature correlation consistency ($\bar{\rho}$).
  • Dataset-Level Regularizer: Yields predictive improvements even on nominally clean datasets by mitigating latent aleatory noise.

Installation

DenoGrad is available on PyPI:

pip install denograd

Or install the latest version from source:

git clone https://github.com/ari-dasci/S-noise-gradient.git
cd S-noise-gradient
pip install .

Requirements: Python >= 3.6, PyTorch, NumPy, tqdm


Quick Start

DenoGrad integrates seamlessly into existing PyTorch pipelines. You need your (noisy) data and a model that has been trained on it.

Static Tabular Data

import torch
import torch.nn as nn
from denograd import DenoGrad

# 1. Define and train your model on the noisy data
model = nn.Sequential(
    nn.Linear(10, 64), nn.ReLU(),
    nn.Linear(64, 32), nn.ReLU(),
    nn.Linear(32, 1)
)
criterion = nn.MSELoss()
# ... train the model on X_noisy, y_noisy ...

# 2. Initialize DenoGrad (reuses the trained backbone)
denoiser = DenoGrad(model=model, criterion=criterion, device=torch.device('cuda'))

# 3. Fit and Transform
X_clean, y_clean, grad_x, grad_y = denoiser.fit_transform(
    X=X_noisy,          # numpy array (n_samples, n_features)
    y=y_noisy,          # numpy array (n_samples,) or (n_samples, n_targets)
    nrr=0.05,           # Noise Reduction Rate (η)
    nr_threshold=0.01,  # Gating threshold (τ)
    max_epochs=200
)

Time-Series Forecasting (Consensus Strategy)

For time-series data, DenoGrad employs a Consensus Strategy. Since a single time step $t$ participates in multiple overlapping sliding windows, DenoGrad accumulates the gradient from every window context and averages them to produce a single, temporally consistent update.

# 1. Initialize DenoGrad with a sequential model (e.g., LSTM)
denoiser = DenoGrad(model=lstm_model, criterion=nn.MSELoss())

# 2. Fit and Transform in Time-Series mode
X_clean, y_clean, _, _ = denoiser.fit_transform(
    X=X_ts_noisy,       # numpy array (total_timesteps, n_features)
    y=y_ts_noisy,       # numpy array (total_timesteps,)
    is_ts=True,          # Enable Time-Series mode
    window_size=24,      # Sliding window size (look-back period)
    future=1,            # Steps ahead the model predicts
    stride=1,            # Window stride
    nrr=0.01,
    nr_threshold=0.1,
    max_epochs=200
)

Pandas DataFrame Support

import pandas as pd

df = pd.DataFrame({"feat1": [...], "feat2": [...], "target": [...]})

X_clean, y_clean, _, _ = denoiser.fit_transform(
    X=df,
    y="target",          # Column name(s) to use as target
    nrr=0.05,
    max_epochs=100
)

How It Works

In standard training, gradients update model weights $\theta$ to minimize loss. DenoGrad inverts this: it freezes $\theta$ and treats the data instances themselves as the trainable parameters.

Core Update Rule

$$x' = x - \eta \cdot \frac{g_x}{\|[g_x, g_y]\|_2} \cdot \mathbb{I}_{\text{noisy}}, \qquad y' = y - \eta \cdot \frac{g_y}{\|[g_x, g_y]\|_2} \cdot \mathbb{I}_{\text{noisy}}$$

where $g_x = \nabla_x \mathcal{L}(f_\theta(x), y)$, $g_y = \nabla_y \mathcal{L}(f_\theta(x), y)$, and $\mathbb{I}_{\text{noisy}}$ is a binary gating mask.
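
As a concrete illustration, the update can be sketched in a few lines of PyTorch. This is a simplified, assumed implementation (the names x_noisy, y_noisy, eta, and tau are ours), not the library's internal code:

import torch

# One simplified refinement step. `model` is the frozen backbone and
# `criterion` the loss; `eta` is the noise reduction rate (nrr) and
# `tau` the gating threshold (nr_threshold). All names are illustrative.
x = x_noisy.clone().requires_grad_(True)     # (n_samples, n_features), float
y = y_noisy.clone().requires_grad_(True)     # (n_samples, 1), float

pred = model(x)
loss = criterion(pred, y)
g_x, g_y = torch.autograd.grad(loss, [x, y])

# Joint normalization: a single L2 norm over the concatenated gradients.
joint_norm = torch.cat([g_x.flatten(), g_y.flatten()]).norm(p=2)

with torch.no_grad():
    # Gating mask: instances already within tolerance tau receive no update.
    mask = ((pred - y).abs() > tau).float()      # (n_samples, 1)
    x_new = x - eta * (g_x / joint_norm) * mask  # mask broadcasts over features
    y_new = y - eta * (g_y / joint_norm) * mask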

Algorithm Components

  1. Input Optimization: Compute the gradient of the loss $\mathcal{L}$ with respect to the input features $X$ and targets $Y$ via backpropagation through the frozen model.

  2. Gating Mechanism: A threshold $\tau$ controls noise tolerance. Gradients are zeroed for any instance where $|f_\theta(x) - y| \leq \tau$, preserving high-confidence samples and preventing over-smoothing. This retained stochasticity acts as implicit regularization.

  3. Joint Normalization: Gradients for $X$ and $Y$ are concatenated and normalized by their joint $L_2$ norm. This ensures balanced corrections across all dimensions regardless of their scale.

  4. Consensus Strategy (Time-Series): For sequential data, gradient contributions from all overlapping windows covering time step $t$ are accumulated into global buffers $G_t$ with visit counters $C_t$. The final update is the averaged consensus direction:

$$x_t^{\text{new}} = x_t^{\text{old}} - \eta \cdot \frac{G_t}{C_t}$$
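
In code, the consensus step reduces to shared buffers indexed by time. Below is a minimal NumPy sketch under assumed names: window_gradients is a hypothetical iterable of (start_index, gradient) pairs with gradients of shape (window_size, n_features), and eta is the step size. It illustrates the idea, not the package internals:

import numpy as np

# Consensus accumulation over overlapping windows (illustrative).
G = np.zeros_like(X_ts)                  # gradient buffer G_t, one row per step
C = np.zeros(len(X_ts))                  # visit counter C_t per time step

for start, g in window_gradients:        # g has shape (window_size, n_features)
    G[start:start + window_size] += g    # accumulate this window's contribution
    C[start:start + window_size] += 1    # count windows covering each step

# Averaged consensus update; np.maximum guards steps no window touched.
X_ts = X_ts - eta * G / np.maximum(C, 1)[:, None]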

Theoretical Foundation: Spectral Bias

DenoGrad exploits the well-documented spectral bias of neural networks: DNNs inherently prioritize learning low-frequency patterns (the true signal) over high-frequency variations (noise) during SGD training. Even when trained on noisy data, a sufficiently regularized model captures the underlying data manifold. The gradients derived from this model therefore direct noisy instances toward this learned manifold.


API Reference

DenoGrad(model, criterion, device=None)

| Parameter | Type | Description |
|---|---|---|
| `model` | `nn.Module` | Pre-trained PyTorch model (weights will be frozen). |
| `criterion` | `nn.modules.loss._Loss` | Loss function (e.g., `nn.MSELoss()`). |
| `device` | `torch.device`, optional | Compute device. Auto-detects CUDA if available. |

The constructor automatically detects recurrent modules (RNN/LSTM/GRU) to set the appropriate mode, and identifies CNN architectures for proper dimension handling.
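
Such detection can be as simple as walking the module tree; the sketch below shows one plausible approach, not necessarily DenoGrad's exact logic:

import torch.nn as nn

def has_recurrent_module(model: nn.Module) -> bool:
    # True if any submodule is a recurrent layer (RNN/LSTM/GRU).
    return any(isinstance(m, (nn.RNN, nn.LSTM, nn.GRU)) for m in model.modules())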

.fit(X, y, is_ts=False, window_size=None, future=1, stride=1, flattening=False)

Configures the internal dataset strategy without running the denoising loop.

| Parameter | Type | Default | Description |
|---|---|---|---|
| `X` | array / Tensor / DataFrame | required | Input features. |
| `y` | array / Tensor / str / list | required | Targets. If `X` is a DataFrame, can be column name(s). |
| `is_ts` | bool | `False` | Enable Time-Series mode. |
| `window_size` | int | `None` | Sliding window size (required if `is_ts=True`). |
| `future` | int | `1` | Forecasting horizon (steps ahead). |
| `stride` | int | `1` | Stride between consecutive windows. |
| `flattening` | bool | `False` | Flatten windows into 1D vectors (useful for MLPs on TS data). |

Returns self for method chaining.

.transform(nrr=0.05, nr_threshold=0.01, max_epochs=100, denoise_y=True, batch_size=1000, save_gradients=True)

Executes the denoising optimization loop.

| Parameter | Type | Default | Description |
|---|---|---|---|
| `nrr` | float | `0.05` | Noise Reduction Rate ($\eta$). Step size for input corrections. |
| `nr_threshold` | float | `0.01` | Gating Threshold ($\tau$). Instances with error $\leq \tau$ are skipped. |
| `max_epochs` | int | `100` | Maximum optimization iterations. |
| `denoise_y` | bool | `True` | Whether to also refine the target variable $Y$. |
| `batch_size` | int | `1000` | Mini-batch size for the DataLoader. |
| `save_gradients` | bool | `True` | Store per-epoch gradients for analysis. |

Returns (X_denoised, y_denoised, grad_x_list, grad_y_list).
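
Because `.fit()` returns `self`, the two steps chain naturally; the following is equivalent to the `fit_transform` call from the Quick Start:

X_clean, y_clean, grad_x, grad_y = (
    denoiser.fit(X_noisy, y_noisy)
            .transform(nrr=0.05, nr_threshold=0.01, max_epochs=200)
)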

.fit_transform(X, y, ..., nrr=0.05, nr_threshold=0.01, max_epochs=100, ...)

Convenience method combining .fit() and .transform(). Accepts all parameters from both methods.

Hyperparameter Guidelines

Based on the empirical analysis in the paper:

| Parameter | Recommended Range | Notes |
|---|---|---|
| `nrr` ($\eta$) | 0.01 – 0.1 | Higher rates converge faster; peak performance within ~200 iterations. |
| `nr_threshold` ($\tau$) | 0.1 | Robust baseline; can be increased for larger aleatory margins. |
| `max_epochs` | 100 – 500 | Conservative rates (e.g., 0.001) require roughly 10x more iterations without matching performance. |
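
Putting these guidelines together, a reasonable starting configuration could look as follows (the values simply mirror the table above):

X_clean, y_clean, _, _ = denoiser.fit_transform(
    X=X_noisy, y=y_noisy,
    nrr=0.05,           # mid-range step size from the recommended 0.01-0.1
    nr_threshold=0.1,   # robust baseline reported in the paper
    max_epochs=200      # peak performance typically within ~200 iterations
)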

Experimental Results

DenoGrad was evaluated on 10 real-world datasets (5 tabular, 5 time-series) against 7 state-of-the-art denoising baselines (DAE, DN-ResNet, PCA, WTD, EMD, KF, MA) using diverse downstream regressors (Ridge, kNN, XGBoost, DNN, TabPFN, LSTM, xLSTM, CNN-LSTM, DLinear).

Key Results (Friedman + Nemenyi test, $\alpha = 0.05$)

| Metric | DenoGrad Avg. Rank | Best Competitor |
|---|---|---|
| Predictive Improvement (Imp%) | 3.10 | KF (1.50), but with severe manifold distortion |
| Sliced Wasserstein Distance (SWD ↓) | 1.70 | PCA (2.30) |
| Feature Correlation ($\bar{\rho}$ ↑) | 2.10 | DN-ResNet (1.90) |

DenoGrad uniquely occupies the optimal Pareto front: it achieves top-tier predictive gains while strictly preserving the topological integrity of the data. Methods that score higher in raw Imp% (e.g., KF at 98%+) do so at the cost of massive distributional distortion (SWD > 0.5, $\bar{\rho}$ < 0.3).

Highlights

  • ECL dataset: 98.4% average improvement across all downstream models.
  • Microsoft Stock: 97.6% improvement.
  • Time-Series: The only method maintaining >90% improvement consistently across LSTM, xLSTM, CNN-LSTM, DLinear, and XGBoost.

Datasets Used

| Dataset | Type | Instances | Features |
|---|---|---|---|
| House Prices | Tabular | 21,436 | 19 |
| Lattice Physics | Tabular | 24,000 | 40 |
| Parkinsons | Tabular | 5,875 | 20 |
| RT-IoT 2022 | Tabular | 117,915 | 82 |
| Support2 | Tabular | 8,579 | 33 |
| Daily Climate | Time-Series | 1,576 | 4 |
| ECL | Time-Series | 6,000 | 320 |
| ETT | Time-Series | 17,420 | 7 |
| Microsoft Stock | Time-Series | 2,192 | 5 |
| WTH | Time-Series | 35,064 | 12 |

Citation

If you use DenoGrad in your research, please cite our paper:

@article{alonso2025denograd,
  title={DenoGrad: A Model-Agnostic Framework for Gradient-Based Data Refinement},
  author={Alonso-Ramos, J. Javier and Aguilera-Martos, Ignacio and Herrera-Poyatos, Andr{\'e}s and Herrera, Francisco},
  year={2025}
}

Acknowledgments

This work was supported by the University of Granada and the Andalusian Institute of Data Science and Computational Intelligence (DaSCI). It is part of the project "Ethical, Responsible and General Purpose Artificial Intelligence" (IAFER), funded by the European Union's NextGenerationEU programme.


License

This project is licensed under the GNU Affero General Public License v3 — see the LICENSE file for details.
