
An instance-level noise-reduction framework based on the gradients of a Deep Learning model, agnostic to the network architecture.

Project description

DenoGrad: A Model-Agnostic Framework for Gradient-Based Data Refinement


DenoGrad is a novel, model-agnostic framework designed to reduce noise in both input features and target variables by leveraging the gradients of a pre-trained Deep Learning model.

In the Data-Centric AI paradigm, traditional denoising often compromises data integrity by aggressively smoothing features. DenoGrad avoids this by exploiting the semantic spectral bias of neural networks: instead of requiring clean ground-truth data, it freezes the weights of your predictive backbone and iteratively backpropagates error corrections into the input space, shifting noisy instances toward the learned data manifold.

Key Capabilities

  • Model-Agnostic: Works with any differentiable PyTorch model (MLP, LSTM, CNN, Transformers, TabPFN, etc.).
  • No Clean Ground Truth Required: Operates via self-supervised input optimization on the noisy dataset itself.
  • Dual Domain Support: Specialized handling for both Static Tabular data and Time-Series (via a Consensus Strategy).
  • Manifold Preservation: Achieves state-of-the-art error reduction while maintaining high structural fidelity (minimal $D_{KL}$ and high feature correlation).

📦 Installation

DenoGrad is available on PyPI and can be installed via pip:

pip install denograd

Alternatively, you can install the latest version from the source:

git clone https://github.com/JJavier98/DenoGrad.git
cd DenoGrad
pip install -r requirements.txt

Requirements:

  • Python >= 3.8
  • PyTorch
  • NumPy
  • tqdm

🚀 Quick Start

DenoGrad integrates seamlessly into existing PyTorch pipelines. You simply need your noisy data and a model that has been trained (or partially trained) on it.

1. Static Tabular Data Example

import torch
import torch.nn as nn
from denograd import DenoGrad

# 1. Define your model and data
# The model should be pre-trained on the noisy data (or a similar distribution)
model = nn.Sequential(
    nn.Linear(10, 32),
    nn.ReLU(),
    nn.Linear(32, 1)
)
criterion = nn.MSELoss()

# Assume X_noisy and y_noisy are your numpy arrays
# model.load_state_dict(...) 

# 2. Initialize DenoGrad
denoiser = DenoGrad(model=model, criterion=criterion, device=torch.device('cuda'))

# 3. Fit and Transform
# nrr: Noise Reduction Rate (learning rate for the input)
# nr_threshold: Gating mechanism (don't correct if error < threshold)
X_clean, y_clean, grad_x, grad_y = denoiser.fit_transform(
    X=X_noisy, 
    y=y_noisy,
    nrr=0.05,           
    nr_threshold=0.01,  
    max_epochs=100
)

print("Denoising complete!")

2. Time-Series Example (Consensus Strategy)

For time-series data, DenoGrad employs a Consensus Strategy. Since a single time step $t$ appears in multiple sliding windows, DenoGrad accumulates gradients from all contexts and averages them to ensure temporal consistency.

# 1. Initialize DenoGrad with a recurrent model (e.g., LSTM)
denoiser = DenoGrad(model=lstm_model, criterion=criterion)

# 2. Fit and Transform with Time-Series parameters
X_clean, y_clean, _, _ = denoiser.fit_transform(
    X=X_ts_noisy, 
    y=y_ts_noisy,
    is_ts=True,          # Enable Time-Series mode
    window_size=24,      # Size of the look-back window used by the model
    stride=1,
    future=1,            # Steps ahead the model predicts
    nrr=0.01,
    max_epochs=50
)
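The consensus averaging behind `is_ts=True` can be sketched in a few lines of plain PyTorch. This is an illustrative reconstruction, not DenoGrad's internals: the toy linear forecaster, the stride of 1, and the update step are assumptions chosen for brevity.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# A noisy series of length T and a toy one-step-ahead forecaster
# (stand-in for a pre-trained backbone).
T, window_size = 30, 5
series = torch.sin(torch.linspace(0, 6, T)) + 0.3 * torch.randn(T)
model = nn.Linear(window_size, 1)
criterion = nn.MSELoss()

# Accumulate input gradients from every sliding window covering each step t.
grad_sum = torch.zeros(T)
counts = torch.zeros(T)
for start in range(T - window_size):  # stride = 1
    x = series[start:start + window_size].clone().requires_grad_(True)
    target = series[start + window_size].unsqueeze(0)  # future = 1
    loss = criterion(model(x), target)
    loss.backward()
    grad_sum[start:start + window_size] += x.grad
    counts[start:start + window_size] += 1

# Consensus direction: average the gradients over all covering windows,
# then take one correction step (nrr = 0.05).
consensus = grad_sum / counts.clamp(min=1)
denoised = series - 0.05 * consensus

print(denoised.shape)  # torch.Size([30])
```

Because interior time steps are covered by up to `window_size` windows while boundary steps are covered by fewer, averaging by the per-step count keeps the correction magnitude comparable across the sequence.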

🧠 How It Works

Traditional training updates weights ($\theta$) to minimize loss. DenoGrad inverts this process: it freezes $\theta$ and updates the input ($x$).

$$x_{new} \leftarrow x - \eta \cdot \nabla_x \mathcal{L}(f_\theta(x), y)$$

  1. Input Optimization: The framework calculates the gradient of the loss with respect to the input features and targets.

  2. Gating Mechanism: To prevent over-smoothing, DenoGrad only updates instances where the prediction error exceeds a user-defined threshold $\tau$ (an aleatoric noise margin).

  3. Joint Normalization: Gradients for features and targets are normalized jointly to ensure balanced corrections across dimensions.

  4. Consensus Strategy (Time-Series): For sequential data, gradients are accumulated across all sliding windows covering a time step $t$, and the final update is the average "consensus" direction.
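Steps 1 and 2 above can be sketched in plain PyTorch. This is a minimal illustration under assumed shapes and hyperparameters, not DenoGrad's implementation; the joint normalization of step 3 is omitted for brevity.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Frozen pre-trained backbone: the weights (theta) stay fixed.
model = nn.Sequential(nn.Linear(3, 8), nn.ReLU(), nn.Linear(8, 1))
for p in model.parameters():
    p.requires_grad_(False)
criterion = nn.MSELoss(reduction="none")

x = torch.randn(16, 3, requires_grad=True)  # noisy inputs
y = torch.randn(16, 1)                      # noisy targets
nrr, tau = 0.05, 0.01  # step size (eta) and gating threshold

# Step 1: gradient of the loss with respect to the input.
pred = model(x)
per_instance_loss = criterion(pred, y).squeeze(1)
per_instance_loss.sum().backward()

# Step 2: gating -- only instances whose error exceeds tau are corrected.
gate = (torch.abs(pred - y).squeeze(1) > tau).float().unsqueeze(1)
with torch.no_grad():
    x_new = x - nrr * gate * x.grad  # x_new <- x - eta * grad_x L

print(x_new.shape)  # torch.Size([16, 3])
```

Repeating this update for several epochs moves the gated instances along the loss surface of the frozen model, which is what shifts them toward the learned manifold.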


🔧 API Reference

DenoGrad Class

__init__(model, criterion, device=None)

  • model: The pre-trained PyTorch model (nn.Module).
  • criterion: The loss function (e.g., nn.MSELoss).
  • device: The computing device ('cpu' or 'cuda').

fit_transform(X, y, ...)

Configures the dataset strategy and executes the denoising loop.

General Parameters:

  • X, y: Input data (Numpy array, Torch Tensor, or Pandas DataFrame).

  • nrr (float, default=0.05): Noise Reduction Rate. Controls the step size of the correction ($\eta$).

  • nr_threshold (float, default=0.01): Noise Tolerance. Corrections are zeroed out if $|y_{pred} - y_{true}| \le \tau$.

  • max_epochs (int): Maximum number of optimization iterations.

  • denoise_y (bool, default=True): Whether to also refine the target variable.

Time-Series Specific Parameters:

  • is_ts (bool): Set to True for sequence data.
  • window_size (int): The input sequence length expected by the model.
  • future (int): The forecasting horizon (default 1).
  • flattening (bool): If True, flattens each window into a single vector (useful for MLP backbones on time-series data).

📄 Citation

If you use DenoGrad in your research, please cite our paper:

Citation pending: the paper is currently under revision.

👥 Acknowledgments

This work was supported by the University of Granada and the Andalusian Institute of Data Science and Computational Intelligence (DaSCI). It is part of the Project "Ethical, Responsible and General Purpose Artificial Intelligence" (IAFER) funded by the European Union Next Generation EU.


📝 License

This project is licensed under the MIT License - see the LICENSE file for details.


Source Distribution

denograd-1.0.2.tar.gz (22.4 kB)

Built Distribution

denograd-1.0.2-py3-none-any.whl (22.1 kB)

