
A PyTorch implementation of Denoising Diffusion Probabilistic Models

Project description


ProbabilisticDiffusion

This is a PyTorch implementation of the training algorithm found in Denoising Diffusion Probabilistic Models.

Specifically, we implement the training procedure given in Algorithm 1 of the paper, where $\epsilon_\theta$ represents the user-defined model with learnable parameters $\theta$.
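
Concretely, each iteration of that procedure samples a data point $x_0$, a timestep $t \sim \mathrm{Uniform}(\{1, \dots, T\})$, and noise $\epsilon \sim \mathcal{N}(0, I)$, then takes a gradient step on the noise-prediction objective (restated here from the paper for reference):

$$
\nabla_\theta \left\| \epsilon - \epsilon_\theta\left(\sqrt{\bar\alpha_t}\, x_0 + \sqrt{1 - \bar\alpha_t}\, \epsilon,\; t\right) \right\|^2,
\qquad \bar\alpha_t = \prod_{s=1}^{t} (1 - \beta_s).
$$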

Installation

pip install:

pip install ProbabilisticDiffusion

If needed, the sdist and bdist distributions can be found in the GitHub repo.

Usage

The data we use for the examples below is a set of randomly generated points lying on a circle of radius 2, with i.i.d. Gaussian noise (standard deviation 0.3) added to the x and y coordinates:


The Jupyter Notebook with this example can be found on GitHub here.
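
As a rough illustration, data of this kind could be generated as follows (a minimal sketch; the exact code used in the notebook above may differ):

import math
import torch

n = 1000                                   # number of points (illustrative choice)
angles = 2 * math.pi * torch.rand(n)       # random angles around the circle
data = 2 * torch.stack([torch.cos(angles), torch.sin(angles)], dim=1)  # points on a radius-2 circle
data = data + 0.3 * torch.randn(n, 2)      # add i.i.d. Gaussian noise (SD 0.3) to x and y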

Defining Model

Defining $\epsilon_\theta$

Below we define the model whose parameters we would like to learn. In this case we use a simple architecture combining PyTorch Linear layers with Softplus activations, plus an embedding that accounts for the timestep, $t$, which is also passed to the model (as the y argument of forward).

import torch.nn as nn
import torch.nn.functional as F


class ConditionalLinear(nn.Module):
    def __init__(self, num_in, num_out, n_steps):
        super(ConditionalLinear, self).__init__()
        self.num_out = num_out
        self.lin = nn.Linear(num_in, num_out)
        self.embed = nn.Embedding(n_steps, num_out)
        self.embed.weight.data.uniform_()

    def forward(self, x, y):
        out = self.lin(x)
        gamma = self.embed(y)
        out = gamma.view(-1, self.num_out) * out
        return out


class ConditionalModel(nn.Module):
    def __init__(self, n_steps):
        super(ConditionalModel, self).__init__()
        self.lin1 = ConditionalLinear(2, 128, n_steps)
        self.lin2 = ConditionalLinear(128, 128, n_steps)
        self.lin3 = nn.Linear(128, 2)

    def forward(self, x, y):
        x = F.softplus(self.lin1(x, y))
        x = F.softplus(self.lin2(x, y))
        return self.lin3(x)

As of version 2.1.0 we can also specify models that are conditioned on additional variables not present in the final output generation, i.e. generating $x_t$ from $p(x_t|c)$ instead of just $p(x_t)$. In the model, this change amounts to an extra input argument in the forward method of ConditionalModel:

class ConditionalModel(nn.Module):
    def __init__(self, n_steps):
        super(ConditionalModel, self).__init__()
        self.lin1 = ConditionalLinear(2 + shape_of_c, 128, n_steps)  # Make sure to add dimension of c to input dim
        ...

    def forward(self, x, c, y):
        x = torch.cat([x, c], 1)  # Here, we just concatenate the conditional variable
        x = F.softplus(self.lin1(x, y))
        ...

Defining the Diffusion-Based Learning Model

We define our diffusion-based learning model with 200 timesteps, MSE loss (the original algorithm specifies SSE, but we found MSE works as well), beta start and end values of 1e-5 and 1e-2 respectively with a linear schedule, and the Adam optimizer with a learning rate of 1e-3.

from ProbabilisticDiffusion import Diffusion
import torch

n_steps = 200
model = ConditionalModel(n_steps)
# MSE loss adheres to the gradient step procedure defined above
loss = torch.nn.MSELoss(reduction='mean')
# Adam optimizer for learning
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# 1e-5 and 1e-2 are the beta start and end values
diffusion = Diffusion(data, n_steps, 1e-5, 1e-2, 'linear', model, loss, optimizer)
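
For intuition, a 'linear' schedule spaces the beta values evenly between the start and end values; conceptually it is equivalent to something like the following (a sketch of the idea, not the package's internal code):

betas = torch.linspace(1e-5, 1e-2, n_steps)  # 200 evenly spaced noise-schedule values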

In version 2.1.0 we can now also include conditional data in the construction of Diffusion:

diffusion = Diffusion(data, n_steps, 1e-5, 1e-2, 'linear', model, loss,
                      optimizer, conditional_data)  # Note the additional conditional_data argument after optimizer

Forward Sampling

This allows us to see the forward diffusion process and ensure that our n_steps parameter is large enough. We want to see the data morph into standard-Gaussian-distributed points by the last timestep.

import scipy.stats as stats
import matplotlib.pyplot as plt

noised = diffusion.forward(199, s=5)  # data noised up to the final timestep (t = 199)
stats.probplot(noised[:, 0], dist="norm", plot=plt)  # normal probability plot of the first coordinate
plt.show()


Training

We train with a batch size of 1,000 for 10,000 epochs.

diffusion.train(1000, 10000)


Sampling New Data

We can sample new data based on the learned model via the following method:

new_x = diffusion.sample(1000, 50, keep="last", s=3)

If using the conditional model capabilities added in version 2.1.0, make sure to pass in the conditional data you would like to sample on. This data must have length equal to the number of samples you wish to generate, n, and the same dimensionality as the conditional data passed in when constructing the Diffusion model:

# the input to conditional_data in this example should have a length of 1000 since we set the new sample size to 1000
new_x = diffusion.sample(1000, 50, keep="last", conditional_data=c, s=3)


This method generates 1,000 new samples, plotting them at an interval of 50 timesteps. In addition, we can specify which samples to keep: 'last' keeps only the final timestep of samples, 'all' keeps all timesteps, and for finer control one can pass a tuple of integers corresponding to the desired timesteps to keep, as in the sketch below.
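
For example, assuming the same Diffusion instance as above (the specific timesteps are illustrative):

x_all = diffusion.sample(1000, 50, keep="all", s=3)           # keep samples from every timestep
x_some = diffusion.sample(1000, 50, keep=(0, 100, 199), s=3)  # keep only timesteps 0, 100, and 199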
