
A PyTorch implementation of DropConnect layers


Archived

This implementation has issues. I will be creating an installable package with a better implementation and documentation soon.

DropConnect

The paper Regularization of Neural Networks using DropConnect (Wan et al., 2013) introduces a regularization technique similar to Dropout, but instead of dropping out individual units, it drops out individual connections between units. This is done by applying a mask to the weights of the network, sampled from a Bernoulli distribution.
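
For intuition, here is a minimal sketch in plain PyTorch (independent of this package) of sampling such a mask and applying it to a weight matrix:

import torch

# Sample a Bernoulli mask with keep probability 1 - p and drop connections in W.
p = 0.5
W = torch.randn(10, 5)                          # out_features x in_features
M = torch.bernoulli(torch.full_like(W, 1 - p))  # 1 keeps a connection, 0 drops it
W_masked = M * W                                # Hadamard product drops connections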


Installing

pip install dropconnect

Usage

from torch import Tensor
from dropconnect import Dropconnect

# A drop-in replacement for a linear layer, with connection drop probability p.
layer = Dropconnect(in_features=5, out_features=10, bias=True, p=0.5)
x = Tensor([[1, 2, 3, 4, 5], [2, 3, 4, 5, 6]])  # batch of two examples, five features each
output = layer(x)
print(output)
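
Assuming Dropconnect subclasses torch.nn.Module, as the drop-in claim suggests, it can be composed like any other layer; a hypothetical sketch:

import torch.nn as nn
from dropconnect import Dropconnect

# Hypothetical composition: the layer is used where nn.Linear would normally go.
model = nn.Sequential(
    Dropconnect(in_features=5, out_features=10, bias=True, p=0.5),
    nn.ReLU(),
    nn.Linear(10, 1),
)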

Training

Let $X \in \mathbb{R}^{n \times d}$ be a tensor with $n$ examples and $d$ features, and let $W \in \mathbb{R}^{l \times d}$ be a weight tensor with $l$ output features.

For training, a mask matrix $M$ is sampled from a Bernoulli distribution and applied to the weight matrix $W$ via the Hadamard product, dropping individual connections rather than turning off whole neurons as in Dropout.

For a single example, the implementation is straightforward: just apply a mask $M$ to the weight tensor $W$. However, according to the paper: "A key component to successfully training with DropConnect is the selection of a different mask for each training example. Selecting a single mask for a subset of training examples, such as a mini-batch of 128 examples, does not regularize the model enough in practice."

Therefore, a mask tensor $M \in \mathbb{R}^{n \times l \times d}$ must be chosen (one $l \times d$ mask per example), and the linear layer with DropConnect should be implemented as:

$$
\text{DropConnect}(X, W, M) =
\begin{bmatrix}
\dfrac{1}{1-p}
\begin{bmatrix} x^1{}_1 & x^1{}_2 & \cdots & x^1{}_d \end{bmatrix}
\left(
\begin{bmatrix}
m^{11}{}_1 & m^{11}{}_2 & \cdots & m^{11}{}_l \\
m^{12}{}_1 & m^{12}{}_2 & \cdots & m^{12}{}_l \\
\vdots & \vdots & \ddots & \vdots \\
m^{1d}{}_1 & m^{1d}{}_2 & \cdots & m^{1d}{}_l
\end{bmatrix}
\odot
\begin{bmatrix}
w^1{}_1 & w^1{}_2 & \cdots & w^1{}_l \\
w^2{}_1 & w^2{}_2 & \cdots & w^2{}_l \\
\vdots & \vdots & \ddots & \vdots \\
w^d{}_1 & w^d{}_2 & \cdots & w^d{}_l
\end{bmatrix}
\right) \\[2ex]
\dfrac{1}{1-p}
\begin{bmatrix} x^2{}_1 & x^2{}_2 & \cdots & x^2{}_d \end{bmatrix}
\left(
\begin{bmatrix}
m^{21}{}_1 & m^{21}{}_2 & \cdots & m^{21}{}_l \\
m^{22}{}_1 & m^{22}{}_2 & \cdots & m^{22}{}_l \\
\vdots & \vdots & \ddots & \vdots \\
m^{2d}{}_1 & m^{2d}{}_2 & \cdots & m^{2d}{}_l
\end{bmatrix}
\odot
\begin{bmatrix}
w^1{}_1 & w^1{}_2 & \cdots & w^1{}_l \\
w^2{}_1 & w^2{}_2 & \cdots & w^2{}_l \\
\vdots & \vdots & \ddots & \vdots \\
w^d{}_1 & w^d{}_2 & \cdots & w^d{}_l
\end{bmatrix}
\right) \\[2ex]
\vdots \\[2ex]
\dfrac{1}{1-p}
\begin{bmatrix} x^n{}_1 & x^n{}_2 & \cdots & x^n{}_d \end{bmatrix}
\left(
\begin{bmatrix}
m^{n1}{}_1 & m^{n1}{}_2 & \cdots & m^{n1}{}_l \\
m^{n2}{}_1 & m^{n2}{}_2 & \cdots & m^{n2}{}_l \\
\vdots & \vdots & \ddots & \vdots \\
m^{nd}{}_1 & m^{nd}{}_2 & \cdots & m^{nd}{}_l
\end{bmatrix}
\odot
\begin{bmatrix}
w^1{}_1 & w^1{}_2 & \cdots & w^1{}_l \\
w^2{}_1 & w^2{}_2 & \cdots & w^2{}_l \\
\vdots & \vdots & \ddots & \vdots \\
w^d{}_1 & w^d{}_2 & \cdots & w^d{}_l
\end{bmatrix}
\right)
\end{bmatrix}
$$
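
As a sketch of the formula above in plain PyTorch (illustrative only, not necessarily this package's internals; the names X, W, M, and p follow the notation above), the per-example masks can be applied by broadcasting and contracting with einsum:

import torch

n, d, l, p = 2, 5, 10, 0.5
X = torch.randn(n, d)                              # n examples with d features
W = torch.randn(l, d)                              # weight tensor, l output features
M = torch.bernoulli(torch.full((n, l, d), 1 - p))  # one l x d mask per example

masked = M * W                                         # broadcast Hadamard product, shape (n, l, d)
out = torch.einsum('nd,nld->nl', X, masked) / (1 - p)  # row i is x^i applied to its own masked W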

Backpropagation

In order to update the weight matrix $W$ in a DropConnect layer, the mask must be applied to the gradient so that only the elements that were active in the forward pass are updated. This is already handled by automatic differentiation in PyTorch: if $J$ is the gradient coming from the linear operation, the gradient propagated through the Hadamard product with respect to $W$ is:

$$ J \odot M $$

So there is no need to implement a custom backward operation; the Hadamard product already provided by PyTorch is enough.
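
A quick sanity check of this claim, written as a sketch in plain autograd rather than with this package, confirms that the gradient reaching $W$ is exactly $J \odot M$:

import torch

torch.manual_seed(0)
W = torch.randn(3, 4, requires_grad=True)
M = torch.bernoulli(torch.full_like(W, 0.5))
X = torch.randn(2, 4)

out = X @ (M * W).t()                 # forward pass through the masked weights
out.sum().backward()

J = X.sum(dim=0).expand_as(W)         # gradient w.r.t. the masked weights when loss = out.sum()
assert torch.allclose(W.grad, M * J)  # autograd already delivers J ⊙ M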

