A PyTorch implementation of dropconnect layers
Project description
DropConnect
The paper *Regularization of Neural Networks using DropConnect* (Wan et al., 2013) introduces a regularization technique that is similar to Dropout, but instead of dropping out individual units, it drops out individual connections between units. This is done by applying a mask, sampled from a Bernoulli distribution, to the weights of the network.
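To illustrate the distinction (an illustrative sketch with assumed shapes, not the package's code): Dropout zeroes whole units, while DropConnect samples one Bernoulli draw per weight and zeroes individual connections.

```python
import torch

torch.manual_seed(0)
w = torch.ones(3, 4)                              # weight matrix
mask = torch.bernoulli(torch.full_like(w, 0.5))   # one Bernoulli(0.5) draw per weight
masked_w = mask * w                               # connections, not units, are dropped
print(masked_w)
```

Each entry of `masked_w` is independently kept or zeroed, so a unit's remaining connections still participate in the forward pass.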
Installing
```shell
pip install dropconnect
```
Usage
```python
from torch import Tensor

from dropconnect import Dropconnect

layer = Dropconnect(in_features=5, out_features=10, bias=True, p=0.5)
input = Tensor([[1, 2, 3, 4, 5], [2, 3, 4, 5, 6]])
output = layer(input)
print(output)  # Can be used as a drop-in replacement for a linear layer.
```
Training
Let $X \in \mathbb{R}^{n \times d}$ be a tensor with $n$ examples and $d$ features, and $W \in \mathbb{R}^{d \times l}$ a tensor of weights.
For training, a mask matrix $M$ is sampled from a Bernoulli distribution and applied to the weight matrix $W$ via the Hadamard product, dropping individual connections between neurons instead of turning off whole neurons as in Dropout.
For a single example, the implementation is straightforward: apply a mask $M$ to the weight tensor $W$. However, according to the paper: "A key component to successfully training with DropConnect is the selection of a different mask for each training example. Selecting a single mask for a subset of training examples, such as a mini-batch of 128 examples, does not regularize the model enough in practice."
Therefore, a mask tensor $M \in \mathbb{R}^{n \times d \times l}$ must be sampled, one $d \times l$ mask per example, so the linear layer with DropConnect is implemented as:
$$
\text{DropConnect}(X, W, M) =
\begin{bmatrix}
\frac{1}{1-p}
\begin{bmatrix} x^1{}_1 & x^1{}_2 & \cdots & x^1{}_d \end{bmatrix}
\left(
\begin{bmatrix}
m^{11}{}_1 & m^{11}{}_2 & \cdots & m^{11}{}_l \\
m^{12}{}_1 & m^{12}{}_2 & \cdots & m^{12}{}_l \\
\vdots & \vdots & \ddots & \vdots \\
m^{1d}{}_1 & m^{1d}{}_2 & \cdots & m^{1d}{}_l
\end{bmatrix}
\odot
\begin{bmatrix}
w^1{}_1 & w^1{}_2 & \cdots & w^1{}_l \\
w^2{}_1 & w^2{}_2 & \cdots & w^2{}_l \\
\vdots & \vdots & \ddots & \vdots \\
w^d{}_1 & w^d{}_2 & \cdots & w^d{}_l
\end{bmatrix}
\right) \\
\vdots \\
\frac{1}{1-p}
\begin{bmatrix} x^n{}_1 & x^n{}_2 & \cdots & x^n{}_d \end{bmatrix}
\left(
\begin{bmatrix}
m^{n1}{}_1 & m^{n1}{}_2 & \cdots & m^{n1}{}_l \\
m^{n2}{}_1 & m^{n2}{}_2 & \cdots & m^{n2}{}_l \\
\vdots & \vdots & \ddots & \vdots \\
m^{nd}{}_1 & m^{nd}{}_2 & \cdots & m^{nd}{}_l
\end{bmatrix}
\odot
\begin{bmatrix}
w^1{}_1 & w^1{}_2 & \cdots & w^1{}_l \\
w^2{}_1 & w^2{}_2 & \cdots & w^2{}_l \\
\vdots & \vdots & \ddots & \vdots \\
w^d{}_1 & w^d{}_2 & \cdots & w^d{}_l
\end{bmatrix}
\right)
\end{bmatrix}
$$
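A minimal sketch of this batched forward pass (illustrative, not the package's actual source; the function name and defaults are assumptions, while the shapes follow the definitions above):

```python
import torch

def dropconnect_linear(x, weight, bias=None, p=0.5, training=True):
    """x: (n, d), weight: (d, l) -> output: (n, l)."""
    if not training or p == 0.0:
        out = x @ weight
    else:
        n, d = x.shape
        l = weight.shape[1]
        # One Bernoulli(1 - p) mask per training example: M has shape (n, d, l).
        mask = torch.bernoulli(torch.full((n, d, l), 1.0 - p))
        # Batched matmul: (n, 1, d) @ (n, d, l) -> (n, 1, l), then squeeze;
        # the 1/(1-p) factor matches the scaling in the equation above.
        out = (x.unsqueeze(1) @ (mask * weight)).squeeze(1) / (1.0 - p)
    if bias is not None:
        out = out + bias
    return out

# Example: a batch of 2 examples with 5 features mapped to 3 outputs.
x = torch.randn(2, 5)
w = torch.randn(5, 3)
print(dropconnect_linear(x, w, p=0.5).shape)  # torch.Size([2, 3])
```

Broadcasting `weight` against the per-example masks avoids materializing $n$ copies of $W$ explicitly; the mask tensor itself is still $n \times d \times l$, which is the memory cost the paper's per-example masking implies.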
Backpropagation
To update the weight matrix $W$ in a DropConnect layer, the mask must be applied to the gradient so that only those elements active in the forward pass are updated. This is already handled by automatic differentiation in PyTorch: if $J$ is the gradient coming from the linear operation, the gradient propagated through the Hadamard product with respect to $W$ is
$$ J \odot M $$
So there is no need to implement an additional backpropagation operation; the Hadamard product already provided by PyTorch is enough.
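A tiny check of this claim (an illustrative snippet with assumed values, not package code): for $y = \sum \left(x (M \odot W)\right)$, the gradient with respect to $w_{ij}$ is $x_i m_{ij}$, so entries dropped by the mask receive exactly zero gradient.

```python
import torch

x = torch.tensor([[1.0, 2.0]])
w = torch.ones(2, 2, requires_grad=True)
m = torch.tensor([[1.0, 0.0],
                  [0.0, 1.0]])   # a fixed Bernoulli sample for the check
y = (x @ (m * w)).sum()
y.backward()
# Gradient is zero exactly where the mask dropped a connection.
print(w.grad)  # tensor([[1., 0.], [0., 2.]])
```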
Project details
Release history
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file dropconnect-0.1.1.tar.gz.
File metadata
- Download URL: dropconnect-0.1.1.tar.gz
- Upload date:
- Size: 4.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.1.2 CPython/3.12.1 Linux/6.11.0-1012-azure
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `2e3ad71427066cb6fef207c448bacdec5fc96a2d30c1ac9d8a1ba3b501e1c611` |
| MD5 | `9c2d54601f34d475bf5b8bde324364fc` |
| BLAKE2b-256 | `82037eee30a0f9e4113ad83342d68867e87512ab44a7fa920df4928046e442e3` |
File details
Details for the file dropconnect-0.1.1-py3-none-any.whl.
File metadata
- Download URL: dropconnect-0.1.1-py3-none-any.whl
- Upload date:
- Size: 5.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.1.2 CPython/3.12.1 Linux/6.11.0-1012-azure
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `a9b0bab3e642b70e12b161986b8d8ae8876c8486a0c3c5226b9e69c8f86865af` |
| MD5 | `cd8f33a857a71a47d7699aa3b39fd38f` |
| BLAKE2b-256 | `e01c7324afb433cf4d9ef4294953f6a773037a9544152d790cc74c1b5bb717f6` |