1D, 2D, and 3D Sinusoidal Positional Encodings in PyTorch

Project description

1D, 2D, and 3D Sinusoidal Positional Encoding PyTorch

This is an implementation of 1D, 2D, and 3D sinusoidal positional encoding. It can encode tensors of the form (batchsize, x, ch), (batchsize, x, y, ch), and (batchsize, x, y, z, ch), where the positional encodings are added to the ch dimension. Attention Is All You Need defined positional encoding in only one dimension; this package extends it to 2 and 3 dimensions.

New: This also works on tensors of the form (batchsize, ch, x), etc. For inputs of this type, include the word Permute before the number in the class name; e.g. for a 1D input of size (batchsize, ch, x), use PositionalEncodingPermute1D instead of PositionalEncoding1D.

To install, simply run:

pip install positional-encodings

Specifically, the positional encoding is computed as follows:

1D:

PE(x,2i) = sin(x/10000^(2i/D))
PE(x,2i+1) = cos(x/10000^(2i/D))

Where:
x is a point in 1d space
i is in [0, D/2), where D is the size of the ch dimension
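
As a rough illustration of the 1D formula, here is a minimal sketch in plain Python. The function name sinusoidal_1d is hypothetical and not part of the package, nested lists stand in for tensors, and D is assumed to be even. It follows the formula exactly as written above, so its channel ordering may differ from what the package produces internally.

```python
import math

def sinusoidal_1d(length, channels):
    """Return a length x channels table following the 1D formula.

    Hypothetical helper for illustration; assumes `channels` (D) is even.
    """
    table = [[0.0] * channels for _ in range(length)]
    for x in range(length):
        for i in range(channels // 2):
            angle = x / (10000 ** (2 * i / channels))
            table[x][2 * i] = math.sin(angle)      # PE(x, 2i)
            table[x][2 * i + 1] = math.cos(angle)  # PE(x, 2i+1)
    return table

pe = sinusoidal_1d(length=6, channels=10)
print(len(pe), len(pe[0]))  # 6 10
```

At x = 0 every sine channel is 0 and every cosine channel is 1, which is a quick sanity check on any implementation of this formula.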

2D:

PE(x,y,2i) = sin(x/10000^(4i/D))
PE(x,y,2i+1) = cos(x/10000^(4i/D))
PE(x,y,2j+D/2) = sin(y/10000^(4j/D))
PE(x,y,2j+1+D/2) = cos(y/10000^(4j/D))

Where:
(x,y) is a point in 2d space
i,j are in [0, D/4), where D is the size of the ch dimension
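
The 2D formula can be sketched the same way for a single grid point. This is only an illustration under stated assumptions: sinusoidal_2d_at is a hypothetical name, D is assumed to be divisible by 4, and the layout follows the formula as written (x in the first half of the channels, y in the second half), which is not necessarily the package's internal ordering.

```python
import math

def sinusoidal_2d_at(x, y, channels):
    """Encoding vector at grid point (x, y), per the 2D formula.

    Hypothetical helper for illustration; assumes `channels` (D) is
    divisible by 4.
    """
    half = channels // 2
    vec = [0.0] * channels
    for i in range(channels // 4):
        freq = 10000 ** (4 * i / channels)
        vec[2 * i] = math.sin(x / freq)             # PE(x, y, 2i)
        vec[2 * i + 1] = math.cos(x / freq)         # PE(x, y, 2i+1)
        vec[2 * i + half] = math.sin(y / freq)      # PE(x, y, 2j+D/2)
        vec[2 * i + 1 + half] = math.cos(y / freq)  # PE(x, y, 2j+1+D/2)
    return vec

v = sinusoidal_2d_at(x=3, y=0, channels=8)
print(v[4:])  # y = 0, so the y half is [0.0, 1.0, 0.0, 1.0]
```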

3D:

PE(x,y,z,2i) = sin(x/10000^(6i/D))
PE(x,y,z,2i+1) = cos(x/10000^(6i/D))
PE(x,y,z,2j+D/3) = sin(y/10000^(6j/D))
PE(x,y,z,2j+1+D/3) = cos(y/10000^(6j/D))
PE(x,y,z,2k+2D/3) = sin(z/10000^(6k/D))
PE(x,y,z,2k+1+2D/3) = cos(z/10000^(6k/D))

Where:
(x,y,z) is a point in 3d space
i,j,k are in [0, D/6), where D is the size of the ch dimension

This is just a natural extension of the 2D positional encoding used in this paper (see the 2D entry under Citations).
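
To make the 3D channel layout concrete, the sketch below builds the vector for one point by concatenating one block per axis. The names axis_block and sinusoidal_3d_at are hypothetical, D is assumed to be divisible by 6, and the ordering follows the formula as written rather than any particular implementation.

```python
import math

def axis_block(pos, block_size, channels, axes):
    """sin/cos pairs for one axis.

    The exponent 2 * axes * i / channels reduces to 6i/D when axes = 3.
    """
    block = []
    for i in range(block_size // 2):
        angle = pos / (10000 ** (2 * axes * i / channels))
        block += [math.sin(angle), math.cos(angle)]
    return block

def sinusoidal_3d_at(x, y, z, channels):
    """Encoding vector at (x, y, z); assumes `channels` (D) is divisible by 6."""
    third = channels // 3
    # x occupies channels [0, D/3), y [D/3, 2D/3), z [2D/3, D).
    return [v for pos in (x, y, z)
            for v in axis_block(pos, third, channels, axes=3)]

v = sinusoidal_3d_at(1, 2, 3, channels=12)
print(len(v))  # 12
```

With D = 12, channels 0 to 3 depend only on x, channels 4 to 7 only on y, and channels 8 to 11 only on z.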

Don't worry if the input is not divisible by 2 (1D), 4 (2D), or 6 (3D); all the necessary padding will be taken care of.
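
One plausible way to handle such sizes, shown here only as a sketch, is to round the channel count up to the next multiple of 2, 4, or 6, compute the encoding at that size, and keep the first ch channels. The helper padded_channels is hypothetical, and the package's actual padding rule may differ.

```python
import math

def padded_channels(channels, num_axes):
    """Smallest multiple of 2 * num_axes that is >= channels.

    Hypothetical helper: each axis needs an even number of channels
    (sin/cos pairs), so the total must be a multiple of 2 * num_axes.
    """
    step = 2 * num_axes
    return math.ceil(channels / step) * step

for ch, axes in [(10, 1), (8, 2), (11, 3)]:
    print(ch, axes, padded_channels(ch, axes))
# 10 1 10
# 8 2 8
# 11 3 12
```

Under this assumption, an 11-channel 3D input would be encoded with 12 channels and then truncated to the first 11.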

Usage:

import torch
from positional_encodings import PositionalEncoding1D, PositionalEncoding2D, PositionalEncoding3D

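# The constructor argument is the size of the ch (last) dimension.
p_enc_1d = PositionalEncoding1D(10)
x = torch.zeros((1,6,10))
print(p_enc_1d(x).shape) # torch.Size([1, 6, 10])

p_enc_2d = PositionalEncoding2D(8)
y = torch.zeros((1,6,2,8))
print(p_enc_2d(y).shape) # torch.Size([1, 6, 2, 8])

# 11 is not divisible by 6; padding is handled internally.
p_enc_3d = PositionalEncoding3D(11)
z = torch.zeros((1,5,6,4,11))
print(p_enc_3d(z).shape) # torch.Size([1, 5, 6, 4, 11])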

And for tensors of the form (batchsize, ch, x), etc:

import torch
from positional_encodings import PositionalEncodingPermute1D, PositionalEncodingPermute2D, PositionalEncodingPermute3D

# Here ch is dimension 1, directly after the batch dimension.
p_enc_1d = PositionalEncodingPermute1D(10)
x = torch.zeros((1,10,6))
print(p_enc_1d(x).shape) # torch.Size([1, 10, 6])

p_enc_2d = PositionalEncodingPermute2D(8)
y = torch.zeros((1,8,6,2))
print(p_enc_2d(y).shape) # torch.Size([1, 8, 6, 2])

p_enc_3d = PositionalEncodingPermute3D(11)
z = torch.zeros((1,11,5,6,4))
print(p_enc_3d(z).shape) # torch.Size([1, 11, 5, 6, 4])

Thank you

Thank you to this repo for the inspiration for this method.

Citations

1D:

@inproceedings{vaswani2017attention,
  title={Attention is all you need},
  author={Vaswani, Ashish and Shazeer, Noam and Parmar, Niki and Uszkoreit, Jakob and Jones, Llion and Gomez, Aidan N and Kaiser, {\L}ukasz and Polosukhin, Illia},
  booktitle={Advances in neural information processing systems},
  pages={5998--6008},
  year={2017}
}

2D:

@misc{wang2019translating,
    title={Translating Math Formula Images to LaTeX Sequences Using Deep Neural Networks with Sequence-level Training},
    author={Zelun Wang and Jyh-Charn Liu},
    year={2019},
    eprint={1908.11415},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

3D: Coming soon!

Download files

Source Distribution

positional_encodings-2.0.0.tar.gz (4.1 kB view details)

Uploaded Source

Built Distribution

positional_encodings-2.0.0-py3-none-any.whl (5.0 kB view details)

Uploaded Python 3

File details

Details for the file positional_encodings-2.0.0.tar.gz.

File metadata

  • Download URL: positional_encodings-2.0.0.tar.gz
  • Upload date:
  • Size: 4.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.24.0 setuptools/50.3.0 requests-toolbelt/0.9.1 tqdm/4.46.1 CPython/3.8.6

File hashes

Hashes for positional_encodings-2.0.0.tar.gz
Algorithm Hash digest
SHA256 4e9945dbc9f439ea4c588c22aaa17dcb886592bd73744c7164d00bb16fc102ee
MD5 4da04585d076c79f3e803d242a4d8e46
BLAKE2b-256 4ed90297664ab0028e3530e1c01271f982bd36dd86bbaa3b72c2420b44a9528a

File details

Details for the file positional_encodings-2.0.0-py3-none-any.whl.

File metadata

  • Download URL: positional_encodings-2.0.0-py3-none-any.whl
  • Upload date:
  • Size: 5.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.24.0 setuptools/50.3.0 requests-toolbelt/0.9.1 tqdm/4.46.1 CPython/3.8.6

File hashes

Hashes for positional_encodings-2.0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 928771ea53d98ea4da54a9fe861cd130abcafa8380e5b6f6ec4f680c2bfb7d17
MD5 3e03ffe338882023b7a67a38106a6771
BLAKE2b-256 f32baefeda87d018bd59c94be4442cd8b59a3255fe396011b71c0f328daec450
