PoPE-pytorch

Efficient implementation of (and explorations into) the polar coordinate positional embedding (PoPE) from Gopalakrishnan et al., under Schmidhuber

Install

$ pip install PoPE-pytorch

Usage

import torch
from PoPE_pytorch import PoPE

# define pope

pope = PoPE(64, heads = 8)

# pass in sequence length

pos_emb = pope(1024)

# queries and keys in attention

q = torch.randn(1, 8, 1024, 64)
k = torch.randn(1, 8, 1024, 64)

# training

rotated_q, rotated_k = pope.apply_pope_to_qk(pos_emb, q, k)

# inference - pass only the newly decoded query token, along with the full set of keys

rotated_q, rotated_k = pope.apply_pope_to_qk(pos_emb, q[..., -1:, :], k)
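
The rotated queries and keys then drop into attention as usual. A minimal sketch (not part of the library) using torch's built-in scaled dot product attention - note that values keep their own head dimension, which torch allows to differ from that of the rotated queries and keys:

import torch.nn.functional as F

v = torch.randn(1, 8, 1024, 64)

# rotated_q / rotated_k from the training example above

out = F.scaled_dot_product_attention(rotated_q, rotated_k, v, is_causal = True)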

Axial PoPE

For images, video, etc., where multiple positional dimensions are needed, you can use AxialPoPE. The feature dimension will be split across these axial dimensions.

You can either pass in the positions manually (see the sketch after the example below), or just pass the dimensions as a tuple, in which case the grid positions will be generated automatically.

import torch
from PoPE_pytorch import AxialPoPE

# axial pope for images (e.g. 16x16)
# split 64 dim into 32 (x) and 32 (y)

pope = AxialPoPE(
    dim = 64,
    heads = 8,
    axial_dims = (32, 32)
)

pos_emb = pope((16, 16)) # (256, 64) frequencies

# for video (e.g. 8 frames, each 16x16)
# split 96 dim into 32 (t), 32 (x), 32 (y)

pope_video = AxialPoPE(
    dim = 96,
    heads = 8,
    axial_dims = (32, 32, 32)
)

pos_emb_video = pope_video((8, 16, 16)) # (2048, 96) frequencies

# queries and keys
# then apply to q, k as usual

q = torch.randn(1, 8, 2048, 96)
k = torch.randn(1, 8, 2048, 96)

rotated_q, rotated_k = pope_video.apply_pope_to_qk(pos_emb_video, q, k)
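
To pass positions in manually instead, here is a sketch - the assumption (not confirmed against the library) being that the forward accepts an (n, num-axial-dims) tensor of coordinates in place of the dimension tuple:

import torch
from PoPE_pytorch import AxialPoPE

pope = AxialPoPE(dim = 64, heads = 8, axial_dims = (32, 32))

# hand-build the same 16x16 grid of (x, y) coordinates -> (256, 2)

coords = torch.stack(torch.meshgrid(
    torch.arange(16),
    torch.arange(16),
    indexing = 'ij'
), dim = -1).view(-1, 2)

pos_emb = pope(coords) # assumed equivalent to pope((16, 16))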

Fused Attention Similarity

Applying PoPE to queries and keys expands the head dimension (64 to 128 here) before the dot product. compute_attn_similarity fuses the rotation into the similarity computation, so the expanded tensors are never materialized.

import torch
from PoPE_pytorch import PoPE, compute_attn_similarity

# define pope

pope = PoPE(dim = 64, heads = 8).cuda()

# get rotations

pos_emb = pope(1024)

# queries and keys

q = torch.randn(1, 8, 1024, 64).cuda()
k = torch.randn(1, 8, 1024, 64).cuda()

# fused attention similarity, avoiding expanding 64 to 128

sim = compute_attn_similarity(q, k, pos_emb) # (1, 8, 1024, 1024)

attn = sim.softmax(dim = -1) # the usual in attention..
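
For reference, the unfused path first rotates q and k (expanding 64 to 128) and then takes the dot product over the expanded dimension. A minimal sketch of the equivalence, assuming the fused kernel applies no scaling beyond the rotation:

# unfused reference - rotate first, then dot product over the expanded dim

rotated_q, rotated_k = pope.apply_pope_to_qk(pos_emb, q, k)

naive_sim = torch.einsum('b h i d, b h j d -> b h i j', rotated_q, rotated_k)

# expected to match the fused `sim` up to floating point error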

Fused Flash Attention

import torch
from PoPE_pytorch import PoPE, flash_attn_with_pope

# pope

pope = PoPE(dim = 64, heads = 8).cuda()

# queries, keys, values for attention

q = torch.randn(2, 8, 1024, 64).cuda()
k = torch.randn(2, 8, 1024, 64).cuda()
v = torch.randn(2, 8, 1024, 64).cuda()

pos_emb = pope(1024)

mask = torch.ones((2, 1024)).bool().cuda()

out = flash_attn_with_pope(q, k, v, pos_emb = pos_emb, causal = True, mask = mask)

assert out.shape == (2, 8, 1024, 64)
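
As a sanity check, the fused output can be compared against the non-fused similarity path from the previous section. A sketch, assuming both paths use the same scaling:

from PoPE_pytorch import compute_attn_similarity

# non-fused reference

sim = compute_attn_similarity(q, k, pos_emb)

# apply the same causal and key padding masks

causal_mask = torch.ones(sim.shape[-2:], dtype = torch.bool, device = sim.device).triu(1)
sim = sim.masked_fill(causal_mask, -torch.finfo(sim.dtype).max)
sim = sim.masked_fill(~mask[:, None, None, :], -torch.finfo(sim.dtype).max)

ref_out = sim.softmax(dim = -1) @ v

# expect ref_out to be close to `out`, up to numerics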

Citations

@misc{gopalakrishnan2025decouplingwhatwherepolar,
    title   = {Decoupling the "What" and "Where" With Polar Coordinate Positional Embeddings},
    author  = {Anand Gopalakrishnan and Robert Csordás and Jürgen Schmidhuber and Michael C. Mozer},
    year    = {2025},
    eprint  = {2509.10534},
    archivePrefix = {arXiv},
    primaryClass = {cs.LG},
    url     = {https://arxiv.org/abs/2509.10534},
}
