A PyTorch library of modern embedding strategies missing from torch.nn

torchembed

Modern embedding strategies for PyTorch — the ones missing from torch.nn.

torch.nn gives you nn.Embedding (a lookup table). That's it. The moment you work with continuous inputs, modern transformer architectures, coordinates, time, or tabular data, you're on your own — copy-pasting RoPE implementations across projects.

torchembed is a single, well-tested, pip-installable home for all of them.

Installation

pip install torchembed

Requires Python ≥ 3.9 and PyTorch ≥ 2.0. No other required dependencies.

What's included

Module	Class	Use case
`positional`	`RotaryEmbedding`	Modern LLMs (LLaMA, Mistral, Falcon)
`positional`	`ALiBiEmbedding`	Long-context models (BLOOM, MPT)
`positional`	`SinusoidalEmbedding`	Classic Transformers
`positional`	`LearnedPositionalEmbedding`	BERT, GPT-2
`fourier`	`RandomFourierFeatures`	Kernel approximation, coordinate encoding
`fourier`	`LearnedFourierFeatures`	Trainable frequency decomposition
`fourier`	`GaussianFourierProjection`	Diffusion models (timestep embedding)
`categorical`	`EntityEmbedding`	Tabular categorical features
`categorical`	`MultiCategoricalEmbedding`	Multiple categorical columns at once
`patch`	`PatchEmbedding`	Vision Transformers (ViT)
`patch`	`TubeletEmbedding`	Video Transformers (VideoMAE, ViViT)
`temporal`	`CyclicEmbedding`	Hour, day, month (cyclic features)
`temporal`	`TimestampEmbedding`	Continuous timestamps
`temporal`	`FrequencyEmbedding`	Time series, periodic signals

Quick start

from torchembed import (
    RotaryEmbedding,
    ALiBiEmbedding,
    SinusoidalEmbedding,
    RandomFourierFeatures,
    PatchEmbedding,
    EntityEmbedding,
    MultiCategoricalEmbedding,
    GaussianFourierProjection,
    CyclicEmbedding,
)

Examples

Rotary Embedding (RoPE) — LLaMA / Mistral style

import torch
from torchembed import RotaryEmbedding

rope = RotaryEmbedding(dim=64)  # dim is the per-head dimension (head_dim)

# Inside your attention layer (example sizes):
batch, heads, seq_len = 2, 8, 128
q = torch.randn(batch, heads, seq_len, 64)
k = torch.randn(batch, heads, seq_len, 64)
q, k = rope(q, k)  # returns the rotated q and k

RoPE has no trainable parameters and preserves vector norms (it's a pure rotation). The default base of 10,000 matches the original paper; use base=500_000 for LLaMA 3.
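The two properties mentioned above (no parameters, norm preservation) can be checked with a short plain-PyTorch sketch. This is not torchembed's implementation: the helper `rope_rotate`, its `offset` argument, and the interleaved pairing of channels are assumptions made for illustration only.

```python
import torch

def rope_rotate(x: torch.Tensor, base: float = 10_000.0, offset: int = 0) -> torch.Tensor:
    """Rotate channel pairs of x (..., seq_len, dim) by position-dependent angles."""
    seq_len, dim = x.shape[-2], x.shape[-1]
    # One frequency per channel pair: base^(-2i/dim) for i = 0 .. dim/2 - 1
    inv_freq = base ** (-torch.arange(0, dim, 2, dtype=torch.float32) / dim)
    positions = torch.arange(offset, offset + seq_len, dtype=torch.float32)
    angles = torch.outer(positions, inv_freq)          # (seq_len, dim/2)
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[..., 0::2], x[..., 1::2]                # interleaved pairs (an assumption)
    out = torch.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin               # 2D rotation of each pair
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

torch.manual_seed(0)
q = torch.randn(2, 4, 16, 64)
k = torch.randn(2, 4, 16, 64)
q_rot, k_rot = rope_rotate(q), rope_rotate(k)

# A rotation leaves every vector's norm unchanged.
norms_match = torch.allclose(q.norm(dim=-1), q_rot.norm(dim=-1), atol=1e-4)

# Attention scores depend only on relative position: shifting all positions
# by the same offset leaves q @ k^T unchanged.
scores = q_rot @ k_rot.transpose(-2, -1)
scores_shifted = rope_rotate(q, offset=8) @ rope_rotate(k, offset=8).transpose(-2, -1)
scores_match = torch.allclose(scores, scores_shifted, atol=1e-3)
```

The second check is why RoPE is described as a relative position encoding even though each vector is rotated by its absolute position.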

ALiBi — long context with length extrapolation

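import math
import torch
from torchembed import ALiBiEmbedding

alibi = ALiBiEmbedding(num_heads=8)

# Example sizes; q and k come from your attention layer
batch, seq_len, head_dim = 2, 128, 64
q = torch.randn(batch, 8, seq_len, head_dim)
k = torch.randn(batch, 8, seq_len, head_dim)

# After computing raw attention scores:
attn_scores = q @ k.transpose(-2, -1) / math.sqrt(head_dim)
attn_scores = alibi(attn_scores)   # adds a fixed, head-specific linear distance penalty (not learned)
attn_weights = attn_scores.softmax(-1)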
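In the ALiBi paper the distance penalty is not trained: each head gets a fixed slope, and the bias is that slope times the query-key distance. The sketch below shows that recipe for a power-of-two head count. The helper names `alibi_slopes` and `alibi_bias` are hypothetical, and torchembed's internals may differ.

```python
import torch

def alibi_slopes(num_heads: int) -> torch.Tensor:
    """Geometric slopes 2^(-8/n), 2^(-16/n), ..., 2^(-8) for n heads (n a power of two)."""
    exponents = torch.arange(1, num_heads + 1, dtype=torch.float32) * (-8.0 / num_heads)
    return 2.0 ** exponents

def alibi_bias(num_heads: int, seq_len: int) -> torch.Tensor:
    """Bias of shape (heads, seq_len, seq_len): -slope * |i - j|."""
    pos = torch.arange(seq_len)
    distance = (pos[:, None] - pos[None, :]).abs().float()
    return -alibi_slopes(num_heads)[:, None, None] * distance

bias = alibi_bias(num_heads=8, seq_len=5)   # broadcasts over the batch dimension
slopes = alibi_slopes(8)                    # first head 0.5, last head 2^-8
```

The symmetric `|i - j|` form is used here for simplicity; under a causal mask only the entries with `j <= i` matter, which matches the paper's formulation.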

Gaussian Fourier Projection — diffusion model timestep embedding

import torch
import torch.nn as nn
from torchembed import GaussianFourierProjection

class DiffusionTimeEmbedding(nn.Module):
    def __init__(self, embed_dim):
        super().__init__()
        self.fourier = GaussianFourierProjection(embed_dim=embed_dim, scale=16)
        self.mlp = nn.Sequential(
            nn.Linear(embed_dim, embed_dim * 4),
            nn.SiLU(),
            nn.Linear(embed_dim * 4, embed_dim),
        )

    def forward(self, t):
        return self.mlp(self.fourier(t))

t_emb = DiffusionTimeEmbedding(embed_dim=256)
t = torch.rand(32)   # normalized timesteps
emb = t_emb(t)       # (32, 256) — condition your UNet on this
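For readers curious what a Gaussian Fourier projection typically computes, here is a minimal sketch following the formulation common in score-based diffusion code: frozen random frequencies drawn from a Gaussian, followed by sine and cosine. The class name `GaussianFourierSketch` and the buffer name `W` are assumptions, not torchembed's actual attributes.

```python
import math
import torch
import torch.nn as nn

class GaussianFourierSketch(nn.Module):
    """Map scalar timesteps t to [sin(2*pi*W*t), cos(2*pi*W*t)] with fixed random W."""

    def __init__(self, embed_dim: int, scale: float = 16.0):
        super().__init__()
        if embed_dim % 2 != 0:
            raise ValueError("embed_dim must be even")
        # Frequencies are sampled once and never trained, so they live in a buffer.
        self.register_buffer("W", torch.randn(embed_dim // 2) * scale)

    def forward(self, t: torch.Tensor) -> torch.Tensor:
        proj = 2 * math.pi * t[:, None] * self.W[None, :]      # (batch, embed_dim/2)
        return torch.cat([proj.sin(), proj.cos()], dim=-1)     # (batch, embed_dim)

torch.manual_seed(0)
fourier = GaussianFourierSketch(embed_dim=256, scale=16.0)
t = torch.rand(32)
emb = fourier(t)                                   # (32, 256)
num_trainable = sum(p.numel() for p in fourier.parameters())
```

A larger `scale` spreads the frequencies wider, which makes nearby timesteps easier to tell apart at the cost of a less smooth embedding.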

ViT Patch Embedding

import torch
from torchembed import PatchEmbedding

patch_emb = PatchEmbedding(
    image_size=224,
    patch_size=16,
    embed_dim=768,
)

images = torch.randn(4, 3, 224, 224)
tokens = patch_emb(images)    # (4, 196, 768)
print(patch_emb.num_patches)  # 196
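The 196 above is (224 / 16)² = 14² patches. A common way to implement patch embedding, sketched here under the assumption of square, non-overlapping patches, is a single strided convolution. `PatchEmbedSketch` and its `in_channels` argument are illustrative names and may not match torchembed's signature.

```python
import torch
import torch.nn as nn

class PatchEmbedSketch(nn.Module):
    """Split an image into non-overlapping patches and project each one linearly."""

    def __init__(self, image_size: int = 224, patch_size: int = 16,
                 in_channels: int = 3, embed_dim: int = 768):
        super().__init__()
        if image_size % patch_size != 0:
            raise ValueError("image_size must be divisible by patch_size")
        self.num_patches = (image_size // patch_size) ** 2
        # kernel_size == stride means each output position sees exactly one patch.
        self.proj = nn.Conv2d(in_channels, embed_dim,
                              kernel_size=patch_size, stride=patch_size)

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        x = self.proj(images)                  # (B, embed_dim, H/P, W/P)
        return x.flatten(2).transpose(1, 2)    # (B, num_patches, embed_dim)

sketch = PatchEmbedSketch()
sketch_tokens = sketch(torch.randn(4, 3, 224, 224))   # (4, 196, 768)
```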

Tabular categorical features

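import torch
from torchembed import MultiCategoricalEmbedding

# A tabular dataset with 3 categorical columns:
# country (50 unique values), day of week (7), product category (120)
emb = MultiCategoricalEmbedding(cardinalities=[50, 7, 120])
print(emb.output_dim)   # sum of auto-sized embed dims

# Example integer ids, one tensor per column
batch = 32
country_ids = torch.randint(0, 50, (batch,))
dow_ids = torch.randint(0, 7, (batch,))
category_ids = torch.randint(0, 120, (batch,))

x = torch.stack([country_ids, dow_ids, category_ids], dim=1)   # (batch, 3)
features = emb(x)   # (batch, output_dim)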
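The README does not say how the embedding dimensions are auto-sized. As an illustration only, the sketch below uses one well-known heuristic (the fastai rule of thumb) and concatenates one `nn.Embedding` per column. Both `embed_dim_rule` and `MultiCatSketch` are hypothetical, and the sizes torchembed picks may be different.

```python
import torch
import torch.nn as nn

def embed_dim_rule(cardinality: int) -> int:
    # Assumed heuristic (fastai's rule of thumb), not necessarily torchembed's rule.
    return min(600, round(1.6 * cardinality ** 0.56))

class MultiCatSketch(nn.Module):
    """One embedding table per categorical column, outputs concatenated."""

    def __init__(self, cardinalities):
        super().__init__()
        self.tables = nn.ModuleList(
            [nn.Embedding(c, embed_dim_rule(c)) for c in cardinalities]
        )
        self.output_dim = sum(t.embedding_dim for t in self.tables)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_columns) of integer category ids
        columns = [table(x[:, i]) for i, table in enumerate(self.tables)]
        return torch.cat(columns, dim=-1)      # (batch, output_dim)

cardinalities = [50, 7, 120]
sketch = MultiCatSketch(cardinalities)
ids = torch.stack([torch.randint(0, c, (8,)) for c in cardinalities], dim=1)
sketch_features = sketch(ids)
```

Sizing each table from its cardinality keeps low-cardinality columns such as day of week from consuming as many dimensions as high-cardinality ones.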

Cyclic time features

from torchembed import CyclicEmbedding
import torch

hour_enc  = CyclicEmbedding(period=24)
dow_enc   = CyclicEmbedding(period=7)
month_enc = CyclicEmbedding(period=12)

hour   = torch.tensor([0.0, 6.0, 12.0, 18.0])
dow    = torch.tensor([0.0, 1.0, 2.0, 3.0])
month  = torch.tensor([1.0, 4.0, 7.0, 10.0])

time_features = torch.cat([
    hour_enc(hour),    # (4, 2)
    dow_enc(dow),      # (4, 2)
    month_enc(month),  # (4, 2)
], dim=-1)             # (4, 6)
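A two-column output per feature is consistent with the usual sine/cosine encoding of a periodic value. The sketch below assumes that formulation and a (sin, cos) column order, neither of which is confirmed by the README. It shows why the encoding helps: hour 23 lands next to hour 0 and hour 24 coincides with it.

```python
import math
import torch

def cyclic_encode(x: torch.Tensor, period: float) -> torch.Tensor:
    """Place x on the unit circle so that x and x + period coincide."""
    angle = 2 * math.pi * x / period
    return torch.stack([angle.sin(), angle.cos()], dim=-1)

hours = torch.tensor([0.0, 23.0, 24.0, 12.0])
enc = cyclic_encode(hours, period=24)          # (4, 2)

dist_midnight_to_23h = (enc[0] - enc[1]).norm()   # small: one hour apart on the circle
dist_midnight_to_noon = (enc[0] - enc[3]).norm()  # largest possible: opposite side
```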

Random Fourier Features for coordinate encoding

import torch
from torchembed import RandomFourierFeatures

# Encode 2D spatial coordinates for a neural field / NeRF-style model
rff = RandomFourierFeatures(in_features=2, out_features=256, sigma=1.0)

coords = torch.rand(1024, 2)   # (x, y) pairs in [0, 1]
features = rff(coords)          # (1024, 256)
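To illustrate the kernel approximation use case, here is a sketch of the classic random Fourier feature construction: project with Gaussian random weights, take cosine and sine, and the feature dot product approximates an RBF kernel. It assumes `sigma` plays the role of the kernel bandwidth, which the README does not state. The function name and the weight layout are illustrative.

```python
import torch

def random_fourier_features(x: torch.Tensor, weight: torch.Tensor) -> torch.Tensor:
    """Return [cos(xW^T), sin(xW^T)] / sqrt(m) for a fixed random weight of shape (m, d)."""
    proj = x @ weight.T                                   # (n, m)
    m = weight.shape[0]
    return torch.cat([proj.cos(), proj.sin()], dim=-1) / m ** 0.5

torch.manual_seed(0)
sigma = 1.0                                               # assumed RBF bandwidth
in_features, out_features = 2, 4096
weight = torch.randn(out_features // 2, in_features) / sigma

coords = torch.rand(1024, in_features)
rff_features = random_fourier_features(coords, weight)    # (1024, 4096)

# Dot products of features approximate exp(-||x - y||^2 / (2 * sigma^2)).
approx_kernel = rff_features[:5] @ rff_features[:5].T
exact_kernel = torch.exp(-torch.cdist(coords[:5], coords[:5]) ** 2 / (2 * sigma ** 2))
max_error = (approx_kernel - exact_kernel).abs().max()
```

The approximation error shrinks roughly with the square root of the number of random frequencies, so a larger `out_features` buys a tighter kernel estimate.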

Design principles

Everything is an nn.Module. You can use any embedding as a layer in a larger model, save/load it with state_dict, move it across devices, and wrap it with torch.compile.

No required dependencies beyond PyTorch. torchembed has exactly one required dependency: PyTorch itself. We don't pull in transformers, numpy, or anything else.

Device-agnostic. No .cuda() calls inside the library. Move your model to whatever device you want — the embeddings follow.

Bring just what you need. Every embedding class is independent. Use one, use all, use none — no framework lock-in.
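One common way to get the nn.Module and device-agnostic behaviour described above is to store fixed tables with `register_buffer`: they appear in `state_dict` and follow `.to(device)` without being trainable. The sketch below applies this to a sinusoidal position table. `SinusoidalSketch` is a hypothetical class, not torchembed's `SinusoidalEmbedding`.

```python
import math
import torch
import torch.nn as nn

class SinusoidalSketch(nn.Module):
    """Adds a fixed sin/cos position table to its input; dim must be even."""

    def __init__(self, max_len: int, dim: int):
        super().__init__()
        position = torch.arange(max_len, dtype=torch.float32)[:, None]
        div_term = torch.exp(
            torch.arange(0, dim, 2, dtype=torch.float32) * (-math.log(10_000.0) / dim)
        )
        table = torch.zeros(max_len, dim)
        table[:, 0::2] = torch.sin(position * div_term)
        table[:, 1::2] = torch.cos(position * div_term)
        # A buffer is saved in state_dict and moved by .to(), but has no gradient.
        self.register_buffer("table", table)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim)
        return x + self.table[: x.shape[1]]

pos_emb = SinusoidalSketch(max_len=128, dim=32)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
pos_emb = pos_emb.to(device)                       # the buffer moves with the module
out = pos_emb(torch.zeros(2, 10, 32, device=device))
```

Because the device is chosen by the caller and the module never hard-codes one, the same code runs unchanged on CPU or GPU.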

Running tests

pip install "torchembed[dev]"   # quotes keep shells such as zsh from expanding the brackets
pytest

Contributing

Contributions welcome! If there's an embedding strategy you find yourself copy-pasting into projects, open a PR. Please include:

The module with a clear docstring and paper reference
Tests covering shape, gradients, and key mathematical properties
An example in the README

License

MIT

Project details

Release history

Version	Release date
0.2.4	Jun 7, 2026
0.2.3	May 30, 2026
0.2.2	May 30, 2026
0.2.1	May 30, 2026
0.2.0	May 30, 2026
0.1.0 (this version)	Mar 1, 2026

Download files

Source Distribution

torchembed-0.1.0.tar.gz (17.6 kB)

Uploaded Mar 1, 2026 Source

Built Distribution

torchembed-0.1.0-py3-none-any.whl (5.3 kB)

Uploaded Mar 1, 2026 Python 3

File details

Details for the file torchembed-0.1.0.tar.gz.

File metadata

Download URL: torchembed-0.1.0.tar.gz
Upload date: Mar 1, 2026
Size: 17.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for torchembed-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`af5c2097d9e5eafcefd108316ed628251a3aa3f2f47d46699e4d1b67bfe4b096`
MD5	`21836fa83adcef652a273ae658a39567`
BLAKE2b-256	`fe6bbcde18df0b85bbc52a62b238957af0d424a65635195e8f5b721a53f4939f`

File details

Details for the file torchembed-0.1.0-py3-none-any.whl.

File metadata

Download URL: torchembed-0.1.0-py3-none-any.whl
Upload date: Mar 1, 2026
Size: 5.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for torchembed-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`aeba649dfdd2105ba3c4770a4e5581bd793162562060a319c259a8473cf2a23e`
MD5	`2034236f43f3f7e30f8f465d72af8d91`
BLAKE2b-256	`dfbc0525dd759d6590e8799e1a2774faa83cae0d921d8d85b78659ff7477359f`
