221 projects
pi-zero-pytorch
π0 in Pytorch
transfusion-pytorch
Transfusion in Pytorch
alphafold3-pytorch
Alphafold 3 - Pytorch
equiformer-pytorch
Equiformer - SE3/E3 Graph Attention Transformer for Molecules and Proteins
streaming-deep-rl
Streaming Deep Reinforcement Learning
SAC-pytorch
Soft Actor Critic - Pytorch
q-transformer
Q-Transformer
minGRU-pytorch
minGRU
lvsm-pytorch
LVSM - Pytorch
vector-quantize-pytorch
Vector Quantization - Pytorch
maskbit-pytorch
MaskBit
rotary-embedding-torch
Rotary Embedding - Pytorch
x-transformers
X-Transformers - Pytorch
spline-based-transformer
Spline Based Transformer
audiolm-pytorch
AudioLM - Language Modeling Approach to Audio Generation from Google Research - Pytorch
nGPT-pytorch
nGPT
e2-tts-pytorch
E2-TTS in Pytorch
rectified-flow-pytorch
Rectified Flow in Pytorch
autoregressive-diffusion-pytorch
Autoregressive Diffusion - Pytorch
soundstorm-pytorch
SoundStorm - Efficient Parallel Audio Generation from Google Deepmind, in Pytorch
vit-pytorch
Vision Transformer (ViT) - Pytorch
ring-attention-pytorch
Ring Attention - Pytorch
adam-atan2-pytorch
Adam-atan2 for Pytorch
ema-pytorch
An easy way to keep track of an exponential moving average version of your Pytorch module
mixture-of-attention
Mixture of Attention
meshgpt-pytorch
MeshGPT Pytorch
enformer-pytorch
Enformer - Pytorch
denoising-diffusion-pytorch
Denoising Diffusion Probabilistic Models - Pytorch
infini-transformer-pytorch
Infini-Transformer in Pytorch
imagen-pytorch
Imagen - unprecedented photorealism × deep level of language understanding
robotic-transformer-pytorch
Robotic Transformer - Pytorch
classifier-free-guidance-pytorch
Classifier Free Guidance - Pytorch
quartic-transformer
Quartic Transformer
scaling-vin-pytorch
Scaling Value Iteration Networks
MEGABYTE-pytorch
MEGABYTE - Pytorch
mlp-mixer-pytorch
MLP Mixer - Pytorch
CALM-Pytorch
CALM - Pytorch
local-attention
Local attention, window with lookback, for language modeling
CoLT5-attention
Conditionally Routed Attention
light-recurrent-unit-pytorch
Light Recurrent Unit
sinkhorn-router-pytorch
Sinkhorn Router - Pytorch
grokfast-pytorch
Grokfast
mmdit
MMDiT
stylegan2-pytorch
StyleGan2 in Pytorch
PEER-pytorch
PEER - Pytorch
slot-attention
Implementation of Slot Attention in Pytorch
block-recurrent-transformer-pytorch
Block Recurrent Transformer - Pytorch
taylor-series-linear-attention
Taylor Series Linear Attention
product-key-memory
Product Key Memory
phenaki-pytorch
Phenaki - Pytorch
pytorch-custom-utils
Pytorch Custom Utils
frame-averaging-pytorch
Frame Averaging
lumiere-pytorch
Lumiere
magvit2-pytorch
MagViT2 - Pytorch
byol-pytorch
Self-supervised contrastive learning made simple
titok-pytorch
TiTok - Pytorch
gateloop-transformer
GateLoop Transformer
lion-pytorch
Lion Optimizer - Pytorch
mogrifier
Implementation of Mogrifier circuit from Deepmind
st-moe-pytorch
ST - Mixture of Experts - Pytorch
En-transformer
E(n)-Equivariant Transformer
iTransformer
iTransformer - Inverted Transformers Are Effective for Time Series Forecasting
self-reasoning-tokens-pytorch
Self Reasoning Tokens
make-a-video-pytorch
Make-A-Video - Pytorch
x-unet
X-Unet
video-diffusion-pytorch
Video Diffusion - Pytorch
soft-moe-pytorch
Soft MoE - Pytorch
deformable-attention
Deformable Attention - from the paper "Vision Transformer with Deformable Attention"
BS-RoFormer
BS-RoFormer - Band-Split Rotary Transformer for SOTA Music Source Separation
self-rewarding-lm-pytorch
Self Rewarding LM - Pytorch
muse-maskgit-pytorch
MUSE - Text-to-Image Generation via Masked Generative Transformers, in Pytorch
RIN-pytorch
RIN - Recurrent Interface Network - Pytorch
h-transformer-1d
H-Transformer 1D - Pytorch
recurrent-memory-transformer-pytorch
Recurrent Memory Transformer - Pytorch
voicebox-pytorch
Voicebox - Pytorch
gradnorm-pytorch
GradNorm - Pytorch
nystrom-attention
Nystrom Attention - Pytorch
linformer
Linformer implementation in Pytorch
agent-attention-pytorch
Agent Attention - Pytorch
mirasol-pytorch
Mirasol - Pytorch
toolformer-pytorch
Toolformer - Pytorch
bidirectional-cross-attention
Bidirectional Cross Attention
parti-pytorch
Parti - Pathways Autoregressive Text-to-Image Model - Pytorch
tab-transformer-pytorch
Tab Transformer - Pytorch
med-seg-diff-pytorch
MedSegDiff - SOTA medical image segmentation - Pytorch
simple-hierarchical-transformer
Simple Hierarchical Transformer
egnn-pytorch
E(n)-Equivariant Graph Neural Network - Pytorch
metnet3-pytorch
MetNet 3 - Pytorch
gigagan-pytorch
GigaGAN - Pytorch
spear-tts-pytorch
Spear-TTS - Pytorch
retro-pytorch
RETRO - Retrieval Enhanced Transformer - Pytorch
pause-transformer
Pause Transformer
zorro-pytorch
Zorro - Pytorch
dalle2-pytorch
DALL-E 2
x-clip
X-CLIP
bit-diffusion
Bit Diffusion - Pytorch
complex-valued-transformer
Complex Valued Transformer / Attention
MaMMUT-pytorch
MaMMUT - Pytorch
CoCa-pytorch
CoCa, Contrastive Captioners are Image-Text Foundation Models - Pytorch
speculative-decoding
Speculative Decoding
FLASH-pytorch
FLASH - Transformer Quality in Linear Time - Pytorch
naturalspeech2-pytorch
Natural Speech 2 - Pytorch
perfusion-pytorch
Perfusion - Pytorch
musiclm-pytorch
MusicLM - AudioLM + Audio CLIP to text to music synthesis
TPDNE-utils
TPDNE
ETSformer-pytorch
ETSTransformer - Exponential Smoothing Transformer for Time-Series Forecasting - Pytorch
Mega-pytorch
Mega - Pytorch
perceiver-pytorch
Perceiver - Pytorch
mixture-of-experts
Sparsely-Gated Mixture of Experts for Pytorch
VN-transformer
Vector Neuron Transformer (VN-Transformer)
siren-pytorch
Implicit Neural Representations with Periodic Activation Functions
flash-attention-jax
Flash Attention - in Jax
memory-efficient-attention-pytorch
Memory Efficient Attention - Pytorch
memorizing-transformers-pytorch
Memorizing Transformer - Pytorch
discrete-key-value-bottleneck-pytorch
Discrete Key / Value Bottleneck - Pytorch
graph-transformer-pytorch
Graph Transformer - Pytorch
dalle-pytorch
DALL-E - Pytorch
coordinate-descent-attention
Coordinate Descent Attention - Pytorch
conformer
The convolutional module from the Conformer paper
rvq-vae-gpt
Yet another attempt at GPT in quantized latent space
memory-compressed-attention
Memory-Compressed Self Attention
perceiver-ar-pytorch
Perceiver AR
PaLM-rlhf-pytorch
PaLM + Reinforcement Learning with Human Feedback - Pytorch
gated-state-spaces-pytorch
Gated State Spaces - GSS - Pytorch
nuwa-pytorch
NÜWA - Pytorch
isab-pytorch
Induced Set Attention Block - Pytorch
einops-exts
Einops Extensions
adjacent-attention-pytorch
Adjacent Attention Network - Pytorch
n-grammer-pytorch
N-Grammer - Pytorch
se3-transformer-pytorch
SE3 Transformer - Pytorch
invariant-point-attention
Invariant Point Attention
flash-cosine-sim-attention
Flash Cosine Similarity Attention
flamingo-pytorch
Flamingo - Pytorch
adan-pytorch
Adan - (ADAptive Nesterov momentum algorithm) Optimizer in Pytorch
lightweight-gan
Lightweight GAN
PaLM-pytorch
PaLM: Scaling Language Modeling with Pathways - Pytorch
alphafold2-pytorch
AlphaFold2 - Pytorch
metaformer-gpt
Metaformer - GPT
tranception-pytorch
Tranception - Pytorch
PaLM-jax
PaLM: Scaling Language Modeling with Pathways - Jax
tf-bind-transformer
Transformer for Transcription Factor Binding
compositional-attention-pytorch
Compositional Attention - Pytorch
anymal-belief-state-encoder-decoder-pytorch
Anymal Belief-state Encoder Decoder - Pytorch
resize-right
Resize Right
uniformer-pytorch
Uniformer - Pytorch
ddpm-proteins
Denoising Diffusion Probabilistic Models - for Proteins - Pytorch
protein-glm
Protein generative model with General Language Model PreTraining (GLM)
RQ-transformer
RQ Transformer - Autoregressive Transformer for Residual Quantized Codes
rela-transformer
ReLA Transformer
triton-transformer
Transformer in Triton
ITTR-pytorch
ITTR - Implementation of the Hybrid Perception Block and Dual-Pruned Self-Attention block
logavgexp-pytorch
LogAvgExp - Pytorch
nwt-pytorch
NWT - Pytorch
deep-daze
Deep Daze
point-transformer-pytorch
Point Transformer - Pytorch
electra-pytorch
Electra - Pytorch
performer-pytorch
Performer - Pytorch
mlm-pytorch
MLM (Masked Language Modeling) - Pytorch
big-sleep
Big Sleep
transformer-in-transformer
Transformer in Transformer - Pytorch
hourglass-transformer-pytorch
Hourglass Transformer
routing-transformer
Routing Transformer (Pytorch)
reformer-pytorch
Reformer, the Efficient Transformer, Pytorch
ponder-transformer
Ponder Transformer - Pytorch
uformer-pytorch
Uformer - Pytorch
jax2torch
Jax 2 Torch
compressive-transformer-pytorch
Implementation of Compressive Transformer in Pytorch
remixer-pytorch
Remixer - Pytorch
bottleneck-transformer-pytorch
Bottleneck Transformer - Pytorch
htm-pytorch
Hierarchical Transformer Memory - Pytorch
linear-attention-transformer
Linear Attention Transformer
tr-rosetta-pytorch
trRosetta - Pytorch
axial-attention
Axial Attention
fast-transformer-pytorch
Fast Transformer - Pytorch
timesformer-pytorch
TimeSformer - Pytorch
segformer-pytorch
Segformer - Pytorch
token-shift-gpt
Token Shift GPT - Pytorch
progen-transformer
Protein Generation (ProGen)
g-mlp-pytorch
gMLP - Pytorch
protein-bert-pytorch
ProteinBERT - Pytorch
sinkhorn-transformer
Sinkhorn Transformer - Sparse Sinkhorn Attention
long-short-transformer
Long Short Transformer - Pytorch
triangle-multiplicative-module
Triangle Multiplicative Module
multistream-transformers
Multistream Transformers - Pytorch
charformer-pytorch
Charformer - Pytorch
mlp-gpt-jax
MLP GPT - Jax
geometric-vector-perceptron
Geometric Vector Perceptron - Pytorch
res-mlp-pytorch
ResMLP - Pytorch
local-attention-flax
Local Attention - Flax Module in Jax
g-mlp-gpt
gMLP - GPT
poolformer
Poolformer
transganformer
TransGanFormer
stam-pytorch
Space Time Attention Model (STAM) - Pytorch
cross-transformers-pytorch
Cross Transformers - Pytorch
glom-pytorch
Glom - Pytorch
halonet-pytorch
HaloNet - Pytorch
omninet-pytorch
Omninet - Pytorch
contrastive-learner
Self-supervised contrastive learning made simple
coco-lm-pytorch
COCO - Pytorch
feedback-transformer-pytorch
Implementation of Feedback Transformer in Pytorch
pi-gan-pytorch
π-GAN - Pytorch
lie-transformer-pytorch
Lie Transformer - Pytorch
pixel-level-contrastive-learning
Pixel-Level Contrastive Learning
marge-pytorch
Marge - Pytorch
dalle-pytorch-dev
DALL-E - Pytorch
esbn-pytorch
Emergent Symbol Binding Network - Pytorch
molecule-attention-transformer
Molecule Attention Transformer - Pytorch
gsa-pytorch
Global Self-attention Network (GSA) - Pytorch
lambda-networks
Lambda Networks - Pytorch
memformer
Memformer - Pytorch
hamburger-pytorch
Hamburger - Pytorch
unet-stylegan2
StyleGan2 with UNet Discriminator, in Pytorch
aoa-pytorch
Attention on Attention - Pytorch
deep-linear-network
Deep Linear Network - Pytorch
kronecker-attention-pytorch
Kronecker Attention - Pytorch
attention-tensorflow-mesh
A bunch of attention-related functions for constructing transformers in Mesh TensorFlow
memory-transformer-xl
Memory Transformer-XL, a variant of Transformer-XL that uses linear attention to update long-term memory
scattering-transform
Scattering Transform module from the paper Scattering Compositional Learner
relay-transformer
Relay Transformer, a long-range transformer
axial-positional-embedding
Axial Positional Embedding
linear-attention
Linear Attention Transformer