Last released Nov 21, 2023
CUDA and Triton implementations of Flash Attention with SoftmaxN.
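
SoftmaxN (sometimes written softmax_n or softmax1) modifies the standard softmax by adding a constant `n` to the denominator, which lets an attention head assign near-zero total weight when no key is relevant. The function name and exact API below are illustrative, not taken from this package; this is a minimal NumPy sketch of the underlying formula, assuming softmax_n(x)_i = exp(x_i) / (n + Σ_j exp(x_j)):

```python
import numpy as np

def softmax_n(x: np.ndarray, n: float = 1.0) -> np.ndarray:
    """Softmax with an extra constant n in the denominator:

        softmax_n(x)_i = exp(x_i) / (n + sum_j exp(x_j))

    n = 0 recovers the standard softmax; n = 1 gives "softmax1".
    Illustrative sketch only, not this package's actual API.
    """
    m = np.max(x)            # subtract the max for numerical stability
    e = np.exp(x - m)
    # The n term must be rescaled by exp(-m) to stay consistent
    # with the shifted exponentials.
    return e / (n * np.exp(-m) + e.sum())

scores = np.array([1.0, 2.0, 3.0])
print(softmax_n(scores, n=0.0))  # sums to 1, like ordinary softmax
print(softmax_n(scores, n=1.0))  # sums to less than 1
```

With `n > 0` the outputs no longer sum to 1, so attention can effectively abstain; the CUDA and Triton kernels in this package fuse this variant into the Flash Attention computation.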