5 projects
warp-attention
Warp attention: a hardware-efficient implementation of scaled dot-product attention.
torchpq
Efficient implementations of Product Quantization and its variants
sparse-dok
Sparse DOK (dictionary of keys) tensor implementation
fast-pytorch-kmeans
A fast k-means clustering algorithm implemented in PyTorch
torchtimer
A profiling tool for PyTorch