Last released Aug 23, 2024
A lightweight framework for generating high-performance CUDA/HIP code for BLAS operators.