Last released Feb 16, 2025
A lightweight framework for generating high-performance CUDA/HIP code for BLAS operators.