Last released Mar 7, 2026
CUDA 12 accelerated backend for safetensors-streaming
Last released Mar 6, 2026
Stream safetensors files from HTTP/S3 directly to GPU memory