Last released Nov 2, 2024
The inference engine for PygmalionAI models
Last released Sep 1, 2024
Forward-only flash-attn with CUDA 12.4