Last released Apr 22, 2025
COAT: Compressing Optimizer states and Activation for Memory-Efficient FP8 Training. https://arxiv.org/abs/2410.19313