Last released Mar 26, 2026
LLM post-training playbook: SFT, GRPO, DPO, eval, and inference