Last released Sep 25, 2024
REFT: Representation Finetuning for Language Models
Last released Aug 24, 2024
Use Activation Interventions to Interpret the Causal Mechanisms of Models