Last released Nov 6, 2024
ReFT: Representation Finetuning for Language Models
Use Activation Interventions to Interpret the Causal Mechanisms of a Model