Last released May 26, 2025
Use Activation Intervention to Interpret Causal Mechanism of Model
Last released Feb 4, 2025
REFT: Representation Finetuning for Language Models