6 projects
eai-sparsify
Sparsify transformers with SAEs and transcoders
eai-delphi
Automated interpretability
concept-erasure
Erasing concepts from neural representations with provable guarantees
eleuther-elk
Keeping language models honest by directly eliciting knowledge encoded in their activations
tokengrams
Efficiently computing & storing token n-grams from large corpora
bergson
Tracing the memory of neural nets with data attribution