# Transformer Heads

Attach custom heads to transformer models.
This library aims to be an all-round toolkit for attaching, training, saving, and loading new heads for transformer models.
A new head could be:
- A linear probe used to get an understanding of the information processing in a transformer architecture
- A head finetuned jointly with the weights of a pretrained transformer model to perform a completely different kind of task:
  - E.g., a transformer pretrained for causal language modeling could get a sequence classification head attached and be finetuned for sentiment classification.
  - Or one could attach a regression head to turn a large language model into a value function for a reinforcement learning problem.
On top of that, attaching multiple heads at once can make multi-task learning easy, making it possible to train very general models.
## Installation
From the root of this repository:

```shell
pip install -e .
```
## Usage
### Create head configurations
```python
from transformer_heads import HeadConfig  # import path assumed

head_config = HeadConfig(
    name="imdb_head_3",
    layer_hook=-3,            # attach at the third-to-last layer
    in_size=hidden_size,      # hidden size of the base model
    output_activation="linear",
    pred_for_sequence=True,   # one prediction per sequence
    loss_fct="cross_entropy",
    num_outputs=2,
)
```
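The same mechanism should cover other head types, such as the regression head from the value-function example above. The following is a sketch only: the field names follow the config above, but the exact accepted values (e.g. `"mse"` as a loss name) are assumptions, not verified against the library.

```python
# Hypothetical regression head config; field values are assumptions.
value_head_config = HeadConfig(
    name="value_head",
    layer_hook=-1,            # hook at the last transformer layer
    in_size=hidden_size,
    output_activation="linear",
    pred_for_sequence=True,   # one scalar prediction per sequence
    loss_fct="mse",           # regression loss (assumed name)
    num_outputs=1,
)
```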
### Create a model with your head from a pretrained transformer model
```python
from transformers import LlamaForCausalLM
from transformer_heads import load_headed  # import path assumed

model = load_headed(
    LlamaForCausalLM,
    "meta-llama/Llama-2-7b-hf",
    head_configs=[head_config],
)
```
### Train your model

Train your model using, for example, the easy-to-use Hugging Face `Trainer` interface:
```python
from transformers import Trainer

trainer = Trainer(
    model,
    args=args,
    train_dataset=imdb_dataset["train"],
    data_collator=collator,
)
trainer.train()
```
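The snippet above assumes `args` and `collator` already exist. One minimal way to construct them from standard Hugging Face components might look like this; the output directory and hyperparameter values are placeholders, not taken from the library's examples.

```python
from transformers import AutoTokenizer, DataCollatorWithPadding, TrainingArguments

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
tokenizer.pad_token = tokenizer.eos_token  # Llama has no pad token by default

# Pads each batch to the length of its longest sequence
collator = DataCollatorWithPadding(tokenizer)

args = TrainingArguments(
    output_dir="imdb_head_out",        # placeholder path
    per_device_train_batch_size=4,
    num_train_epochs=1,
    learning_rate=2e-5,
    logging_steps=50,
)
```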
For a more in-depth introduction and a fully working example, check the linear probe notebook.
*Joint training of multiple linear probes*
## Notebooks
This repository contains multiple Jupyter notebooks that illustrate how to do certain things with this library. Here is an overview of which notebook to check out, depending on the use case you are interested in.
- Linear probes (understanding the inner workings of transformers)
  - Basic example with one probe for causal LM: notebooks/gpt2/linear_probe.ipynb
  - Train many probes for causal LM at once: notebooks/gpt2/multi_linear_probe.ipynb
  - Train many probes for text classification at once: notebooks/gpt2/text_classification_linear_probe.ipynb
- Finetuning on a new type of task (with a new head)
  - QLoRA: notebooks/gpt2/text_classification_qlora.ipynb
  - Full finetuning: notebooks/gpt2/text_classification_full_finetune.ipynb
- Joint multi-task learning
  - Many heads doing completely different tasks + QLoRA, all trained at the same time: notebooks/gpt2/joint_multitask_learning.ipynb
- Regression with pretrained transformers
  - Check the regression heads of this notebook: notebooks/gpt2/joint_multitask_learning.ipynb
- Saving and loading
*Joint multi-task training with different types of heads and QLoRA.*
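A multi-task setup like the one in the joint notebook presumably comes down to passing several head configurations to `load_headed` at once, as suggested by the intro. The sketch below is illustrative only: the head names, hook layers, and loss names are made up, not taken from the notebook.

```python
from transformers import LlamaForCausalLM
from transformer_heads import HeadConfig, load_headed  # import path assumed

# Two illustrative heads for different tasks, trained jointly.
sentiment_head = HeadConfig(
    name="sentiment_head",
    layer_hook=-3,
    in_size=hidden_size,
    output_activation="linear",
    pred_for_sequence=True,
    loss_fct="cross_entropy",
    num_outputs=2,
)
value_head = HeadConfig(
    name="value_head",
    layer_hook=-1,
    in_size=hidden_size,
    output_activation="linear",
    pred_for_sequence=True,
    loss_fct="mse",           # regression loss (assumed name)
    num_outputs=1,
)

model = load_headed(
    LlamaForCausalLM,
    "meta-llama/Llama-2-7b-hf",
    head_configs=[sentiment_head, value_head],  # all heads attached at once
)
```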