transformer-utils
Large autoregressive language modeling helpers
Utilities for the HuggingFace transformers library, focused on loading and using large pretrained autoregressive language models like GPT-2 and GPT-Neo.
This package is unofficial and not associated with HuggingFace.
- Lets you load large (~2.7B-parameter) models in low-resource environments like Google Colab
- Lets you interpret models with the "logit lens"
  - For background, see
    - "interpreting GPT: the logit lens" by nostalgebraist
    - "Finding the Words to Say: Hidden State Visualizations for Language Models" by Jay Alammar
Example usage
Loading in low-memory environments
Loading a 2.7B model:
```python
import transformers

from transformer_utils.low_memory import enable_low_memory_load

enable_low_memory_load()

model = transformers.AutoModelForCausalLM.from_pretrained('EleutherAI/gpt-neo-2.7B')
```
This works fine in an ordinary (non-Pro) Google Colab notebook, with ~12 GB RAM and a T4 GPU.
Inference will work up to the full context window length of 2048 tokens without memory issues.
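As a sanity check after loading, you can run ordinary generation with the model. This is a minimal sketch using only the standard transformers API (`AutoTokenizer`, `model.generate`); the prompt string and sampling settings are illustrative placeholders, not part of transformer-utils.

```python
import torch
import transformers

tokenizer = transformers.AutoTokenizer.from_pretrained('EleutherAI/gpt-neo-2.7B')

# The low-memory loader places the model on the GPU, so inputs go there too.
input_ids = tokenizer("The meaning of life is", return_tensors="pt").input_ids.cuda()

with torch.no_grad():
    output_ids = model.generate(input_ids, max_length=50, do_sample=True, top_p=0.9)

print(tokenizer.decode(output_ids[0]))
```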
Logit lens
```python
import torch
import transformers

from transformer_utils.low_memory import enable_low_memory_load
from transformer_utils.logit_lens import plot_logit_lens

enable_low_memory_load()

tokenizer = transformers.AutoTokenizer.from_pretrained("gpt2")
model = transformers.AutoModelForCausalLM.from_pretrained('gpt2-xl')

def text_to_input_ids(text):
    toks = tokenizer.encode(text)
    return torch.as_tensor(toks).view(1, -1).cuda()

input_ids = text_to_input_ids("This is an example. You can probably think of a more fun text to use than this one.")

plot_logit_lens(model, tokenizer, input_ids, start_ix=0, end_ix=45)  # logits

plot_logit_lens(model, tokenizer, input_ids, start_ix=0, end_ix=45, probs=True)  # probabilities
```
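To build intuition for what these plots show, here is a minimal hand-rolled sketch of the logit-lens computation itself, separate from the plotting helper above: project each layer's hidden state through the model's final layer norm and unembedding matrix, and look at the top predicted tokens at the last position. The attribute names `transformer.ln_f` and `lm_head` are what GPT-2-style models expose in transformers; other architectures name these differently.

```python
with torch.no_grad():
    out = model(input_ids, output_hidden_states=True)

for layer_ix, h in enumerate(out.hidden_states):
    # Decode this layer's hidden state as if it were the final one.
    # (In GPT-2's implementation the last entry of hidden_states is already
    # normalized, so ln_f is applied twice there; fine for a sketch.)
    logits = model.lm_head(model.transformer.ln_f(h))
    top = logits[0, -1].topk(5).indices.tolist()
    print(layer_ix, tokenizer.convert_ids_to_tokens(top))
```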