A Python package that uses PyTorch, Transformers, and NumPy
Pay attention pipeline
This repository provides a pipeline for computing Influence and performing Generation with GUIDE to enhance transformer model explainability and performance. The pipeline leverages attention scores and embedding vectors to assess the importance of specific subsequences and applies various levels of instruction enhancement to improve model responses.
Features
- Influence Calculation: Assess the impact of specific subsequences on the model's predictions using attention scores and embedding vectors.
- Generation with GUIDE: Use guided instruction to generate more accurate and contextually relevant outputs from transformer models.
GUIDE
GUIDE (Guided Understanding with Instruction-Driven Enhancements) is a novel and systematic approach that lets users highlight critical instructions within the text input provided to an LLM. This pipeline implements GUIDE and enables users to increase the attention given to specific tokens by simply enclosing the important text within tags such as `<!-> <-!>` (as shown below). We achieve this by adding a bias, denoted $\Delta$, to the attention logits of the important tokens, i.e., $\bar{w}_{k,i}^{(\ell)} = w_{k,i}^{(\ell)} + \Delta$ for all tokens indicated by the user. Viewed as an attention matrix, where each entry gives the impact of a past token (x-axis) on the ongoing token (y-axis), this bias raises the columns corresponding to the highlighted tokens.
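To make the biasing step concrete, here is a minimal PyTorch sketch; the shapes, tagged positions, and Δ value are illustrative assumptions, not the package's internals:

```python
import torch

# Toy attention logits: entry (k, i) is the raw logit of query token k
# attending to past token i (causal masking omitted for brevity).
attn_logits = torch.randn(6, 6)

important = torch.tensor([1, 2])  # hypothetical positions tagged with <!-> <-!>
delta = 5.0                       # the bias Δ; the real value depends on the tag level

biased = attn_logits.clone()
biased[:, important] += delta     # w_bar_{k,i} = w_{k,i} + Δ for each highlighted token i

# After the softmax, attention mass shifts toward the highlighted tokens.
attn_weights = torch.softmax(biased, dim=-1)
```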
Installation
To set up the environment for this project, follow these steps:
- Clone the Repository

```bash
git clone https://github.com/netop-team/pay_attention_pipeline.git
cd pay_attention_pipeline
```

- Create and Activate a Virtual Environment

```bash
python -m venv env
source env/bin/activate  # On Windows use `env\Scripts\activate`
```

- Install Required Packages

```bash
pip install -r requirements.txt
```
Usage
Normally, one would load a model with the standard Hugging Face `pipeline` API, as shown below:

```python
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="your_model_name",
)

prompt = '''
The Eiffel Tower, an iconic symbol of Paris and France, was completed in 1889 as the centerpiece of the Exposition Universelle, a world’s fair celebrating the centennial of the French Revolution...
'''

out = pipe("Rewrite in French" + prompt, max_new_tokens=100)
```
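As with any Hugging Face text-generation pipeline, the result is a list of dictionaries whose `generated_text` field holds the completion:

```python
print(out[0]['generated_text'])
```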
However, with this repository you can use our custom `PayAttentionPipeline` to take advantage of the specialized features provided here: GUIDE and Influence. If your prompt does not contain the tags `<?-> <-?>`, `<!-> <-!>`, `<!!-> <-!!>`, or `<!!!-> <-!!!>`, our pipeline behaves exactly like Hugging Face's.
Influence Calculation
The influence metric assesses the importance of a subsequence in the context of the model's predictions. Here’s how to compute it:
- Load the Pipeline

```python
from transformers import pipeline
# importing PayAttentionPipeline makes the custom "pay-attention" task available
from pipeline.pay_attention_pipeline import PayAttentionPipeline

pipe = pipeline(
    "pay-attention",
    model="mistralai/Mistral-7B-Instruct-v0.1",
)
```
- Compute Influence

Add the tags `<?-> <-?>` around the text whose influence you want to compute. For example:

```python
prompt = '''
The Eiffel Tower, an iconic symbol of Paris and France, was completed in 1889 as the centerpiece of the Exposition Universelle, a world’s fair celebrating the centennial of the French Revolution...
'''

out1 = pipe("<?-> Rewrite in French <-?>" + prompt, max_new_tokens=100)
out2 = pipe("<?-> REWRITE IN FRENCH <-?>" + prompt, max_new_tokens=100)

influence = out1['influence']
influence_caps = out2['influence']
```
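The plotting example below treats the returned influence as indexable by layer, with one score per context position; assuming that structure (check the repository for the exact return type), you can inspect it first:

```python
# Assumption: influence is indexable by layer index, each entry a 1-D tensor
# of per-position scores (consistent with the plotting code below).
print(len(influence))      # number of layers (32 for Mistral-7B)
print(influence[0].shape)  # one influence score per context position
```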
You can visualize the influence of different layers as follows:
```python
import torch
import torch.nn.functional as F
import matplotlib.pyplot as plt

def rolling_mean(x, window_size):
    # The original omits this helper; a simple sliding-window average works:
    return F.avg_pool1d(x.view(1, 1, -1), kernel_size=window_size, stride=1).squeeze()

layers_to_plot = [0, 15, 31]
layers_to_axs_idx = {v: i for i, v in enumerate(layers_to_plot)}
n_plots = len(layers_to_plot)
fig, axes = plt.subplots(1, n_plots, figsize=(n_plots * 5, 4))

for layer_idx in layers_to_plot:
    plot_idx = layers_to_axs_idx[layer_idx]
    axes[plot_idx].plot(
        rolling_mean(torch.log(influence[layer_idx]), 10)[10:],
        label="Normal",
    )
    axes[plot_idx].plot(
        rolling_mean(torch.log(influence_caps[layer_idx]), 10)[10:],
        label="Uppercase",
    )
    axes[plot_idx].set_title(f"Layer {layer_idx+1}")
    axes[plot_idx].grid()
    axes[plot_idx].set_xlabel("context length")
    axes[plot_idx].set_ylabel("log influence")
    axes[plot_idx].legend()

plt.show()
```
Generation with GUIDE
GUIDE (Guided Understanding with Instruction-Driven Enhancements) improves the model's responses by biasing attention toward user-highlighted instructions, with three levels of emphasis. Here’s how to use it:
- Load the Generative Model

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

pipe = pipeline(
    "pay-attention",
    model="mistralai/Mistral-7B-Instruct-v0.1",
)
```
- Apply GUIDE Levels

Enhance the generation using various levels of instruction:

```python
message_1 = [{'role': 'user', 'content': "<!-> Summarize in French <-!>" + prompt}]
out_1 = pipe(message_1, max_new_tokens=100)

message_2 = [{'role': 'user', 'content': "<!!-> Summarize in French <-!!>" + prompt}]
out_2 = pipe(message_2, max_new_tokens=100)

message_3 = [{'role': 'user', 'content': "<!!!-> Summarize in French <-!!!>" + prompt}]
out_3 = pipe(message_3, max_new_tokens=100)
```
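The exact output format of `PayAttentionPipeline` is defined by the repository; a simple, assumption-free way to compare the three emphasis levels is to print the returned objects and inspect them:

```python
# The output structure is pipeline-defined; print each result to inspect it.
for tag, result in [("<!->", out_1), ("<!!->", out_2), ("<!!!->", out_3)]:
    print(f"--- {tag} ---")
    print(result)
```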
Adjust the enhancement values as needed for your task.
- Customizing (Optional)

To experiment with other values of delta, set `delta_mid`:

```python
# base_model and tokenizer are assumed to be defined beforehand
dumb_pipe = pipeline(
    "pay-attention",
    model=base_model,
    tokenizer=tokenizer,
    model_kwargs=dict(cache_dir="/Data"),
    **dict(delta_mid=10),
)

message = [{'role': 'user', 'content': "<!!-> Summarize in French <-!!>" + prompt}]
out = dumb_pipe(message, max_new_tokens=100)
```
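Since `delta_mid` is passed as an ordinary keyword argument above, one way to explore its effect is a small sweep over candidate values; this is only a sketch, and re-creating the pipeline for each value is expensive in practice:

```python
# Sketch: compare generations across several bias values via delta_mid.
for delta in (3, 7, 10):
    p = pipeline(
        "pay-attention",
        model=base_model,
        tokenizer=tokenizer,
        delta_mid=delta,
    )
    print(f"delta_mid={delta}:", p(message, max_new_tokens=100))
```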
License
This project is licensed under the MIT License.
Contributing
Contributions are welcome! Please submit a pull request or open an issue to discuss potential improvements.