lxt · PyPI

LRP explains Transformers

Layer-wise Relevance Propagation for Transformers

Fast, faithful explanations for transformer models with a single backward pass

What is LXT?
Getting Started
Supported Models
How LXT Works
Documentation
Citation

Accelerating eXplainable AI research for LLMs & ViTs

✨ What is LXT?

LXT makes black-box transformer models explainable by precisely revealing how much each input token and individual neuron contribute to the final prediction logit. Unlike standard gradient-based methods, which can be noisy or unreliable, LXT delivers faithful attributions using AttnLRP, a backpropagation-based technique that corrects gradient flow through non-linearities. Best of all, it requires only a single backward pass.

See the dramatic improvement in explanation quality on Gemma 3 (4B):

Input×Gradient (Traditional)	AttnLRP (Our Method)
Noisy, scattered attributions	Clean, semantically coherent attributions

🔥 Highly Efficient & Faithful Attributions

Attention-aware LRP (AttnLRP) outperforms gradient-, decomposition- and perturbation-based methods and provides faithful attributions for the entirety of a black-box transformer model. With respect to the number of layers $N$, its computational complexity scales as $O(1)$ and its memory requirements as $O(\sqrt{N})$.

🔎 Latent Feature Attribution & Visualization

Since we get relevance values for every individual neuron in the model as a by-product, we know exactly how important each neuron is for the model's prediction. Combined with Activation Maximization, we can label neurons or SAE features in LLMs and even steer the generation process of the LLM by activating specialized knowledge neurons in latent space!

📚 Paper

For the mathematical details and foundational work, please take a look at our paper:
Achtibat, et al. “AttnLRP: Attention-Aware Layer-Wise Relevance Propagation for Transformers.” ICML 2024.

🏆 Hall of Fame

A small collection of papers that have utilized LXT:

📄 License

This project is licensed under the BSD-3 Clause License. Note that LRP is a patented technology that can only be used free of charge for personal and scientific purposes.

Getting Started

🛠️ Installation

pip install lxt

Tested with: transformers==4.52.4, torch==2.6.0, python==3.11

🚀 Quickstart with 🤗 LLaMA & many more

You can find example scripts in the examples/ directory. For an in-depth tutorial, take a look at the Quickstart in the Documentation.

To get an overview, you can keep reading below ⬇️

🧩 Supported Models

Model Family	Status
🦙 LLaMA 2/3	✅
✨ Gemma 3	✅
🤖 Qwen 2	✅
🧠 Qwen 3	🧪 Attribution skewed toward first token
🔤 BERT	✅
🤖 GPT-2	✅ Best paired with contrastive explanations
🎨 Vision Transformers	✅

How LXT Works

Layer-wise Relevance Propagation is a rule-based backpropagation algorithm. This means that we can implement LRP in a single backward pass! For this, LXT offers two different approaches:

1. Efficient Implementation

Uses an Input×Gradient formulation, which simplifies LRP to a standard, fast gradient computation by monkey patching the model class.

from lxt.efficient import monkey_patch

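# Patch the module first (your_module is a placeholder for the module that defines your model class)
monkey_patch(your_module)

# Forward pass with gradient tracking (model and input_embeds are assumed to exist already)
outputs = model(inputs_embeds=input_embeds.requires_grad_())

# Backward pass: replace [...] with the index of the single logit you want to explain
outputs.logits[...].backward()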

# Get relevance at *ANY LAYER* in your model. Simply multiply the activation * gradient!
# here for the input embeddings:
relevance = (input_embeds * input_embeds.grad).sum(-1)

This is the recommended approach for most users as it's significantly faster and easier to use. This implementation technique is introduced in Arras, et al. “Close Look at Decomposition-based XAI-Methods for Transformer Language Models.” arXiv preprint, 2025.
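To make the activation × gradient step concrete without downloading a model, here is a minimal sketch on a toy stand-in: token embeddings are mean-pooled and passed through a bias-free linear head. The toy model, the shapes, and the variable names are assumptions for illustration and are not part of LXT. No monkey patching is involved, so this shows only the plain Input×Gradient computation that the patched model reuses. For such a bias-free linear map, the per-token relevances add up to the explained logit.

```python
import torch

torch.manual_seed(0)

seq_len, hidden, vocab = 5, 8, 3

# Toy stand-in for a model (assumption): mean-pool the tokens, then a bias-free linear head
head = torch.nn.Linear(hidden, vocab, bias=False)

input_embeds = torch.randn(1, seq_len, hidden)

# Forward pass with gradient tracking on the embeddings
logits = head(input_embeds.requires_grad_().mean(dim=1))  # shape [1, vocab]

# Backward pass from a single scalar logit (here: class index 2)
target = 2
logits[0, target].backward()

# Relevance per token: activation * gradient, summed over the hidden dimension
relevance = (input_embeds * input_embeds.grad).sum(-1).detach()  # shape [1, seq_len]

print(relevance)
# For a bias-free linear model, the relevances sum to the explained logit
print(torch.allclose(relevance.sum(), logits[0, target].detach(), atol=1e-5))
```

The same pattern applies to any intermediate activation that tracks gradients: multiply the activation by its gradient and sum over the feature dimension.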

2. Mathematically Explicit Implementation

This was used in the original ICML 2024 paper. It's more complex and slower, but useful for understanding the mathematical foundations of LRP.

To achieve this, we have implemented custom PyTorch autograd Functions for commonly used operations in transformers. These functions behave identically in the forward pass, but substitute the gradient with LRP attributions in the backward pass. To compute the $\varepsilon$-LRP rule for a linear function $y = W x + b$, you can simply write

import lxt.explicit.functional as lf

y = lf.linear_epsilon(x.requires_grad_(), W, b)
y.backward(y)

relevance = x.grad
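To illustrate how an autograd Function can keep the forward pass unchanged while returning relevance instead of a gradient, here is a minimal sketch of the $\varepsilon$-rule for a linear layer, $R_i = x_i \sum_j W_{ji} \, R_j / (z_j + \varepsilon \, \mathrm{sign}(z_j))$ with $z = W x + b$. The class name `LinearEpsilonSketch` is hypothetical, and LXT's own `linear_epsilon` may differ in its details.

```python
import torch
from torch.autograd import Function


class LinearEpsilonSketch(Function):
    """Hypothetical re-implementation for illustration; not LXT's actual code."""

    @staticmethod
    def forward(ctx, x, weight, bias, epsilon):
        # Identical to an ordinary linear layer in the forward pass
        z = torch.nn.functional.linear(x, weight, bias)
        ctx.save_for_backward(x, weight, z)
        ctx.epsilon = epsilon
        return z

    @staticmethod
    def backward(ctx, relevance_out):
        x, weight, z = ctx.saved_tensors
        # Stabilize the denominator: shift z away from zero while keeping its sign
        sign = torch.where(z >= 0, torch.ones_like(z), -torch.ones_like(z))
        s = relevance_out / (z + ctx.epsilon * sign)
        # R_i = x_i * sum_j W_ji * s_j, returned in place of the gradient w.r.t. x
        relevance_in = x * (s @ weight)
        return relevance_in, None, None, None


torch.manual_seed(0)
x = torch.randn(4)
W = torch.randn(3, 4)
b = torch.zeros(3)

y = LinearEpsilonSketch.apply(x.requires_grad_(), W, b, 1e-6)
y.backward(y)  # initialize the output relevance with the output itself

relevance = x.grad
# With zero bias and a small epsilon, relevance is approximately conserved
print(relevance.sum().item(), y.sum().item())
```

Because the bias is zero here, the input relevances sum to the output relevance up to the small $\varepsilon$ stabilizer; a non-zero bias would absorb part of the relevance.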

There are also "super-functions" that wrap an arbitrary nn.Module and compute LRP rules via automatic vector-Jacobian products! These rules are simple to attach to models:

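import torch.nn as nn

from lxt.explicit.core import Composite
import lxt.explicit.rules as rules

# RootMeanSquareNorm is not defined in this snippet; it stands for an RMS normalization nn.Module in your code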

model = nn.Sequential(
  nn.Linear(10, 10),
  RootMeanSquareNorm(),
)

Composite({
  nn.Linear: rules.EpsilonRule,
  RootMeanSquareNorm: rules.IdentityRule,
}).register(model)

print(model)
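The idea of computing a rule through a vector-Jacobian product can be sketched as follows: run the wrapped module, divide the incoming relevance by the stabilized output, let autograd compute the vector-Jacobian product, and multiply the result by the input. The helper `epsilon_rule_via_vjp` is a hypothetical name used for illustration only; LXT's rule classes may be implemented differently.

```python
import torch
import torch.nn as nn


def epsilon_rule_via_vjp(module, x, relevance_out=None, epsilon=1e-6):
    """Illustrative epsilon-LRP for a differentiable module (hypothetical helper)."""
    x = x.detach().requires_grad_()
    z = module(x)
    if relevance_out is None:
        # Start from the output itself, analogous to y.backward(y)
        relevance_out = z.detach()
    # Stabilized denominator: shift z away from zero while keeping its sign
    sign = torch.where(z >= 0, torch.ones_like(z), -torch.ones_like(z))
    s = relevance_out / (z.detach() + epsilon * sign)
    # Vector-Jacobian product s^T (dz/dx), computed by autograd
    (vjp,) = torch.autograd.grad(z, x, grad_outputs=s)
    return x.detach() * vjp


torch.manual_seed(0)
layer = nn.Linear(10, 10, bias=False)
x = torch.randn(2, 10)

relevance = epsilon_rule_via_vjp(layer, x)
print(relevance.shape)
# For a bias-free linear layer, relevance per sample matches the summed output
print(relevance.sum(-1), layer(x).sum(-1).detach())
```

Since only a forward call and one `torch.autograd.grad` call are needed, the same helper works for any module whose output is differentiable with respect to its input, which is what makes the wrapper approach generic.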

Documentation

Click here to read the documentation.

Contribution

Feel free to explore the code and experiment with different datasets and models. We encourage contributions and feedback from the community. We are especially grateful for contributions that add support for new model architectures! 🙏

Citation

@InProceedings{pmlr-v235-achtibat24a,
  title = {{A}ttn{LRP}: Attention-Aware Layer-Wise Relevance Propagation for Transformers},
  author = {Achtibat, Reduan and Hatefi, Sayed Mohammad Vakilzadeh and Dreyer, Maximilian and Jain, Aakriti and Wiegand, Thomas and Lapuschkin, Sebastian and Samek, Wojciech},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages = {135--168},
  year = {2024},
  editor = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume = {235},
  series = {Proceedings of Machine Learning Research},
  month = {21--27 Jul},
  publisher = {PMLR}
}

Acknowledgements

The code is heavily inspired by Zennit, a tool for LRP attributions in PyTorch using hooks. Zennit is 100% compatible with the explicit version of LXT and offers even more LRP rules 🎉

Release history

Version	Release date
2.1	Jul 10, 2025
2.0	Mar 19, 2025
0.6.1	Nov 11, 2024
0.6.0	Jul 17, 2024
0.5.9	Jul 4, 2024
0.5.2	Apr 17, 2024
0.5.1	Apr 11, 2024
0.5	Apr 5, 2024
0.1	Apr 2, 2024
