Transformer token flow visualizer
Token-Trace
A tool and UI to construct prompt-centric views of SAE feature attributions.
Main functionality:
- We use this tool to identify which SAE features have the most 'attribution' towards decreasing the model loss.
- In combination with Neuronpedia, we can identify what each SAE feature represents; this then gives us a rough idea of what computation the model is performing.
This tool is a first step towards discovering information flow between the features and layers of a transformer.
Quickstart
Installation
pip install token-trace
Example Usage
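import pandas as pd

from token_trace import compute_node_attribution

text = "When John and Mary went to the shops, John gave the bag to Mary"
# A positional argument cannot follow a keyword argument, so `text` is passed
# by keyword here; the parameter name `text` is an assumption.
df: pd.DataFrame = compute_node_attribution(
    model_name="gpt2",
    text=text,
)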
Each row of df describes one node corresponding to an SAE feature or error term.
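As a rough illustration of working with such a table, the sketch below ranks nodes by attribution magnitude. The column names (`layer`, `feature`, `attribution`, `is_error`) and the values are hypothetical stand-ins chosen for this example, not the documented schema of the returned DataFrame; check `df.columns` for the real names.

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame returned by compute_node_attribution.
# Column names and values are assumptions made for this example only.
nodes = pd.DataFrame(
    {
        "layer": [0, 0, 3, 5],
        "feature": [12, -1, 7, 3],
        "attribution": [0.40, 0.05, -0.90, 0.10],
        "is_error": [False, True, False, False],
    }
)

# Keep only SAE feature nodes, dropping the error-term rows.
feature_nodes = nodes.loc[~nodes["is_error"]]

# Rank the remaining nodes by absolute attribution, largest first.
top_nodes = feature_nodes.sort_values(
    "attribution", key=lambda s: s.abs(), ascending=False
).head(2)

print(top_nodes[["layer", "feature", "attribution"]])
```

The top-ranked (layer, feature) pairs are the ones worth looking up in Neuronpedia first.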
Visualizing SAE attribution statistics in the frontend
We use Streamlit to create a UI. Start the app as follows:
streamlit run app/token_trace_app.py
Methodology
Under the hood, we use attribution patching to estimate the indirect effect of each SAE feature on the loss. The method is adapted heavily from Sparse Feature Circuits.
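To make the idea concrete, here is a minimal sketch of attribution patching on a toy problem. It assumes the standard first-order form, where the effect of patching activation `a` from its clean value to a patched value is approximated by `grad(loss)(a_clean) * (a_patch - a_clean)`. The toy loss, the weights, and all function names are hypothetical and are not part of token-trace. A finite-difference gradient stands in for the backward pass a real implementation would use.

```python
from typing import Callable, List

Metric = Callable[[List[float]], float]


def numerical_grad(metric: Metric, acts: List[float], eps: float = 1e-6) -> List[float]:
    """Central-difference gradient of `metric` at `acts` (stand-in for autograd)."""
    grads = []
    for i in range(len(acts)):
        up, down = list(acts), list(acts)
        up[i] += eps
        down[i] -= eps
        grads.append((metric(up) - metric(down)) / (2 * eps))
    return grads


def attribution_patching(metric: Metric, clean: List[float], patch: List[float]) -> List[float]:
    """First-order estimate of the effect of patching each activation on its own."""
    grads = numerical_grad(metric, clean)
    return [g * (p - c) for g, c, p in zip(grads, clean, patch)]


def exact_effect(metric: Metric, clean: List[float], patch: List[float]) -> List[float]:
    """Exact change in the metric when each activation is patched on its own."""
    base = metric(clean)
    effects = []
    for i in range(len(clean)):
        patched = list(clean)
        patched[i] = patch[i]
        effects.append(metric(patched) - base)
    return effects


# Toy "model": a linear readout of three feature activations, squared error to 1.0.
WEIGHTS = [0.5, -0.2, 0.1]


def toy_loss(acts: List[float]) -> float:
    readout = sum(w * a for w, a in zip(WEIGHTS, acts))
    return (readout - 1.0) ** 2


clean = [1.0, 2.0, 0.0]   # feature activations on the real input
patch = [0.0, 0.0, 0.0]   # zero-ablation baseline

estimated = attribution_patching(toy_loss, clean, patch)
exact = exact_effect(toy_loss, clean, patch)
print("estimated:", estimated)
print("exact:    ", exact)
```

In this toy case the estimate and the exact effect agree in sign for every active feature but differ in magnitude, because the loss is not linear in the activations. The appeal of the approximation is that one gradient computation yields an estimate for every feature at once, whereas the exact effect needs one extra evaluation per feature.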
Development
We use PDM to manage dependencies. Set up a development environment as follows:
pdm install # creates a .venv
source .venv/bin/activate
Once in the virtual environment, make sure to also install the pre-commit hooks:
pre-commit install