
Transformer token flow visualizer

Project description

token-trace

A tool and UI for constructing prompt-centric views of sparse autoencoder (SAE) feature attributions.

Main functionality:

  • The tool identifies which SAE features contribute most to decreasing the model loss on a given prompt.
  • In combination with Neuronpedia, we can identify what each SAE feature represents; this then gives us a rough idea of what computation the model is performing.

This tool is a first step towards discovering information flow between the features and layers of a transformer.

Installation

git clone https://github.com/interp-hack/token-trace.git
cd token-trace
pip install -e .

Quickstart

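import pandas as pd

from token_trace import compute_node_attribution

text = "When John and Mary went to the shops, John gave the bag to Mary"

# Arguments are passed positionally, in the order shown in the original
# snippet (model name first, then the prompt text).
df: pd.DataFrame = compute_node_attribution("gpt2", text)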

Each row of df describes one node corresponding to an SAE feature or error term.
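Once you have such a table, a natural next step is to rank nodes by the magnitude of their attribution. The sketch below uses a small made-up DataFrame rather than real output: the column names (layer, node_type, feature_id, attribution), the values, and the helper top_nodes are all hypothetical and chosen only for illustration, so check the actual columns of df before adapting it.

```python
import pandas as pd

# Toy stand-in for the DataFrame returned by compute_node_attribution.
# Column names and values are hypothetical, not the library's real schema.
toy_df = pd.DataFrame(
    {
        "layer": [0, 0, 5, 5, 11],
        "node_type": ["feature", "error", "feature", "feature", "feature"],
        "feature_id": [101, -1, 2045, 77, 9],
        "attribution": [0.02, -0.10, 0.45, -0.30, 0.08],
    }
)


def top_nodes(df: pd.DataFrame, k: int = 3, value_col: str = "attribution") -> pd.DataFrame:
    """Return the k rows with the largest absolute value in value_col."""
    # Sort by magnitude so strongly negative nodes are not overlooked.
    order = df[value_col].abs().sort_values(ascending=False).index
    return df.loc[order].head(k)


top = top_nodes(toy_df, k=3)
print(top)
```

Ranking by absolute value keeps both strongly positive and strongly negative nodes in view; the feature IDs of the top rows can then be looked up in Neuronpedia.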

Front-end

We use Streamlit to create a UI. Start the app as follows:

streamlit run app/token_trace_app.py

Methodology

Under the hood, we use attribution patching to estimate the indirect effect of each SAE feature on the loss. The method is adapted closely from Sparse Feature Circuits.
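To make the idea concrete, here is a minimal sketch of attribution patching on a toy problem. It is not token-trace's implementation: the quadratic loss, the weights, the zero-ablation baseline, and the sign convention are all assumptions made for illustration. Attribution patching replaces the expensive "patch one feature, rerun the model" loop with a first-order estimate, gradient times (baseline activation minus clean activation), which needs only one forward and one backward pass.

```python
def loss(acts, weights, target):
    """Toy loss: squared error of a linear readout of the feature activations."""
    pred = sum(w * a for w, a in zip(weights, acts))
    return (pred - target) ** 2


def grad_loss(acts, weights, target):
    """Analytic gradient of the toy loss with respect to each activation."""
    pred = sum(w * a for w, a in zip(weights, acts))
    return [2 * (pred - target) * w for w in weights]


def attribution_patching(acts, weights, target, baseline):
    """First-order estimate of the loss change from patching each feature."""
    grads = grad_loss(acts, weights, target)
    return [g * (b - a) for g, a, b in zip(grads, acts, baseline)]


def exact_patching(acts, weights, target, baseline):
    """Exact loss change from patching each feature, one at a time."""
    clean_loss = loss(acts, weights, target)
    effects = []
    for i in range(len(acts)):
        patched = list(acts)
        patched[i] = baseline[i]
        effects.append(loss(patched, weights, target) - clean_loss)
    return effects


# Hypothetical values, chosen only to make the arithmetic easy to follow.
acts = [1.0, 0.5, 2.0]
weights = [0.2, -0.4, 0.1]
target = 1.0
baseline = [0.0, 0.0, 0.0]  # zero-ablation baseline (an assumption)

approx = attribution_patching(acts, weights, target, baseline)
exact = exact_patching(acts, weights, target, baseline)
for i, (est, true) in enumerate(zip(approx, exact)):
    print(f"feature {i}: estimate={est:+.2f} exact={true:+.2f}")
```

With this sign convention, a positive value means that ablating the feature raises the loss, so the feature was helping to decrease it. For this quadratic toy loss the estimate differs from the exact effect only by the second-order term (w_i * a_i)^2; in a real transformer the gap depends on how nonlinear the downstream computation is.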

Development

We use PDM to manage dependencies. Set up a development environment as follows:

pdm install # creates a .venv
source .venv/bin/activate

Once in the virtual environment, make sure to also install the pre-commit hooks:

pre-commit install

