circuit-tracer
Attribution graphs and circuit tracing for language model interpretability
Find, visualize, and intervene on circuits inside language models using features from (cross-layer) MLP transcoders, as introduced by Ameisen et al. (2025) and Lindsey et al. (2025).
What does it do?
- Attribution: Given a model with pre-trained transcoders, compute the direct effect that each non-zero transcoder feature, error node, and input token has on each other feature and output logit — producing a full circuit / attribution graph.
- Visualization: View and annotate the resulting graph in an interactive browser UI (the same frontend used in the original papers).
- Intervention: Set transcoder features to arbitrary values and observe how model output changes, validating the circuits you discover.
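As a toy illustration of the attribution idea above (purely conceptual, not circuit-tracer's actual algorithm, and using made-up node names and weights): the direct effect of an active source node on a target can be modeled as its activation times the linear connection between them, and the non-zero effects form the weighted edges of the graph.

```python
# Illustrative only: hypothetical node names and numbers, not circuit-tracer internals.
activations = {"feat_A": 2.0, "feat_B": 0.5, "tok_the": 1.0}  # non-zero source nodes
connections = {  # (source, target) -> linear connection strength
    ("feat_A", "logit_Paris"): 0.8,
    ("feat_B", "logit_Paris"): -0.4,
    ("tok_the", "feat_A"): 1.5,
}

# Direct effect = source activation * connection strength; collecting these
# gives a weighted edge list, i.e. a tiny "attribution graph".
edges = {(s, t): activations[s] * w for (s, t), w in connections.items()}

print(edges[("feat_A", "logit_Paris")])  # 1.6
```

In the real package, the sources are transcoder features, error nodes, and input tokens, and the targets include downstream features and output logits.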
Installation
From PyPI (recommended)
pip install circuit-tracer
Or with uv:
uv add circuit-tracer
Optional extras
# Visualization dependencies (seaborn, ipykernel, ipywidgets)
pip install circuit-tracer[viz]
# Everything (viz + dev tools)
pip install circuit-tracer[all]
From source
git clone https://github.com/Caerii/circuit-tracer.git
cd circuit-tracer
uv sync --group dev
Quick Start
from circuit_tracer import ReplacementModel, Graph, attribute
# Load a model with transcoders
model = ReplacementModel.from_pretrained("google/gemma-2-2b", "gemma")
# Find the circuit for a prompt
graph = attribute(model, "The capital of France is")
# Save for later analysis
graph.to_pt("france_capital.pt")
# Or load an existing graph
graph = Graph.from_pt("france_capital.pt")
For a complete walkthrough, try the tutorial notebook!
Getting Started
There are three ways to use circuit-tracer:
- On Neuronpedia (no install needed): Use circuit-tracer directly on Neuronpedia. Click "+ New Graph" to create your own, or browse existing graphs from the drop-down.
- Python script / Jupyter notebook: Start with the tutorial notebook (runs on free Colab GPUs). See the Demos section below.
- Command-line interface: Run the full attribution-to-visualization pipeline from your terminal. See CLI Usage below.
Working with Gemma-2 (2B) is possible with relatively limited GPU resources; free Colab GPUs have about 15 GB of memory. More GPU memory allows less offloading and larger batch sizes.
Supported Models & Transcoders
The following transcoders are available. Use the HuggingFace repo name as the transcoders argument of ReplacementModel.from_pretrained, or as --transcoder_set in the CLI.
| Model | Transcoder Type | HuggingFace Repo | Notes |
|---|---|---|---|
| Gemma-2 (2B) | PLT | mntss/gemma-scope-transcoders | Originally from GemmaScope |
| Gemma-2 (2B) | CLT (426K) | mntss/clt-gemma-2-2b-426k | |
| Gemma-2 (2B) | CLT (2.5M) | mntss/clt-gemma-2-2b-2.5M | |
| Llama-3.2 (1B) | PLT | mntss/transcoder-Llama-3.2-1B | |
| Llama-3.2 (1B) | CLT | mntss/clt-llama-3.2-1b-524k | |
| Qwen-3 | PLT | 0.6B, 1.7B, 4B, 8B, 14B | |
| GPT-OSS (20B) | CLT | mntss/clt-131k | |
| Gemma-3 | PLT | Collection | 270M to 27B, PT & IT. Requires nnsight backend. |
Choosing a Backend
By default, circuit-tracer uses the TransformerLens backend (inherits from HookedTransformer). For models not supported by TransformerLens, use the NNSight backend:
# TransformerLens (default) — fast, memory-efficient
model = ReplacementModel.from_pretrained("google/gemma-2-2b", "gemma")
# NNSight — supports most HuggingFace models
model = ReplacementModel.from_pretrained("google/gemma-3-4b-pt", "gemma-3", backend="nnsight")
Note: The NNSight backend is still experimental: it is slower and less memory-efficient, and may not provide all of the functionality of the TransformerLens version.
Demos
All demos live in the demos/ folder. The main tutorial can run on free Colab GPUs.
| Notebook | Description |
|---|---|
| circuit_tracing_tutorial.ipynb | End-to-end tutorial replicating findings from Lindsey et al. |
| attribute_demo.ipynb | How to find circuits and visualize them |
| attribution_targets_demo.ipynb | Specifying custom attribution targets (specific logits or directions) |
| intervention_demo.ipynb | Performing feature interventions on models |
| gemma_demo.ipynb | Pre-annotated Gemma-2 (2B) graphs with interventions |
| gemma_it_demo.ipynb | Instruction-tuned Gemma-2 (2B) with base-model transcoders |
| llama_demo.ipynb | Llama 3.2 (1B) graphs (requires local GPU, not Colab) |
Caching
To use the lazy_decoder and lazy_encoder options, transcoders must be stored in a circuit-tracer-compatible format. Rather than downloading pre-converted weights, you can build a local cache:
from circuit_tracer.utils.caching import save_transcoders_to_cache
save_transcoders_to_cache(
    "mwhanna/gemma-scope-2-27b-pt/transcoder_all/width_262k_l0_small",
    cache_dir="~/.cache/",
)
Clear the cache with circuit_tracer.utils.caching.empty_cache.
CLI Usage
The CLI runs the complete 3-step pipeline: attribution -> graph file creation -> visualization server.
Basic Usage
circuit-tracer attribute \
--prompt "The International Advanced Security Group (IAS" \
--transcoder_set gemma \
--slug gemma-demo \
--graph_file_dir ./graph_files \
--server
The server URL (e.g. localhost:8041) will be printed. If running on a remote machine, enable port forwarding to view the graphs locally.
Attribution only (save raw graph)
circuit-tracer attribute \
--prompt "The capital of France is" \
--transcoder_set llama \
--graph_output_path france_capital.pt
CLI Arguments
Required
| Argument | Description |
|---|---|
| --prompt (-p) | Input prompt to analyze |
| --transcoder_set (-t) | Transcoders to use (HuggingFace repo ID, or shortcut: gemma, llama) |

Plus at least one output destination: --slug + --graph_file_dir (for visualization), and/or --graph_output_path (-o) (for a raw .pt graph).
Optional
| Argument | Default | Description |
|---|---|---|
| --model (-m) | auto | Model name (auto-inferred for gemma/llama presets) |
| --max_n_logits | 10 | Maximum logit nodes to attribute from |
| --desired_logit_prob | 0.95 | Cumulative probability threshold for top logits |
| --batch_size | 256 | Batch size for backward passes |
| --max_feature_nodes | 7500 | Maximum feature nodes |
| --dtype | model default | Load dtype (float32/fp32, float16/fp16, bfloat16/bf16) |
| --offload | None | Memory optimization (cpu, disk, or None) |
| --node_threshold | 0.8 | Node pruning: keep nodes with cumulative influence >= threshold |
| --edge_threshold | 0.98 | Edge pruning: keep edges with cumulative influence >= threshold |
| --port | 8041 | Local server port |
| --server | false | Start visualization server after attribution |
| --verbose | false | Show detailed progress |
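The threshold arguments (--desired_logit_prob, --node_threshold, --edge_threshold) all express the same cumulative rule: rank candidates by score and keep the shortest top-ranked prefix whose share of the total reaches the threshold. A minimal sketch of that rule (illustrative, not the package's internal code):

```python
def keep_by_cumulative_threshold(scores, threshold):
    """Keep the top-scoring items until their normalized cumulative score
    first reaches `threshold`. Illustrative sketch, not circuit-tracer internals."""
    total = sum(scores.values())
    kept, cum = [], 0.0
    for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        kept.append(name)
        cum += score / total
        if cum >= threshold:
            break
    return kept

# e.g. next-token probabilities with --desired_logit_prob 0.95:
probs = {"Paris": 0.90, "Lyon": 0.06, "France": 0.03, "the": 0.01}
print(keep_by_cumulative_threshold(probs, 0.95))  # ['Paris', 'Lyon']
```

Under the same rule, --node_threshold 0.8 would keep the most influential nodes accounting for 80% of total influence, and --edge_threshold 0.98 the strongest edges accounting for 98%.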
Graph Annotation
When using --server, the browser opens an interactive visualization:
- Select a node: Click on it
- Pin/unpin to subgraph pane: Ctrl+click (or Cmd+click)
- Annotate a node: Click "Edit" on the right side while a node is selected
- Group nodes into a supernode: Hold G and click on nodes
- Ungroup a supernode: Hold G and click the x next to it
- Annotate a supernode: Click on the label below it
Interventions are also available on Neuronpedia for Gemma-2 (2B): pin at least one node, then click "Steer" in the subgraph.
Project Structure
circuit_tracer/
├── __init__.py # Public API: attribute, Graph, ReplacementModel
├── _version.py # Single source of truth for version
├── graph.py # Graph data structures, pruning, influence computation
├── attribution/
│ ├── attribute.py # Unified attribution engine (both backends)
│ └── targets.py # LogitTarget, CustomTarget, AttributionTargets
├── replacement_model/
│ ├── common.py # Shared utilities (tokenization, interventions)
│ ├── replacement_model_nnsight.py
│ └── replacement_model_transformerlens.py
├── transcoder/
│ ├── cross_layer_transcoder.py
│ └── single_layer_transcoder.py
├── frontend/ # Visualization server and graph models
└── utils/ # Config mapping, caching, feature decoding, etc.
Contributing
See CONTRIBUTING.md for setup instructions and development workflow. We use uv for dependency management.
# Quick dev setup
git clone https://github.com/Caerii/circuit-tracer.git
cd circuit-tracer
uv sync --group dev
# Quality checks
uv run ruff format && uv run ruff check && uv run pyright && uv run pytest tests -q
Cite
@misc{circuit-tracer,
  author = {Hanna, Michael and Piotrowski, Mateusz and Lindsey, Jack and Ameisen, Emmanuel},
  title = {circuit-tracer},
  howpublished = {\url{https://github.com/decoderesearch/circuit-tracer}},
  note = {The first two authors contributed equally and are listed alphabetically.},
  year = {2025}
}
Or cite the paper here.
Download files
File details
Details for the file circuit_tracer-0.5.0.tar.gz.
File metadata
- Download URL: circuit_tracer-0.5.0.tar.gz
- Size: 123.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | bcf5a045185f389d8bc7d22e056e8448ee32fb9d0d46b2f919bf6c7ef4c70cc8 |
| MD5 | 4c963f349ee594266c100d587036767e |
| BLAKE2b-256 | 776cbab9adcbead5a621666d97755ecc7253d853dcc300cc0aeadb01efaad917 |
Provenance
The following attestation bundles were made for circuit_tracer-0.5.0.tar.gz:
Publisher: publish.yml on Caerii/circuit-tracer
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: circuit_tracer-0.5.0.tar.gz
- Subject digest: bcf5a045185f389d8bc7d22e056e8448ee32fb9d0d46b2f919bf6c7ef4c70cc8
- Sigstore transparency entry: 1191851124
- Permalink: Caerii/circuit-tracer@4ee03d1c1deb40c3c25f4af43607f11ca81498c9
- Branch / Tag: refs/heads/main
- Owner: https://github.com/Caerii
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@4ee03d1c1deb40c3c25f4af43607f11ca81498c9
- Trigger Event: workflow_dispatch
File details
Details for the file circuit_tracer-0.5.0-py3-none-any.whl.
File metadata
- Download URL: circuit_tracer-0.5.0-py3-none-any.whl
- Size: 153.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e457e6558e1c9eb0d3d7c54ce2298492b4e919a4f7f2b8b7e217427284fdc802 |
| MD5 | c651a85f0f275f875aa1bc3171b781b1 |
| BLAKE2b-256 | 7fe8100af8e8738be19d1edc50b4bdf26083e2489e52b9eb9bfc38488b0ef151 |
Provenance
The following attestation bundles were made for circuit_tracer-0.5.0-py3-none-any.whl:
Publisher: publish.yml on Caerii/circuit-tracer
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: circuit_tracer-0.5.0-py3-none-any.whl
- Subject digest: e457e6558e1c9eb0d3d7c54ce2298492b4e919a4f7f2b8b7e217427284fdc802
- Sigstore transparency entry: 1191851146
- Permalink: Caerii/circuit-tracer@4ee03d1c1deb40c3c25f4af43607f11ca81498c9
- Branch / Tag: refs/heads/main
- Owner: https://github.com/Caerii
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@4ee03d1c1deb40c3c25f4af43607f11ca81498c9
- Trigger Event: workflow_dispatch