
Attribution graphs and circuit tracing for language model interpretability


circuit-tracer

Requires Python 3.10+. MIT License.

Find, visualize, and intervene on circuits inside language models using features from (cross-layer) MLP transcoders, as introduced by Ameisen et al. (2025) and Lindsey et al. (2025).

What does it do?

  1. Attribution: Given a model with pre-trained transcoders, compute the direct effect that each non-zero transcoder feature, error node, and input token has on each other feature and output logit — producing a full circuit / attribution graph.
  2. Visualization: View and annotate the resulting graph in an interactive browser UI (the same frontend used in the original papers).
  3. Intervention: Set transcoder features to arbitrary values and observe how model output changes, validating the circuits you discover.
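The intervention idea in step 3 can be illustrated with a self-contained toy (plain Python, not the circuit-tracer API): clamp one hidden "feature" to an arbitrary value and compare the output against the baseline run.

```python
# Toy illustration of a feature intervention (NOT the circuit-tracer API):
# run a tiny two-layer "model", then rerun with one hidden feature clamped
# to an arbitrary value and observe how the output changes.

def toy_model(x, clamp=None):
    """clamp: optional (feature_index, value) overriding one hidden unit."""
    hidden = [x * w for w in (0.5, -1.0, 2.0)]       # "layer 1" features
    if clamp is not None:
        idx, value = clamp
        hidden[idx] = value                           # the intervention
    return sum(h * w for h, w in zip(hidden, (1.0, 1.0, 1.0)))  # "logit"

baseline = toy_model(3.0)                  # no intervention -> 4.5
steered = toy_model(3.0, clamp=(2, 0.0))   # ablate feature 2 -> -1.5
```

circuit-tracer does the analogous operation on real transcoder features, letting you check whether an attributed feature actually mediates the behavior.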

Installation

From PyPI (recommended)

pip install circuit-tracer

Or with uv:

uv add circuit-tracer

Optional extras

# Visualization dependencies (seaborn, ipykernel, ipywidgets)
pip install circuit-tracer[viz]

# Everything (viz + dev tools)
pip install circuit-tracer[all]

From source

git clone https://github.com/Caerii/circuit-tracer.git
cd circuit-tracer
uv sync --group dev

Quick Start

from circuit_tracer import ReplacementModel, Graph, attribute

# Load a model with transcoders
model = ReplacementModel.from_pretrained("google/gemma-2-2b", "gemma")

# Find the circuit for a prompt
graph = attribute(model, "The capital of France is")

# Save for later analysis
graph.to_pt("france_capital.pt")

# Or load an existing graph
graph = Graph.from_pt("france_capital.pt")

For a complete walkthrough, try the tutorial notebook!


Getting Started

There are three ways to use circuit-tracer:

  1. On Neuronpedia (no install needed): Use circuit-tracer directly on Neuronpedia. Click + New Graph to create your own, or browse existing graphs from the drop-down.

  2. Python script / Jupyter notebook: Start with the tutorial notebook (runs on free Colab GPUs). See the Demos section below.

  3. Command-line interface: Run the full attribution-to-visualization pipeline from your terminal. See CLI Usage below.

Working with Gemma-2 (2B) is possible with relatively limited GPU resources; free Colab GPUs provide about 15 GB of VRAM. More GPU memory allows less offloading and larger batch sizes.
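A back-of-envelope estimate shows why a 2B model fits in this budget (the ~2.6e9 parameter count for Gemma-2 2B is approximate, and activations, transcoder weights, and attribution buffers come on top of the weights):

```python
# Rough GPU memory needed for model weights alone, by load dtype.
# Parameter count is approximate; Gemma-2 2B has roughly 2.6e9 parameters.

def weight_gib(n_params, bytes_per_param):
    return n_params * bytes_per_param / 2**30

print(round(weight_gib(2.6e9, 2), 1))  # bfloat16 -> 4.8 GiB
print(round(weight_gib(2.6e9, 4), 1))  # float32  -> 9.7 GiB
```

Loading in bfloat16 (see --dtype) roughly halves the weight footprint relative to float32, which is why the 2B model is workable on a 15 GB Colab GPU.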


Supported Models & Transcoders

The following transcoders are available. Use the HuggingFace repo name as the transcoders argument of ReplacementModel.from_pretrained, or as --transcoder_set in the CLI.

| Model | Transcoder Type | HuggingFace Repo | Notes |
|---|---|---|---|
| Gemma-2 (2B) | PLT | mntss/gemma-scope-transcoders | Originally from GemmaScope |
| Gemma-2 (2B) | CLT (426K) | mntss/clt-gemma-2-2b-426k | |
| Gemma-2 (2B) | CLT (2.5M) | mntss/clt-gemma-2-2b-2.5M | |
| Llama-3.2 (1B) | PLT | mntss/transcoder-Llama-3.2-1B | |
| Llama-3.2 (1B) | CLT | mntss/clt-llama-3.2-1b-524k | |
| Qwen-3 | PLT | | 0.6B, 1.7B, 4B, 8B, 14B |
| GPT-OSS (20B) | CLT | mntss/clt-131k | |
| Gemma-3 | PLT | Collection | 270M to 27B, PT & IT. Requires nnsight backend. |

Choosing a Backend

By default, circuit-tracer uses the TransformerLens backend (the model inherits from HookedTransformer). For models that TransformerLens does not support, use the NNSight backend:

# TransformerLens (default) — fast, memory-efficient
model = ReplacementModel.from_pretrained("google/gemma-2-2b", "gemma")

# NNSight — supports most HuggingFace models
model = ReplacementModel.from_pretrained("google/gemma-3-4b-pt", "gemma-3", backend="nnsight")

Note: The NNSight backend is still experimental: it is slower and less memory-efficient, and may not provide all of the functionality of the TransformerLens version.


Demos

All demos live in the demos/ folder. The main tutorial can run on free Colab GPUs.

| Notebook | Description |
|---|---|
| circuit_tracing_tutorial.ipynb | End-to-end tutorial replicating findings from Lindsey et al. (2025) |
| attribute_demo.ipynb | How to find circuits and visualize them |
| attribution_targets_demo.ipynb | Specifying custom attribution targets (specific logits or directions) |
| intervention_demo.ipynb | Performing feature interventions on models |
| gemma_demo.ipynb | Pre-annotated Gemma-2 (2B) graphs with interventions |
| gemma_it_demo.ipynb | Instruction-tuned Gemma-2 (2B) with base-model transcoders |
| llama_demo.ipynb | Llama 3.2 (1B) graphs (requires local GPU, not Colab) |

Caching

To use the lazy_decoder and lazy_encoder options, transcoders must be stored in circuit-tracer-compatible format. Rather than downloading pre-converted weights, you can build a local cache:

from circuit_tracer.utils.caching import save_transcoders_to_cache

save_transcoders_to_cache(
    "mwhanna/gemma-scope-2-27b-pt/transcoder_all/width_262k_l0_small",
    cache_dir="~/.cache/",
)

Clear the cache with circuit_tracer.utils.caching.empty_cache.


CLI Usage

The CLI runs the complete 3-step pipeline: attribution -> graph file creation -> visualization server.

Basic Usage

circuit-tracer attribute \
  --prompt "The International Advanced Security Group (IAS" \
  --transcoder_set gemma \
  --slug gemma-demo \
  --graph_file_dir ./graph_files \
  --server

The server URL (e.g. localhost:8041) will be printed. If running on a remote machine, enable port forwarding to view the graphs locally.
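On a remote machine, SSH local port forwarding is one way to do this (a generic sketch; the port, user, and host are placeholders for your setup):

```shell
# Forward the remote visualization server's port (8041 by default, or
# whatever --port you chose) to the same port on your local machine.
ssh -L 8041:localhost:8041 user@remote-host
```

With the tunnel open, browse to localhost:8041 on your local machine.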

Attribution only (save raw graph)

circuit-tracer attribute \
  --prompt "The capital of France is" \
  --transcoder_set llama \
  --graph_output_path france_capital.pt

CLI Arguments

Required

| Argument | Description |
|---|---|
| --prompt (-p) | Input prompt to analyze |
| --transcoder_set (-t) | Transcoders to use (HuggingFace repo ID, or shortcut: gemma, llama) |

Plus at least one output destination:

  • --slug + --graph_file_dir (for visualization), and/or
  • --graph_output_path (-o) (for raw .pt graph)

Optional

| Argument | Default | Description |
|---|---|---|
| --model (-m) | auto | Model name (auto-inferred for the gemma/llama presets) |
| --max_n_logits | 10 | Maximum number of logit nodes to attribute from |
| --desired_logit_prob | 0.95 | Cumulative probability threshold for top logits |
| --batch_size | 256 | Batch size for backward passes |
| --max_feature_nodes | 7500 | Maximum number of feature nodes |
| --dtype | model default | Load dtype (float32/fp32, float16/fp16, bfloat16/bf16) |
| --offload | None | Memory optimization (cpu, disk, or None) |
| --node_threshold | 0.8 | Node pruning: keep nodes with cumulative influence >= threshold |
| --edge_threshold | 0.98 | Edge pruning: keep edges with cumulative influence >= threshold |
| --port | 8041 | Local server port |
| --server | false | Start visualization server after attribution |
| --verbose | false | Show detailed progress |
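The cumulative-influence pruning behind --node_threshold and --edge_threshold can be sketched as follows (an illustrative toy, not the library's exact implementation): rank items by influence and keep the top ones until their cumulative share of total influence reaches the threshold.

```python
# Illustrative sketch of cumulative-influence pruning (not circuit-tracer's
# exact code): keep the highest-influence items until their cumulative
# share of total influence reaches the threshold.

def prune_by_cumulative_influence(influences, threshold):
    order = sorted(range(len(influences)), key=lambda i: -influences[i])
    total = sum(influences)
    kept, running = [], 0.0
    for i in order:
        kept.append(i)
        running += influences[i]
        if running / total >= threshold:
            break
    return sorted(kept)

print(prune_by_cumulative_influence([5, 1, 3, 1], 0.8))  # -> [0, 2]
```

A higher threshold keeps more nodes or edges (1.0 keeps everything), which is why the edge default (0.98) prunes less aggressively than the node default (0.8).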

Graph Annotation

When using --server, the browser opens an interactive visualization:

  • Select a node: Click on it
  • Pin/unpin to subgraph pane: Ctrl+click (or Cmd+click)
  • Annotate a node: Click "Edit" on the right side while a node is selected
  • Group nodes into a supernode: Hold G and click on nodes
  • Ungroup a supernode: Hold G and click the x next to it
  • Annotate a supernode: Click on the label below it

Interventions are also available on Neuronpedia for Gemma-2 (2B): pin at least one node, then click "Steer" in the subgraph.


Project Structure

circuit_tracer/
├── __init__.py                  # Public API: attribute, Graph, ReplacementModel
├── _version.py                  # Single source of truth for version
├── graph.py                     # Graph data structures, pruning, influence computation
├── attribution/
│   ├── attribute.py             # Unified attribution engine (both backends)
│   └── targets.py               # LogitTarget, CustomTarget, AttributionTargets
├── replacement_model/
│   ├── common.py                # Shared utilities (tokenization, interventions)
│   ├── replacement_model_nnsight.py
│   └── replacement_model_transformerlens.py
├── transcoder/
│   ├── cross_layer_transcoder.py
│   └── single_layer_transcoder.py
├── frontend/                    # Visualization server and graph models
└── utils/                       # Config mapping, caching, feature decoding, etc.

Contributing

See CONTRIBUTING.md for setup instructions and development workflow. We use uv for dependency management.

# Quick dev setup
git clone https://github.com/Caerii/circuit-tracer.git
cd circuit-tracer
uv sync --group dev

# Quality checks
uv run ruff format && uv run ruff check && uv run pyright && uv run pytest tests -q


Cite

@misc{circuit-tracer,
  author = {Hanna, Michael and Piotrowski, Mateusz and Lindsey, Jack and Ameisen, Emmanuel},
  title = {circuit-tracer},
  howpublished = {\url{https://github.com/decoderesearch/circuit-tracer}},
  note = {The first two authors contributed equally and are listed alphabetically.},
  year = {2025}
}

Or cite the paper here.
