neuro-scan
LLM Neuroanatomy Explorer — map what each transformer layer does
Ecosystem
| Tool | What it does | Question it answers |
|---|---|---|
| layer-scan | Find optimal layer duplication config | What to do — which layers to duplicate |
| neuro-scan | Map what each layer does | Why it works — understand layer functions |
layer-scan users are the natural first audience for neuro-scan: understand your model's layers before you duplicate them.
Ablation Sensitivity
Layer ablation sensitivity for Qwen2-1.5B with math probe. Bars colored by auto-detected function (reasoning, syntax, etc.). Gold stars mark the most critical layers.
Logit Lens Trajectory
Logit lens heatmap showing when the correct answer token emerges across layers. Red diamonds mark the emergence point for each sample.
Features
- Layer Ablation — zero out each layer one-by-one, measure the score impact
- Logit Lens — project each layer's hidden state to vocabulary space and watch the answer emerge (a minimal sketch follows this list)
- Tuned Lens — per-layer affine probes that reduce early-layer bias by 4-5 bits (Belrose 2023)
- Attention Entropy — quantify how focused or diffuse each attention head is
- Circuit Detection — find synergistic and redundant layer pairs via targeted pairwise ablation
- Block Influence — one forward pass to estimate all layers' importance (ShortGPT BI metric)
- Cross-probe Analysis — identify universal vs probe-specific important layers
- Multi-model Comparison — compare neuroanatomy across different models
- Auto Layer Labeling — automatically classify layers as early_processing, syntax, reasoning, formatting, or output
- Prompt Repetition Experiment — test whether repeating a prompt N times approximates duplicating K layers
- Interactive HTML Charts — Plotly-powered visualizations for all analysis types
- Pre-computed Fetch — download community neuroanatomy reports from HuggingFace Hub (no GPU needed)
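For readers curious about the mechanics behind the logit-lens feature above, here is a minimal sketch of the projection it performs, assuming a Llama/Qwen-style module layout (model.model.norm, model.lm_head); neuro-scan's own implementation may differ in its details.

```python
# Minimal logit-lens sketch: project every layer's last-position hidden state
# through the final norm and unembedding, then check where the expected answer
# token ranks. Assumes a Llama/Qwen-style layout; attribute paths may differ.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2-1.5B"
model = AutoModelForCausalLM.from_pretrained(model_id)
tok = AutoTokenizer.from_pretrained(model_id)

inputs = tok("2 + 2 =", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

answer_id = tok(" 4", add_special_tokens=False).input_ids[0]
for layer, h in enumerate(out.hidden_states):
    logits = model.lm_head(model.model.norm(h[:, -1]))      # vocab-space projection
    rank = (logits[0].argsort(descending=True) == answer_id).nonzero().item()
    print(f"layer {layer:2d}: answer-token rank {rank}")
```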
Installation
# pipx (recommended, isolated env)
pipx install neuro-scan
# pip
pip install neuro-scan
# For pre-computed report fetch (no GPU required):
pip install neuro-scan[lookup]
Quick Start
# Full neuroanatomy map (recommended)
neuro-scan map --model <path-or-hf-id> --probe math
# Individual analyses
neuro-scan ablate --model <path> --probe math
neuro-scan logit-lens --model <path> --probe math
neuro-scan attention --model <path> --probe math
# Circuit detection
neuro-scan circuit --model <path> --probe math --strategy fast
# Cross-probe analysis
neuro-scan cross-probe --model <path> --probes "math,eq,json"
# Multi-model comparison
neuro-scan compare report_a.json report_b.json
# Tuned Lens (two-step workflow)
neuro-scan calibrate --model <path> --output lens.safetensors
neuro-scan logit-lens --model <path> --tuned-lens lens.safetensors
# Fetch pre-computed results (no GPU needed)
neuro-scan fetch --model Qwen/Qwen2-7B --probe math
# Prompt repetition experiment
neuro-scan prompt-repeat --model <path> --probe math --repeat-counts 1,2,3,4
# Utilities
neuro-scan probes
neuro-scan version
Commands
| Command | Description |
|---|---|
| map | Full neuroanatomy (ablation + logit lens + attention + labeling) |
| ablate | Layer ablation sensitivity scan |
| logit-lens | Logit lens trajectory, optional tuned lens |
| attention | Attention entropy analysis (experimental) |
| circuit | Detect synergistic/redundant layer pair circuits |
| cross-probe | Compare layer importance across multiple probes |
| compare | Compare neuroanatomy across multiple models |
| calibrate | Train tuned lens affine probes |
| fetch | Download pre-computed reports from HuggingFace Hub |
| prompt-repeat | Prompt repetition experiment |
| probes | List available evaluation probes |
| version | Show version |
Common Options
| Option | Type | Default | Description |
|---|---|---|---|
| --model, -m | str | required | Model path or HuggingFace ID |
| --probe, -p | str | math | Probe: math, eq, json, custom |
| --backend, -b | str | transformers | Backend: transformers, exllamav2 |
| --batch-size | int | 16 | Samples per evaluation |
| --output, -o | str | ./results | Output directory |
| --top-k, -k | int | 10 | Top layers to highlight |
| --dtype | str | float16 | Model dtype |
| --verbose, -v | bool | false | Verbose logging |
Circuit Detection
The circuit command goes beyond single-layer ablation to find interacting layer pairs — layers that cooperate (synergistic) or overlap (redundant).
Three-phase Pipeline
- Phase A — Candidate Selection: Uses single-layer ablation results to identify the top-K most sensitive layers
- Phase B — Similarity Filtering (thorough mode): Computes cosine similarity between layer representations to identify structurally related pairs
- Phase C — Pairwise Ablation: Tests candidate pairs by ablating both layers simultaneously
Interaction Types
- Synergistic (interaction > 0): Ablating both layers together causes more damage than the sum of individual ablations — these layers cooperate
- Redundant (interaction < 0): Ablating both causes less damage than expected — these layers have overlapping function
- Independent (interaction ~ 0): Layers function independently
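A minimal sketch of this interaction measure, assuming you already have a scalar probe scorer and a way to zero out decoder blocks (the identity hook below is one hypothetical way to do that for Llama/Qwen-style models; it is not neuro-scan's internal code and may need adjusting for your transformers version):

```python
# Pairwise-interaction sketch: ablate layer a, layer b, and both together, then
# compare the combined damage to the sum of the individual damages.
import torch

def ablate(model, layer_ids):
    """Make the listed decoder blocks act as identity via forward hooks."""
    def identity_hook(module, args, output):
        passthrough = args[0]                       # the block's input hidden states
        if isinstance(output, tuple):
            return (passthrough,) + output[1:]
        return passthrough
    return [model.model.layers[i].register_forward_hook(identity_hook)
            for i in layer_ids]

def interaction(model, probe_score, a, b):
    """probe_score(model) -> float is a user-supplied evaluation (hypothetical)."""
    base = probe_score(model)
    delta = {}
    for ids in ([a], [b], [a, b]):
        handles = ablate(model, ids)
        delta[tuple(ids)] = base - probe_score(model)   # damage from this ablation
        for h in handles:
            h.remove()
    # > 0: synergistic (joint damage exceeds the sum); < 0: redundant; ~0: independent
    return delta[(a, b)] - (delta[(a,)] + delta[(b,)])
```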
Strategy Options
| Strategy | Pairs Tested | Speed | Use Case |
|---|---|---|---|
| fast | Top-K pairs + adjacent | ~100 evals | Quick overview |
| thorough | + similarity-filtered | ~150 evals | Standard analysis |
| exhaustive | All L(L-1)/2 pairs | ~8000 evals | Complete picture |
neuro-scan circuit --model <path> --probe math --strategy fast --top-k-pairs 10
Output: circuit.json with all interaction results, synergistic pairs, and redundant pairs.
Tuned Lens
Standard logit lens applies the final layer's RMSNorm to intermediate hidden states, causing a systematic 4-5 bit bias in early layers. Tuned lens (Belrose et al. 2023) trains a per-layer affine probe to correct this.
Two-step Workflow
# Step 1: Train the tuned lens (~minutes, single GPU)
neuro-scan calibrate --model <path> --output lens.safetensors --steps 250
# Step 2: Use it with logit-lens or map
neuro-scan logit-lens --model <path> --tuned-lens lens.safetensors
How It Works
Each layer gets an affine translator A_l * h_l + b_l (initialized to identity + zero bias). Training minimizes KL divergence between the translated hidden state's logits and the final layer's logits using SGD with Nesterov momentum.
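A minimal sketch of that training objective, with illustrative dimensions and names (not neuro-scan's internals); the per-layer translators are trained against the frozen model's final-layer logits:

```python
# Tuned-lens training sketch: one affine map per layer, initialized to identity,
# trained to minimize KL between its projected logits and the final logits.
import torch
import torch.nn.functional as F

d_model, n_layers = 1536, 28    # e.g. a 1.5B model

translators = torch.nn.ModuleList(
    torch.nn.Linear(d_model, d_model) for _ in range(n_layers)
)
for t in translators:
    torch.nn.init.eye_(t.weight)        # identity weight
    torch.nn.init.zeros_(t.bias)        # zero bias

opt = torch.optim.SGD(translators.parameters(), lr=1e-3, momentum=0.9, nesterov=True)

def lens_loss(hidden_states, final_logits, model):
    """KL(final || lens), summed over layers, for one batch of hidden states."""
    loss = 0.0
    log_p = F.log_softmax(final_logits, dim=-1)
    for l, h in enumerate(hidden_states[:-1]):            # skip the final layer itself
        logits = model.lm_head(model.model.norm(translators[l](h)))
        log_q = F.log_softmax(logits, dim=-1)
        loss = loss + F.kl_div(log_q, log_p, log_target=True, reduction="batchmean")
    return loss
```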
File Size Reference
| Model Size | d_model | Layers | Lens File |
|---|---|---|---|
| 1.5B | 1536 | 28 | ~260 MB |
| 7B | 4096 | 32 | ~2 GB |
| 70B | 8192 | 80 | ~21 GB |
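These sizes are consistent with one d_model × d_model matrix plus a bias per layer, stored in float32 (the storage format here is an assumption, not something the docs state):

```python
# Back-of-envelope lens file size: layers * (d_model^2 + d_model) parameters,
# assumed stored as float32 (4 bytes each).
def lens_size_gb(d_model: int, n_layers: int, bytes_per_param: int = 4) -> float:
    return n_layers * (d_model * d_model + d_model) * bytes_per_param / 1e9

print(lens_size_gb(4096, 32))   # ~2.1, matching the 7B row above
```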
Block Influence
Block Influence (ShortGPT, ACL 2025) measures each layer's contribution in a single forward pass:
BI(layer) = 1 - cos_sim(input_hidden_state, output_hidden_state)
- High BI = layer significantly transforms the representation (critical layer)
- Low BI = layer barely changes anything (potentially redundant)
Block Influence is computed automatically during map and reported alongside ablation results. It serves as a fast O(1) proxy for the O(L) ablation scan.
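A minimal sketch of the BI computation from a single forward pass, assuming a Hugging Face causal LM that exposes output_hidden_states (illustrative, not neuro-scan's exact code):

```python
# Block Influence per layer: 1 - mean cosine similarity between the hidden
# states entering and leaving each decoder block, from one forward pass.
import torch
import torch.nn.functional as F

@torch.no_grad()
def block_influence(model, input_ids):
    hidden = model(input_ids, output_hidden_states=True).hidden_states
    scores = []
    for l in range(len(hidden) - 1):
        cos = F.cosine_similarity(hidden[l], hidden[l + 1], dim=-1).mean()
        scores.append(1.0 - cos.item())   # high BI = the block reshapes the stream a lot
    return scores
```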
Cross-probe Analysis
The cross-probe command runs ablation scans for multiple probes and identifies:
- Universal layers — important for all probes (appear in top-K for every probe)
- Probe-specific layers — important only for particular tasks
- Correlation matrix — how similar the layer importance profiles are between probes
neuro-scan cross-probe --model <path> --probes "math,eq,json" --top-k 10
Universal layers are strong candidates for duplication (they improve the model broadly), while probe-specific layers explain task-dependent behavior.
Output: cross_probe.json with per-probe ablation deltas, universal layers, and correlation matrix.
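A sketch of how universal and probe-specific layers can be derived from per-probe ablation deltas (function and field names here are illustrative, not the cross_probe.json schema):

```python
# Universal layers = intersection of each probe's top-K most ablation-sensitive
# layers; probe-specific layers = what remains in each probe's top-K.
import numpy as np

def split_universal(deltas: dict, top_k: int = 10):
    top = {probe: set(np.argsort(d)[::-1][:top_k]) for probe, d in deltas.items()}
    universal = set.intersection(*top.values())
    specific = {probe: layers - universal for probe, layers in top.items()}
    corr = np.corrcoef([deltas[p] for p in deltas])   # probe-vs-probe similarity
    return universal, specific, corr
```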
Multi-model Comparison
The compare command takes two or more report.json files and produces:
- Similarity matrix — how similar are the neuroanatomy profiles
- Shared reasoning layers — layers with the same function across models (normalized by position)
- Model rankings — which model has the highest mean ablation sensitivity
neuro-scan compare report_a.json report_b.json report_c.json
Output: comparison.json with all comparison metrics.
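One way such a similarity matrix can be computed when the models have different depths is to resample each ablation profile onto a common normalized-depth axis before correlating; this is an illustrative sketch, not necessarily the metric comparison.json uses:

```python
# Compare two ablation profiles of different lengths by interpolating both onto
# a shared normalized-depth grid and taking the Pearson correlation.
import numpy as np

def profile_similarity(deltas_a, deltas_b, points: int = 100) -> float:
    grid = np.linspace(0.0, 1.0, points)
    a = np.interp(grid, np.linspace(0.0, 1.0, len(deltas_a)), deltas_a)
    b = np.interp(grid, np.linspace(0.0, 1.0, len(deltas_b)), deltas_b)
    return float(np.corrcoef(a, b)[0, 1])
```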
Pre-computed Fetch
The fetch command downloads community-contributed neuroanatomy reports from HuggingFace Hub — no GPU required.
# Show pre-computed report summary
neuro-scan fetch --model Qwen/Qwen2-7B --probe math
# Download full report.json locally
neuro-scan fetch --model Qwen/Qwen2-7B --probe math --output report.json
Requires the lookup extra: pip install neuro-scan[lookup]
Results are sourced from the XXO47OXX/neuro-scan-results HuggingFace dataset.
Output Files
Running neuro-scan map generates:
| File | Content |
|---|---|
| ablation.html | Interactive ablation sensitivity bar chart |
| logit_lens.html | Logit lens trajectory heatmap |
| attention.html | Attention entropy heatmap |
| entropy_profile.html | Layer-by-layer entropy profile chart |
| report.json | Full results in JSON format |
| ablation.csv | Ablation results as CSV |
Additional output files from specific commands:
| File | Command | Content |
|---|---|---|
| circuit.json | circuit | Synergistic/redundant layer pair interactions |
| cross_probe.json | cross-probe | Per-probe ablation deltas and correlation matrix |
| comparison.json | compare | Multi-model neuroanatomy comparison |
| multi_probe.json | cross-probe | Cross-probe analysis results |
Auto Layer Labeling
neuro-scan automatically classifies each layer's function using a multi-signal algorithm:
| Label | Description | How Detected |
|---|---|---|
| early_processing | Input embedding, token processing | First ~10% of layers |
| syntax | Grammatical patterns, structure | Before logit lens emergence |
| reasoning | Task-critical computation | Top-k ablation sensitivity |
| semantic_processing | Knowledge retrieval, understanding | Middle layers (default) |
| formatting | Response structuring | After emergence, before output |
| output | Final token selection | Last ~10% of layers |
Labels are suggestions based on automated analysis. The algorithm combines:
- Position heuristics — layer position within the model
- Ablation sensitivity — which layers cause the most score drop when removed
- Logit lens emergence — when the correct answer token first appears
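A toy version of how these signals could combine, with thresholds and priority order chosen purely for illustration (neuro-scan's actual rules may differ):

```python
# Toy multi-signal labeler: position heuristics first, then ablation sensitivity,
# then logit-lens emergence; thresholds are illustrative.
def label_layer(idx, n_layers, ablation_rank, top_k, emergence_layer):
    pos = idx / (n_layers - 1)            # normalized depth in [0, 1]
    if pos < 0.10:
        return "early_processing"
    if pos > 0.90:
        return "output"
    if ablation_rank < top_k:
        return "reasoning"                # among the most ablation-sensitive layers
    if idx < emergence_layer and pos < 0.5:
        return "syntax"                   # structure-building before the answer emerges
    if idx > emergence_layer and pos > 0.5:
        return "formatting"               # response shaping after emergence
    return "semantic_processing"          # default for the remaining middle layers
```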
Probes
| Probe | Samples | What it tests |
|---|---|---|
| math | 16 | Arithmetic, geometry, calculus, probability |
| eq | 12 | Emotions, social cues, sarcasm, psychology |
| json | 10 | JSON extraction, escaping, schema compliance |
| custom | user-defined | Load from JSON file with --custom-probe |
Backends
| Backend | GPU Required | Quantization | Attention Extraction |
|---|---|---|---|
| transformers | Recommended | No | Full support |
| exllamav2 | Required | GPTQ/EXL2 | Not supported |
Prompt Repetition Experiment
The prompt-repeat command tests a hypothesis from Concept C:
Does repeating a prompt N times approximate the effect of duplicating K transformer layers?
neuro-scan prompt-repeat --model <path> --probe math --repeat-counts 1,2,3,4
If results show 2x repetition approximates +K layers, this has implications for both layer duplication research and prompt engineering.
layer-scan Integration
Use neuro-scan and layer-scan together for a complete workflow:
# Step 1: Understand what each layer does
neuro-scan map --model ./my-model --probe math
# Step 2: Find the optimal layer duplication config
layer-scan scan --model ./my-model --probe math --export-mergekit config.yaml
# Step 3: Annotate — overlay neuro-scan labels on layer-scan heatmap
layer-scan annotate --results results.json --neuro-report report.json
# The ablation chart from neuro-scan explains WHY certain
# layers are the best to duplicate (high reasoning sensitivity)
Roadmap
- v0.1.0: Core ablation, logit lens, attention entropy, auto labeling
- v0.2.0: Scoring diagnostics (coverage), block influence, entropy profile
- v0.2.1: Circuit detection, tuned lens, cross-probe, multi-model compare, cross-tool annotate
- v0.3.0: Pre-computed report database (fetch command), attribution patching
- v1.0.0: Web UI, real-time analysis dashboard
References
- Tuned Lens (Belrose et al. 2023) — Affine probes for interpretable logit lens
- ShortGPT (ACL 2025) — Block Influence metric for layer importance
- Entropy-Lens (arXiv 2025) — Entropy profile visualization
- Repeat Yourself: Layer Duplication — Original RYS research
- MergeKit — Model merging toolkit
Attribution & AI Policy
Original Design
neuro-scan introduces the following innovations:
- CLI-native neuroanatomy tool — first tool combining ablation + logit lens + attention in one CLI
- Automatic layer-function labeling — multi-signal classification of layer roles
- Circuit detection pipeline — three-phase synergistic/redundant layer pair detection
- Tuned lens native implementation — no external dependency, trains on any model
- Cross-probe universal layer identification — find layers that matter for all tasks
- Prompt repetition experiment — built-in hypothesis testing for prompt engineering research
- layer-scan ecosystem integration — understand before you duplicate
Fork & Derivative Works
If you fork or create derivative works, please:
- Retain the copyright notice and NOTICE file
- Attribute the original repository: https://github.com/XXO47OXX/neuro-scan
AI Training
See llms.txt for AI training attribution requirements.
License
MIT License. See LICENSE for details.