
neuro-scan

LLM Neuroanatomy Explorer — map what each transformer layer does



Ecosystem

| Tool | What it does | Question it answers |
|---|---|---|
| layer-scan | Find optimal layer duplication config | What to do — which layers to duplicate |
| neuro-scan | Map what each layer does | Why it works — understand layer functions |

layer-scan users are the natural first audience for neuro-scan: understand your model's layers before you duplicate them.

Ablation Sensitivity

neuro-scan ablation chart — Qwen2-1.5B / math probe

Layer ablation sensitivity for Qwen2-1.5B with math probe. Bars colored by auto-detected function (reasoning, syntax, etc.). Gold stars mark the most critical layers.

Logit Lens Trajectory

neuro-scan logit lens — Qwen2-1.5B / math probe

Logit lens heatmap showing when the correct answer token emerges across layers. Red diamonds mark the emergence point for each sample.

Features

  • Layer Ablation — zero out each layer one by one and measure the score impact (see the sketch after this list)
  • Logit Lens — project each layer's hidden state to vocabulary space, watch the answer emerge
  • Tuned Lens — per-layer affine probes that reduce early-layer bias by 4-5 bits (Belrose 2023)
  • Attention Entropy — quantify how focused or diffuse each attention head is
  • Circuit Detection — find synergistic and redundant layer pairs via targeted pairwise ablation
  • Block Influence — one forward pass to estimate all layers' importance (ShortGPT BI metric)
  • Cross-probe Analysis — identify universal vs probe-specific important layers
  • Multi-model Comparison — compare neuroanatomy across different models
  • Auto Layer Labeling — automatically classify layers as early_processing, syntax, reasoning, formatting, or output
  • Prompt Repetition Experiment — test whether repeating a prompt N times approximates duplicating K layers
  • Interactive HTML Charts — Plotly-powered visualizations for all analysis types
  • Pre-computed Fetch — download community neuroanatomy reports from HuggingFace Hub (no GPU needed)
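
The core ablation measurement is easy to sketch. The snippet below is a minimal illustration rather than neuro-scan's actual implementation: it "ablates" one decoder block of a Llama/Qwen-style HuggingFace model by making it pass the residual stream through unchanged, then re-scores a probe. `score_probe` is a hypothetical scoring helper, not part of any library.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def ablate_layer(model, layer_idx):
    """Return a hook handle that turns one decoder block into an identity,
    so it contributes nothing beyond the residual stream."""
    block = model.model.layers[layer_idx]              # decoder block list on Llama/Qwen-style models

    def skip_block(module, inputs, output):
        hidden_states = inputs[0]                      # the block's input hidden states
        if isinstance(output, tuple):                  # keep any extra outputs (attn weights, cache)
            return (hidden_states,) + output[1:]
        return hidden_states

    return block.register_forward_hook(skip_block)

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-1.5B", torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-1.5B")

baseline = score_probe(model, tokenizer)               # hypothetical: run the math probe, return a score
for i in range(model.config.num_hidden_layers):
    handle = ablate_layer(model, i)
    delta = baseline - score_probe(model, tokenizer)   # score drop caused by removing layer i
    handle.remove()
    print(f"layer {i:2d}: ablation delta = {delta:+.3f}")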

Installation

# pipx (recommended, isolated env)
pipx install neuro-scan

# pip
pip install neuro-scan

# For pre-computed report fetch (no GPU required):
pip install neuro-scan[lookup]

Quick Start

# Full neuroanatomy map (recommended)
neuro-scan map --model <path-or-hf-id> --probe math

# Individual analyses
neuro-scan ablate --model <path> --probe math
neuro-scan logit-lens --model <path> --probe math
neuro-scan attention --model <path> --probe math

# Circuit detection
neuro-scan circuit --model <path> --probe math --strategy fast

# Cross-probe analysis
neuro-scan cross-probe --model <path> --probes "math,eq,json"

# Multi-model comparison
neuro-scan compare report_a.json report_b.json

# Tuned Lens (two-step workflow)
neuro-scan calibrate --model <path> --output lens.safetensors
neuro-scan logit-lens --model <path> --tuned-lens lens.safetensors

# Fetch pre-computed results (no GPU needed)
neuro-scan fetch --model Qwen/Qwen2-7B --probe math

# Prompt repetition experiment
neuro-scan prompt-repeat --model <path> --probe math --repeat-counts 1,2,3,4

# Utilities
neuro-scan probes
neuro-scan version

Commands

| Command | Description |
|---|---|
| map | Full neuroanatomy (ablation + logit lens + attention + labeling) |
| ablate | Layer ablation sensitivity scan |
| logit-lens | Logit lens trajectory, optional tuned lens |
| attention | Attention entropy analysis (experimental) |
| circuit | Detect synergistic/redundant layer pair circuits |
| cross-probe | Compare layer importance across multiple probes |
| compare | Compare neuroanatomy across multiple models |
| calibrate | Train tuned lens affine probes |
| fetch | Download pre-computed reports from HuggingFace Hub |
| prompt-repeat | Prompt repetition experiment |
| probes | List available evaluation probes |
| version | Show version |

Common Options

| Option | Type | Default | Description |
|---|---|---|---|
| --model, -m | str | required | Model path or HuggingFace ID |
| --probe, -p | str | math | Probe: math, eq, json, custom |
| --backend, -b | str | transformers | Backend: transformers, exllamav2 |
| --batch-size | int | 16 | Samples per evaluation |
| --output, -o | str | ./results | Output directory |
| --top-k, -k | int | 10 | Top layers to highlight |
| --dtype | str | float16 | Model dtype |
| --verbose, -v | bool | false | Verbose logging |

Circuit Detection

The circuit command goes beyond single-layer ablation to find interacting layer pairs — layers that cooperate (synergistic) or overlap (redundant).

Three-phase Pipeline

  1. Phase A — Candidate Selection: Uses single-layer ablation results to identify the top-K most sensitive layers
  2. Phase B — Similarity Filtering (thorough mode): Computes cosine similarity between layer representations to identify structurally related pairs
  3. Phase C — Pairwise Ablation: Tests candidate pairs by ablating both layers simultaneously

Interaction Types

  • Synergistic (interaction > 0): ablating both layers together causes more damage than the sum of the individual ablations — these layers cooperate
  • Redundant (interaction < 0): ablating both causes less damage than expected — these layers have overlapping function
  • Independent (interaction ≈ 0): the layers function independently (the interaction score itself is sketched below)
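
As a sketch of the arithmetic only (the exact scoring inside neuro-scan may differ), the interaction score can be taken as the extra damage of the pairwise ablation beyond what the two single-layer ablations predict; the delta values below are made up.

def interaction_score(delta_a, delta_b, delta_pair):
    """Positive -> synergistic, negative -> redundant, near zero -> independent."""
    return delta_pair - (delta_a + delta_b)

# hypothetical score drops from single-layer and pairwise ablation
single = {12: 0.18, 17: 0.22}
pair_drop = 0.55                             # drop when layers 12 and 17 are ablated together
print(f"interaction(12, 17) = {interaction_score(single[12], single[17], pair_drop):+.2f}")   # +0.15 -> synergistic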

Strategy Options

| Strategy | Pairs Tested | Speed | Use Case |
|---|---|---|---|
| fast | Top-K pairs + adjacent | ~100 evals | Quick overview |
| thorough | fast pairs + similarity-filtered | ~150 evals | Standard analysis |
| exhaustive | All L(L-1)/2 pairs | ~8000 evals | Complete picture |

neuro-scan circuit --model <path> --probe math --strategy fast --top-k-pairs 10

Output: circuit.json with all interaction results, synergistic pairs, and redundant pairs.

Tuned Lens

Standard logit lens applies the final layer's RMSNorm to intermediate hidden states, causing a systematic 4-5 bit bias in early layers. Tuned lens (Belrose et al. 2023) trains a per-layer affine probe to correct this.

Two-step Workflow

# Step 1: Train the tuned lens (~minutes, single GPU)
neuro-scan calibrate --model <path> --output lens.safetensors --steps 250

# Step 2: Use it with logit-lens or map
neuro-scan logit-lens --model <path> --tuned-lens lens.safetensors

How It Works

Each layer gets an affine translator A_l * h_l + b_l (initialized to identity + zero bias). Training minimizes KL divergence between the translated hidden state's logits and the final layer's logits using SGD with Nesterov momentum.
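
A minimal sketch of that training step under the assumptions stated above (one affine translator per layer initialized to identity, KL divergence against the final-layer logits, SGD with Nesterov momentum). It is illustrative rather than the calibrate command's actual code; `hidden_states`, `final_logits`, `unembed`, and `final_norm` are assumed to come from a cached forward pass and from the model's output head and final normalization.

import torch
import torch.nn.functional as F

d_model, n_layers = 1536, 28                        # Qwen2-1.5B-sized example

# one affine translator per layer: A_l @ h_l + b_l, initialized to identity weight and zero bias
translators = [torch.nn.Linear(d_model, d_model) for _ in range(n_layers)]
for t in translators:
    torch.nn.init.eye_(t.weight)
    torch.nn.init.zeros_(t.bias)

params = [p for t in translators for p in t.parameters()]
opt = torch.optim.SGD(params, lr=1.0, momentum=0.9, nesterov=True)   # hyperparameters are illustrative

def train_step(hidden_states, final_logits, unembed, final_norm):
    """hidden_states: list of per-layer activations [batch, seq, d_model];
    final_logits: the model's real last-layer logits, used as the KL target."""
    opt.zero_grad()
    target = F.log_softmax(final_logits, dim=-1)
    loss = 0.0
    for layer, t in enumerate(translators):
        translated = t(hidden_states[layer])
        logits = unembed(final_norm(translated))    # project the corrected state to vocabulary space
        loss = loss + F.kl_div(F.log_softmax(logits, dim=-1), target,
                               log_target=True, reduction="batchmean")
    loss.backward()
    opt.step()
    return loss.item()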

File Size Reference

| Model Size | d_model | Layers | Lens File |
|---|---|---|---|
| 1.5B | 1536 | 28 | ~260 MB |
| 7B | 4096 | 32 | ~2 GB |
| 70B | 8192 | 80 | ~21 GB |
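
These numbers are consistent with storing one full-precision d_model x d_model matrix plus a bias vector per layer; the quick estimate below is an assumption about the on-disk format, not a documented guarantee.

# rough lens-file size, assuming fp32 storage: layers * (d_model^2 + d_model) * 4 bytes
def lens_size_mb(n_layers, d_model, bytes_per_param=4):
    return n_layers * (d_model ** 2 + d_model) * bytes_per_param / 1e6

print(lens_size_mb(28, 1536))   # ~264 MB   (table: ~260 MB)
print(lens_size_mb(32, 4096))   # ~2148 MB  (~2.1 GB)
print(lens_size_mb(80, 8192))   # ~21477 MB (~21.5 GB)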

Block Influence

Block Influence (ShortGPT, ACL 2025) measures each layer's contribution in a single forward pass:

BI(layer) = 1 - cos_sim(input_hidden_state, output_hidden_state)

  • High BI = the layer significantly transforms the representation (critical layer)
  • Low BI = the layer barely changes anything (potentially redundant)

Block Influence is computed automatically during map and reported alongside ablation results. It serves as a fast O(1) proxy for the O(L) ablation scan.
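
For reference, BI is straightforward to reproduce from a HuggingFace model's hidden states. The sketch below assumes output_hidden_states returns the residual stream before and after each block; it is not neuro-scan's code.

import torch
import torch.nn.functional as F

@torch.no_grad()
def block_influence(model, tokenizer, text):
    """BI(layer) = 1 - cos_sim(layer input, layer output), averaged over token positions."""
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    hidden = model(**inputs, output_hidden_states=True).hidden_states   # n_layers + 1 tensors
    bi = []
    for layer in range(len(hidden) - 1):
        cos = F.cosine_similarity(hidden[layer], hidden[layer + 1], dim=-1)   # per-token similarity
        bi.append(1.0 - cos.mean().item())
    return bi                                    # high values mark layers that transform the most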

Cross-probe Analysis

The cross-probe command runs ablation scans for multiple probes and identifies:

  • Universal layers — important for all probes (appear in top-K for every probe)
  • Probe-specific layers — important only for particular tasks
  • Correlation matrix — how similar the layer importance profiles are between probes

neuro-scan cross-probe --model <path> --probes "math,eq,json" --top-k 10

Universal layers are strong candidates for duplication (they improve the model broadly), while probe-specific layers explain task-dependent behavior.

Output: cross_probe.json with per-probe ablation deltas, universal layers, and correlation matrix.
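
A sketch of how universal and probe-specific layers could be derived from per-probe ablation results; the dictionary layout below is hypothetical, not the actual cross_probe.json schema.

# hypothetical per-probe ablation deltas: probe -> {layer index: score drop}
deltas = {
    "math": {21: 0.31, 14: 0.25, 7: 0.12, 3: 0.02},
    "eq":   {21: 0.22, 9: 0.19, 14: 0.15, 3: 0.04},
    "json": {27: 0.28, 21: 0.20, 14: 0.18, 5: 0.06},
}

def top_k(layer_deltas, k):
    return {layer for layer, _ in sorted(layer_deltas.items(), key=lambda kv: -kv[1])[:k]}

top_sets = {probe: top_k(d, k=3) for probe, d in deltas.items()}
universal = set.intersection(*top_sets.values())              # in the top-k for every probe
specific = {p: s - universal for p, s in top_sets.items()}    # top-k layers that are not universal

print("universal layers:", sorted(universal))                 # [14, 21] in this toy example
print("probe-specific:", specific)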

Multi-model Comparison

The compare command takes two or more report.json files and produces:

  • Similarity matrix — how similar are the neuroanatomy profiles
  • Shared reasoning layers — layers with the same function across models (normalized by position)
  • Model rankings — which model has the highest mean ablation sensitivity

neuro-scan compare report_a.json report_b.json report_c.json

Output: comparison.json with all comparison metrics.

Pre-computed Fetch

The fetch command downloads community-contributed neuroanatomy reports from HuggingFace Hub — no GPU required.

# Show pre-computed report summary
neuro-scan fetch --model Qwen/Qwen2-7B --probe math

# Download full report.json locally
neuro-scan fetch --model Qwen/Qwen2-7B --probe math --output report.json

Requires the lookup extra: pip install neuro-scan[lookup]

Results are sourced from the XXO47OXX/neuro-scan-results HuggingFace dataset.

Output Files

Running neuro-scan map generates:

| File | Content |
|---|---|
| ablation.html | Interactive ablation sensitivity bar chart |
| logit_lens.html | Logit lens trajectory heatmap |
| attention.html | Attention entropy heatmap |
| entropy_profile.html | Layer-by-layer entropy profile chart |
| report.json | Full results in JSON format |
| ablation.csv | Ablation results as CSV |

Additional output files from specific commands:

| File | Command | Content |
|---|---|---|
| circuit.json | circuit | Synergistic/redundant layer pair interactions |
| cross_probe.json | cross-probe | Per-probe ablation deltas and correlation matrix |
| comparison.json | compare | Multi-model neuroanatomy comparison |
| multi_probe.json | cross-probe | Cross-probe analysis results |

Auto Layer Labeling

neuro-scan automatically classifies each layer's function using a multi-signal algorithm:

| Label | Description | How Detected |
|---|---|---|
| early_processing | Input embedding, token processing | First ~10% of layers |
| syntax | Grammatical patterns, structure | Before logit lens emergence |
| reasoning | Task-critical computation | Top-k ablation sensitivity |
| semantic_processing | Knowledge retrieval, understanding | Middle layers (default) |
| formatting | Response structuring | After emergence, before output |
| output | Final token selection | Last ~10% of layers |

Labels are suggestions based on automated analysis. The algorithm combines three signals (a simplified combination is sketched after this list):

  1. Position heuristics — layer position within the model
  2. Ablation sensitivity — which layers cause the most score drop when removed
  3. Logit lens emergence — when the correct answer token first appears
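
The sketch below shows one much-simplified way those signals could be combined; the thresholds and priority order are illustrative assumptions, not the actual labeling algorithm.

def label_layer(idx, n_layers, top_ablation, emergence_layer):
    """Combine position, ablation sensitivity, and logit-lens emergence into one label.
    top_ablation: the most ablation-sensitive layers; emergence_layer: first layer where
    the correct answer token becomes the top prediction under the logit lens."""
    position = idx / n_layers
    if position < 0.10:
        return "early_processing"
    if position > 0.90:
        return "output"
    if idx in top_ablation:
        return "reasoning"
    if idx >= emergence_layer:
        return "formatting"
    if position < 0.35:                  # illustrative cutoff for grammar-heavy early-middle layers
        return "syntax"
    return "semantic_processing"

labels = [label_layer(i, 28, top_ablation={14, 17, 21}, emergence_layer=19) for i in range(28)]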

Probes

| Probe | Samples | What it tests |
|---|---|---|
| math | 16 | Arithmetic, geometry, calculus, probability |
| eq | 12 | Emotions, social cues, sarcasm, psychology |
| json | 10 | JSON extraction, escaping, schema compliance |
| custom | user-defined | Load from JSON file with --custom-probe |

Backends

| Backend | GPU Required | Quantization | Attention Extraction |
|---|---|---|---|
| transformers | Recommended | No | Full support |
| exllamav2 | Required | GPTQ/EXL2 | Not supported |

Prompt Repetition Experiment

The prompt-repeat command tests a hypothesis from Concept C:

Does repeating a prompt N times approximate the effect of duplicating K transformer layers?

neuro-scan prompt-repeat --model <path> --probe math --repeat-counts 1,2,3,4

If results show 2x repetition approximates +K layers, this has implications for both layer duplication research and prompt engineering.

layer-scan Integration

Use neuro-scan and layer-scan together for a complete workflow:

# Step 1: Understand what each layer does
neuro-scan map --model ./my-model --probe math

# Step 2: Find the optimal layer duplication config
layer-scan scan --model ./my-model --probe math --export-mergekit config.yaml

# Step 3: Annotate — overlay neuro-scan labels on layer-scan heatmap
layer-scan annotate --results results.json --neuro-report report.json

# The ablation chart from neuro-scan explains WHY certain
# layers are the best to duplicate (high reasoning sensitivity)

Roadmap

  • v0.1.0: Core ablation, logit lens, attention entropy, auto labeling
  • v0.2.0: Scoring diagnostics (coverage), block influence, entropy profile
  • v0.2.1: Circuit detection, tuned lens, cross-probe, multi-model compare, cross-tool annotate
  • v0.3.0: Pre-computed report database (fetch command), attribution patching
  • v1.0.0: Web UI, real-time analysis dashboard

References

  • Belrose et al. (2023). Eliciting Latent Predictions from Transformers with the Tuned Lens. arXiv:2303.08112.
  • Men et al. (2024). ShortGPT: Layers in Large Language Models Are More Redundant Than You Expect. arXiv:2403.03853.

Attribution & AI Policy

Original Design

neuro-scan introduces the following innovations:

  • CLI-native neuroanatomy tool — first tool combining ablation + logit lens + attention in one CLI
  • Automatic layer-function labeling — multi-signal classification of layer roles
  • Circuit detection pipeline — three-phase synergistic/redundant layer pair detection
  • Tuned lens native implementation — no external dependency, trains on any model
  • Cross-probe universal layer identification — find layers that matter for all tasks
  • Prompt repetition experiment — built-in hypothesis testing for prompt engineering research
  • layer-scan ecosystem integration — understand before you duplicate

Fork & Derivative Works

If you fork or create derivative works, please:

  1. Retain the copyright notice and NOTICE file
  2. Attribute the original repository: https://github.com/XXO47OXX/neuro-scan

AI Training

See llms.txt for AI training attribution requirements.

License

MIT License. See LICENSE for details.
