# layer-scan

Automated LLM layer duplication config scanner — find the optimal (i, j) for any model + task.
Given an open-source LLM and an evaluation probe, layer-scan finds the layer duplication config that maximizes model capability without modifying weights.
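"Layer duplication" here means running a contiguous block of decoder layers twice at inference time; the repeated entries share the original weights, so nothing is modified or copied. A minimal sketch of the idea (the helper name and the plain-list representation are illustrative, not layer-scan's API):

```python
def duplicate_block(layers, i, j):
    """Return a layer sequence in which the block layers[i:j] runs twice.

    The duplicated entries are the *same* objects as the originals,
    so no weights are modified or duplicated in memory.
    """
    if not (0 <= i < j <= len(layers)):
        raise ValueError("need 0 <= i < j <= len(layers)")
    return list(layers[:j]) + list(layers[i:j]) + list(layers[j:])

# With 6 layers and (i, j) = (2, 4), layers 2-3 execute twice:
print(duplicate_block([0, 1, 2, 3, 4, 5], 2, 4))  # [0, 1, 2, 3, 2, 3, 4, 5]
```

With a Hugging Face transformers model, the same trick would apply to the model's decoder-layer `ModuleList`.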
## Features

- **Full (i, j) scanning** — automated search across all valid configs
- **Logit distribution scoring** — deterministic; no text generation needed
- **Multi-probe analysis** — Pareto-optimal configs across tasks
- **Sparse-then-dense** — two-phase scanning for large models
- **mergekit export** — one-click YAML for mergekit passthrough
- **Cross-tool annotation** — overlay neuro-scan labels
- **Interactive heatmaps** — Plotly with hover details
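Logit-distribution scoring can stay fully deterministic: run each probe prompt through the candidate config once and measure how much probability mass lands on the known answer token, with no sampling at all. A minimal sketch, assuming a mean log-probability metric (layer-scan's actual scoring function is not shown here and may differ):

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def score_config(final_logits, answer_ids):
    """Mean log-probability assigned to each probe's correct next token.

    One forward pass per prompt, no text generation, and the same
    inputs always produce the same score.
    """
    total = 0.0
    for logits, answer in zip(final_logits, answer_ids):
        total += math.log(softmax(logits)[answer])
    return total / len(answer_ids)
```

A config that concentrates more probability on the correct tokens scores higher, which makes candidate (i, j) configs directly comparable.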
## Install

```bash
pipx install layer-scan
# or
pip install layer-scan

# ExLlamaV2 backend for 70B+ models:
pip install "layer-scan[exllamav2]"
```
## Quick Start

```bash
# Scan with the math probe
layer-scan scan --model Qwen/Qwen2-7B --probe math

# Scan and export a mergekit config
layer-scan scan --model Qwen/Qwen2-7B --probe math --export-mergekit config.yaml

# Multi-probe Pareto analysis
layer-scan multi-probe --model Qwen/Qwen2-7B --probes "math,eq,json"

# Annotate with neuro-scan labels
layer-scan annotate --results results.json --neuro-report report.json

# ExLlamaV2 for large models
layer-scan scan --model /models/qwen2-72b-exl2 --backend exllamav2 --gpu-split "22000,22000"

# Then merge
mergekit-yaml config.yaml ./merged-model --copy-tokenizer
```
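For reference, a mergekit `passthrough` config that duplicates a block of layers looks roughly like the following. The layer indices and model below are illustrative; `--export-mergekit` writes the scanned optimum:

```yaml
# Illustrative: duplicate layers 12-19 of a 28-layer model
slices:
  - sources:
      - model: Qwen/Qwen2-7B
        layer_range: [0, 20]
  - sources:
      - model: Qwen/Qwen2-7B
        layer_range: [12, 20]   # the duplicated (i, j) block
  - sources:
      - model: Qwen/Qwen2-7B
        layer_range: [20, 28]
merge_method: passthrough
dtype: bfloat16
```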
## Commands

| Command | Description |
|---|---|
| `scan` | Scan (i, j) configs with a single probe |
| `multi-probe` | Cross-probe scan; reports Pareto-optimal configs |
| `annotate` | Overlay neuro-scan labels on a heatmap |
| `lookup` | Fetch pre-computed results from the HF Hub |
| `probes` | List available probes |
## Key Options

| Option | Default | Description |
|---|---|---|
| `--model`, `-m` | required | Model path or HF ID |
| `--probe`, `-p` | `math` | Probe: `math`, `eq`, `json`, `custom` |
| `--backend`, `-b` | `transformers` | `transformers` or `exllamav2` |
| `--min-block` | `7` | Minimum duplicated block size |
| `--top-k`, `-k` | `5` | Number of top configs to report |
| `--sparse-first` | off | Sparse scan, then dense refinement |
| `--export-mergekit` | — | Export best config as mergekit YAML |
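The two-phase idea behind `--sparse-first` can be sketched as a coarse stride over (i, j) followed by a dense scan in a small window around the best coarse hit. Function and parameter names below are illustrative, not layer-scan internals:

```python
def sparse_then_dense(score, n_layers, min_block=7, stride=4, radius=2):
    """Two-phase (i, j) search: coarse grid first, then dense refinement
    in a window around the best coarse hit. `score` maps (i, j) -> float."""
    # Phase 1: coarse grid over valid configs (j - i >= min_block)
    coarse = [(i, j)
              for i in range(0, n_layers, stride)
              for j in range(i + min_block, n_layers + 1, stride)]
    bi, bj = max(coarse, key=lambda ij: score(*ij))
    # Phase 2: dense scan around the coarse optimum, still honoring min_block
    dense = [(i, j)
             for i in range(max(0, bi - radius), bi + radius + 1)
             for j in range(max(i + min_block, bj - radius),
                            min(n_layers, bj + radius) + 1)]
    return max(dense, key=lambda ij: score(*ij))
```

For a model with L layers this evaluates roughly (L/stride)² configs plus a small fixed window, instead of all O(L²) valid pairs, which is what makes full scans of large models tractable.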
## Probes

| Probe | Samples | Tests |
|---|---|---|
| `math` | 16 | Arithmetic, geometry, calculus |
| `eq` | 12 | Emotions, social cues |
| `json` | 10 | JSON extraction, schema |
| `custom` | variable | Loaded from a JSON file |
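The schema for custom probe files is not documented here; as a purely hypothetical illustration, a probe file pairing prompts with expected continuations might look like:

```json
{
  "name": "my-probe",
  "samples": [
    {"prompt": "2 + 2 =", "expected": " 4"},
    {"prompt": "The capital of France is", "expected": " Paris"}
  ]
}
```

Check `layer-scan probes` and the project source for the actual format before writing one.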
## Backends

| Backend | Best for | Multi-GPU |
|---|---|---|
| `transformers` | Small to medium models | — |
| `exllamav2` | 70B+ quantized models | `--gpu-split` |
## License

MIT
## File details

Details for the file `layer_scan-0.2.2.tar.gz`.

### File metadata

- Download URL: layer_scan-0.2.2.tar.gz
- Upload date:
- Size: 114.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.13

### File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `e5c12c6a8fb42370a1d880dc181dc1bf949307072f10224798af72e11371346a` |
| MD5 | `17beef24d7d03a1aba220a53592a6191` |
| BLAKE2b-256 | `79c68bde8e8a5cb620737f17f67cca645fa46951a1a2a1206dee809f08d5b810` |
## File details

Details for the file `layer_scan-0.2.2-py3-none-any.whl`.

### File metadata

- Download URL: layer_scan-0.2.2-py3-none-any.whl
- Upload date:
- Size: 46.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.13

### File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `a2bfc48d6c645dfc4f14ca11862af3594fa0d16c82ad004c5cacca2304741b19` |
| MD5 | `6503c4fb3d153026ac21e5a83c7b2ce9` |
| BLAKE2b-256 | `4e8f937a71abe2c46d5ddc005b1bc851f459e4ecf7bf5f889d71c3364c2bd34d` |