Pre-merge diagnostic framework for LLM model merging — analyze, diagnose, and optimize before you merge.

MergeLens

Pre-merge diagnostics for LLM model merging


34% of top Open LLM Leaderboard models are merges, yet merging is still blind trial and error. MergeLens tells you before you merge whether the merge is likely to work and which method to use.

Features

  • Single compatibility score — Merge Compatibility Index (MCI): 0-100, go/no-go verdict
  • 10 diagnostic metrics — cosine similarity, spectral overlap, sign disagreement, TSV interference, CKA, and more
  • Strategy recommender — optimal merge method + ready-to-paste MergeKit YAML
  • Conflict zone detection — pinpoints problematic layers
  • Interactive HTML reports — self-contained Plotly dashboards
  • MCP server — AI assistants can diagnose merges natively
  • Memory efficient — lazy safetensors loading, peak memory = 2× largest layer

Install

pip install mergelens

Optional extras:

pip install mergelens[report]  # Interactive HTML report generation
pip install mergelens[mcp]    # MCP server for AI assistants
pip install mergelens[audit]  # Capability probing (requires transformers)
pip install mergelens[all]    # Everything

Quick Start

CLI

Compare two models (local paths or Hugging Face Hub IDs):

mergelens compare model_a/ model_b/
mergelens compare meta-llama/Llama-3-8B mistralai/Mistral-7B-v0.1

Add a base model for task vector metrics:

mergelens compare model_a/ model_b/ --base base_model/

Generate an interactive HTML report:

mergelens compare model_a/ model_b/ --report report.html
mergelens compare model_a/ model_b/ --base base_model/ --report report.html

The HTML report is a single self-contained file with embedded Plotly charts — no server required.

Report contents:

  • MCI Gauge — score, verdict, confidence interval
  • MCI Components — per-metric breakdown table
  • Weight Similarity Heatmap — layer × model-pair cosine similarity grid
  • Spectral Analysis Dashboard — spectral overlap, rank ratio, task vector energy, and sign disagreement across layers
  • Layer Divergence Chart — L2 distance (bars) and sign disagreement rate (line) on dual axes
  • Conflict Zone Analysis — bar chart + table with severity, layer range, and recommendation per zone
  • Layer Metrics Table — raw values for all metrics per layer, scrollable
  • Strategy Recommendation — method, confidence, reasoning, and copy-paste MergeKit YAML

Diagnose a MergeKit config before running it:

mergelens diagnose merge.yaml

Python API

from mergelens import compare_models

result = compare_models(["model_a/", "model_b/"])

print(f"MCI: {result.mci.score} - {result.mci.verdict}")
# MCI: 72.3 - compatible

Inspect conflicts and get a strategy recommendation:

for zone in result.conflict_zones:
    print(f"Layers {zone.start_layer}-{zone.end_layer}: {zone.severity.value}")

if result.strategy:
    print(f"Recommended: {result.strategy.method.value}")
    print(result.strategy.mergekit_yaml)  # copy-paste into MergeKit

Diagnose a MergeKit config:

from mergelens import diagnose_config

result = diagnose_config("merge.yaml")
print(f"Overall interference: {result.overall_interference:.4f}")

Metrics

Metric                    | What It Measures                     | Range     | Source
--------------------------|--------------------------------------|-----------|---------------------------------
Cosine Similarity         | Weight vector alignment              | [-1, 1]   | Standard
L2 Distance               | Normalized weight divergence         | [0, +inf) | Standard
KL Divergence             | Weight distribution difference       | [0, +inf) | Standard
Spectral Subspace Overlap | Top-k SVD direction alignment        | [0, 1]    | Zhou et al. 2026
Effective Rank Ratio      | Dimensionality compatibility         | [0, 1]    | Shannon entropy
Sign Disagreement Rate    | Parameter sign conflicts             | [0, 1]    | TIES-Merging (Yadav et al. 2023)
TSV Interference          | Cross-task singular vector conflict  | [0, +inf) | Gargiulo et al. 2025
Task Vector Energy        | Knowledge concentration in top SVs   | [0, 1]    | Choi et al. 2024
CKA Similarity            | Activation representation similarity | [0, 1]    | Kornblith et al. 2019
Merge Compatibility Index | Composite go/no-go score             | [0, 100]  | Ours
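
To make the simplest rows of the table concrete, here is a minimal sketch of cosine similarity, a normalized L2 distance, and sign disagreement rate on flattened weights. It uses only the standard library and textbook definitions. The function names, the division by n in the L2 distance, and the handling of zero entries are assumptions for illustration, not MergeLens's actual implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two flattened weight vectors, in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: an all-zero vector is treated as orthogonal
    return dot / (norm_a * norm_b)


def l2_distance(a, b):
    """Root-mean-square difference; dividing by n is an assumed normalization."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def sign_disagreement_rate(delta_a, delta_b):
    """Fraction of positions where two task vectors have opposite signs.

    Positions where either entry is zero are skipped (an assumption).
    """
    pairs = [(x, y) for x, y in zip(delta_a, delta_b) if x != 0 and y != 0]
    if not pairs:
        return 0.0
    conflicts = sum(1 for x, y in pairs if (x > 0) != (y > 0))
    return conflicts / len(pairs)


# Toy task vectors (fine-tuned weights minus base weights); values are made up.
delta_a = [0.5, -0.2, 0.1, 0.4]
delta_b = [0.4, 0.3, 0.1, -0.4]

print(f"cosine: {cosine_similarity(delta_a, delta_b):.3f}")
print(f"L2 (normalized): {l2_distance(delta_a, delta_b):.3f}")
print(f"sign disagreement: {sign_disagreement_rate(delta_a, delta_b):.2f}")
```

In practice these would run on tensors rather than Python lists; the list version is only meant to show what each number measures.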

MCI Verdicts

Score  | Verdict           | Meaning
-------|-------------------|-----------------------------------------
75-100 | Highly Compatible | Merge with confidence
55-74  | Compatible        | Should work, monitor quality
35-54  | Risky             | Expect degradation, use targeted methods
0-34   | Incompatible      | These models likely shouldn't be merged
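
The verdict bands can be read as a simple threshold lookup. The sketch below is one way to express it; the function name `mci_verdict` is hypothetical, and the treatment of fractional scores between bands (for example 74.5 falling into the lower band) is an assumption, since the table only lists integer ranges.

```python
def mci_verdict(score: float) -> str:
    """Map an MCI score in [0, 100] to the verdict labels from the table."""
    if not 0 <= score <= 100:
        raise ValueError(f"MCI score must be in [0, 100], got {score}")
    if score >= 75:
        return "Highly Compatible"
    if score >= 55:
        return "Compatible"
    if score >= 35:
        return "Risky"
    return "Incompatible"


for score in (88.0, 72.3, 40.0, 12.5):
    print(score, mci_verdict(score))
```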

Strategy Recommendations

MergeLens maps diagnostic profiles to merge methods. Different metrics predict success for different methods (Zhou et al. 2026 found only 46.7% metric overlap between methods):

Diagnostic Profile                | Recommended Method
----------------------------------|---------------------
High cosine similarity everywhere | SLERP
High sign disagreement (>30%)     | TIES
Concentrated task vector energy   | DARE
Low spectral overlap              | Linear (small alpha)

Each recommendation includes a ready-to-paste MergeKit YAML config.
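
As a rough illustration of how a profile could map to a method, the sketch below encodes the table as ordered rules. Only the 30% sign disagreement cutoff comes from the table; the other thresholds, the rule order, the `Profile` fields, and the `recommend_method` name are hypothetical and do not describe MergeLens's actual recommender.

```python
from dataclasses import dataclass
from typing import Optional

SIGN_DISAGREEMENT_HIGH = 0.30  # from the table above
# The remaining cutoffs are hypothetical placeholders, not MergeLens values.
SPECTRAL_OVERLAP_LOW = 0.30
TASK_ENERGY_CONCENTRATED = 0.80
COSINE_HIGH = 0.95


@dataclass
class Profile:
    min_cosine: float          # lowest per-layer cosine similarity
    sign_disagreement: float   # sign disagreement rate across layers
    task_vector_energy: float  # share of energy in the top singular values
    spectral_overlap: float    # spectral subspace overlap


def recommend_method(p: Profile) -> Optional[str]:
    """Return the first matching method label, or None if no rule applies."""
    if p.sign_disagreement > SIGN_DISAGREEMENT_HIGH:
        return "TIES"
    if p.spectral_overlap < SPECTRAL_OVERLAP_LOW:
        return "Linear (small alpha)"
    if p.task_vector_energy > TASK_ENERGY_CONCENTRATED:
        return "DARE"
    if p.min_cosine > COSINE_HIGH:
        return "SLERP"
    return None


profile = Profile(min_cosine=0.80, sign_disagreement=0.42,
                  task_vector_energy=0.50, spectral_overlap=0.70)
print(recommend_method(profile))
```

Checking the conflict-oriented rules before SLERP is a design choice in this sketch: a profile that signals interference should not fall through to a method that assumes the models are already close.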

MCP Integration

{
  "mcpServers": {
    "mergelens": {
      "command": "mergelens",
      "args": ["serve"]
    }
  }
}

Tools: compare_models, diagnose_merge, get_conflict_zones, suggest_strategy, generate_report, explain_layer, get_compatibility_score, audit_model

How It Works

MergeLens loads model weights lazily via memory-mapped safetensors (peak memory: 2× largest layer, not 2× full model). It computes metrics layer-by-layer, detects conflict zones, and aggregates everything into the MCI score.
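
The layer-by-layer flow described above can be sketched as a generator that holds only one pair of tensors at a time, followed by a pass that groups consecutive high-conflict layers into zones. Everything here is illustrative: the function names, the loader callables, the threshold, and the in-memory dicts standing in for checkpoints are assumptions. With real safetensors files, a loader could wrap `safetensors.safe_open(path, framework="pt").get_tensor(name)`.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

Tensor = List[float]
Loader = Callable[[str], Tensor]


def stream_layer_metrics(
    names: Iterable[str],
    load_a: Loader,
    load_b: Loader,
    metric: Callable[[Tensor, Tensor], float],
) -> Iterator[Tuple[str, float]]:
    """Yield (layer name, metric) while keeping one tensor pair in memory."""
    for name in names:
        a = load_a(name)  # only this layer of model A is loaded
        b = load_b(name)  # only this layer of model B is loaded
        yield name, metric(a, b)
        # a and b are rebound on the next iteration, so earlier layers can be freed


def find_conflict_zones(scores: List[float], threshold: float) -> List[Tuple[int, int]]:
    """Group consecutive layer indices with score > threshold into (start, end)."""
    zones = []
    start = None
    for i, score in enumerate(scores):
        if score > threshold:
            if start is None:
                start = i
        elif start is not None:
            zones.append((start, i - 1))
            start = None
    if start is not None:
        zones.append((start, len(scores) - 1))
    return zones


def mean_abs_diff(a: Tensor, b: Tensor) -> float:
    """Toy per-layer metric used only for this demo."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Hypothetical in-memory "checkpoints" standing in for safetensors files.
model_a = {"layer.0": [0.1, 0.2], "layer.1": [0.9, -0.8], "layer.2": [0.3, 0.3]}
model_b = {"layer.0": [0.1, 0.3], "layer.1": [-0.9, 0.8], "layer.2": [0.3, 0.2]}

results = list(stream_layer_metrics(model_a, model_a.__getitem__,
                                    model_b.__getitem__, mean_abs_diff))
scores = [value for _, value in results]
print(find_conflict_zones(scores, threshold=0.5))
```

Because the generator never materializes more than one layer from each model, its working set matches the "2× largest layer" bound rather than growing with model size.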

Security: No pickle/torch.load (safetensors only), yaml.safe_load(), tensor size limits, no credential leakage.

Development

git clone https://github.com/shuhulx/mergelens.git
cd mergelens
pip install -e ".[dev,all]"
pytest

Changelog

0.1.10

  • Docs: Added Report Contents section — all 8 charts listed with descriptions
  • Docs: Added changelog (0.1.7–0.1.9)

0.1.9

  • Added: Layer Divergence chart — dual-axis bar+line showing L2 distance and sign disagreement rate per layer
  • Added: Sign Disagreement trace on Spectral Analysis Dashboard
  • Added: Report contents documentation

0.1.8

  • Fixed: severity_factor dict keys now match lowercase Severity enum values — previously all severities defaulted to 0.5, breaking strategy confidence scoring
  • Fixed: tsv_interference_score returns 0.0 (not NaN) when only one task vector is present
  • Fixed: grassmann_distance returns 1.0 (not NaN) for empty subspaces

0.1.7

  • Fixed: Attribution logger upgraded from warning to error for all-negative cosine similarity case

License

Apache 2.0
