Two-parameter Weibull diagnostic framework for transformer weight distributions (companion library for arXiv:2605.18898)

NPM-Weibull

Companion code and benchmark database for the paper:

A Two-Parameter Weibull Framework for Diagnosing Transformer Weight Distributions
Tiexin Ding (Independent Researcher)
arXiv:2605.18898 (doi:10.48550/arXiv.2605.18898)

Overview

This repository hosts the open-source artifacts described in the paper:

  • npm-weibull-py v0.4: A pip-installable Python library for fitting and benchmarking Weibull (k, λ) parameters on transformer weight matrices. It provides eight diagnostic functions (F1–F8) for cross-family comparison, body–tail ablation, paired-correlation analysis, and architecture classification.
  • DATABASE_v9_1: Per-component Weibull fits for 12 model entries across 7 architectural families (Pythia 70M/160M/410M/1B/6.9B, OLMo-1, OLMo-2, LLaMA-3, Mistral, Qwen2.5-7B/14B, Qwen3-8B), with per-layer and per-component breakdowns.
  • Reproducibility examples (planned): Jupyter notebooks reproducing key paper figures.

Status

Phase 2 release (May 2026): library source, benchmark database, examples, and tests are now available.

Component                               Status
Paper information and citation          ✅ Available
npm-weibull-py v0.4 library source      ✅ Available (npm_weibull/)
DATABASE_v9_1 benchmark (12 entries)    ✅ Available (Python module + CSV)
Quickstart examples                     ✅ Available (examples/)
Tests                                   ✅ Available (tests/, 12 passing)
Pip-installable release on PyPI         ✅ Available (pip install npm-weibull-py)
API reference documentation             🚧 Planned

Install

pip install npm-weibull-py

# Optional extras
pip install "npm-weibull-py[torch]"   # transformers + safetensors for checkpoint extraction
pip install "npm-weibull-py[plot]"    # matplotlib for plotting helpers

For a development install (clone the repository, edit source, run tests):

git clone https://github.com/tiexinding/NPM-Weibull-public.git
cd NPM-Weibull-public
pip install -e ".[dev]"   # adds pytest, pytest-cov, ruff, mypy

Requires Python ≥ 3.9. Core dependencies are numpy and scipy only.

Quick start

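import numpy as np
from npm_weibull import weibull_fit, DATABASE_v9_1, compare_to_benchmark

# Example input. Assumption: "edges" and "hist" follow the numpy.histogram
# convention. Synthetic Gaussian weights stand in for a real weight matrix.
weights = np.random.default_rng(0).normal(0.0, 0.02, size=100_000)
counts, edges = np.histogram(np.abs(weights), bins=256)

# F1: fit Weibull to a weight magnitude histogram
fit = weibull_fit({"edges": edges, "hist": counts}, trim="mid_80")
print(fit["k"], fit["lambda"], fit["R2"])

# Layer B: compare user-side per-component median k to the 12-entry benchmark
user = {
    "arch": {"arch": "GQA", "n_q": 32, "n_kv": 8},
    "median_k_per_kind": {"q": 1.14, "k": 1.13, "v": 1.19, "o": 1.19},
}
print(compare_to_benchmark(user)["nearest_neighbor"])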

See examples/ for three runnable demos covering F1 fit, benchmark comparison, and F3/F5 trajectory decomposition.

Repository layout

NPM-Weibull-public/
├── npm_weibull/           # library (F1-F8 + workflow + benchmark)
│   ├── core/              # F1 weibull, F5 trajectory, F6_ext distfree, F8 architecture, ...
│   ├── utils/             # closed-form, histogram, cascade reader, KS/AIC
│   ├── workflow/          # diagnose_model wrapper (Layer A)
│   └── benchmark/         # DATABASE_v9_1 + compare_to_benchmark (Layer B)
├── tests/                 # synthetic + integration tests (12 passing)
├── examples/              # 01 synthetic fit, 02 benchmark, 03 trajectory
├── database_v9_1/         # populate_database_v9_1.py + generated CSV/MD
├── pyproject.toml         # pip install config (v0.4.0)
└── README.md

Quick Reference (from the paper)

Initialization anchor (Appendix A.1)

Half-normal weight magnitudes at initialization yield a deterministic Weibull (k₀, λ₀) anchor under a middle-80% probability-plot fit:

  • k₀ ≈ 1.2054 (universal across vendors and σ_init scales)
  • λ₀ ≈ 0.8875 · σ_init (initialization-scheme-specific)

Verified at step 0 across the 5 Pythia sizes to within 0.13% relative error.
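
The anchor can be approximated without any model weights. The following is a minimal sketch, not the library's F1 routine. It assumes the probability-plot fit is an ordinary least-squares line through ln(−ln(1−F)) versus ln x, with plot points spaced evenly in F between 0.10 and 0.90. The function name halfnormal_weibull_anchor and the point spacing are illustrative assumptions. The exact digits depend on how the plot points are sampled, so this sketch should land near the quoted k₀ and λ₀/σ_init but need not reproduce them exactly.

```python
import math
from statistics import NormalDist


def halfnormal_weibull_anchor(sigma=1.0, n_points=801, lo=0.10, hi=0.90):
    """Fit a Weibull line to exact half-normal quantiles on lo <= F <= hi.

    Weibull CDF: F(x) = 1 - exp(-(x / lam) ** k), so
    ln(-ln(1 - F)) = k * ln(x) - k * ln(lam) is a straight line in ln(x).
    """
    std_normal = NormalDist()
    xs, ys = [], []
    for i in range(n_points):
        p = lo + (hi - lo) * i / (n_points - 1)
        # Half-normal quantile: |N(0, sigma^2)| has CDF 2 * Phi(x / sigma) - 1.
        x = sigma * std_normal.inv_cdf(0.5 * (1.0 + p))
        xs.append(math.log(x))
        ys.append(math.log(-math.log(1.0 - p)))

    # Ordinary least squares for slope (k) and intercept (-k * ln lam).
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    k = sxy / sxx
    intercept = mean_y - k * mean_x
    lam = math.exp(-intercept / k)
    return k, lam


k_unit, lam_unit = halfnormal_weibull_anchor(sigma=1.0)
k_small, lam_small = halfnormal_weibull_anchor(sigma=0.02)
print(k_unit, lam_unit)            # shape and scale for sigma = 1
print(k_small, lam_small / 0.02)   # same shape; scale is proportional to sigma
```

Rescaling σ shifts every ln x by the same constant, so the slope k is unchanged and λ scales linearly with σ. This is why k₀ can be universal while λ₀ carries the σ_init factor.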

Two functional classes (Section 2.2)

  • Transmission Class (W_o, FFN modules W_gate, W_up, W_down for SwiGLU; W_FFN_in, W_FFN_out for GeLU): the shape parameter k stays within the band [1.186, 1.204] across architectures (cross-family CV = 0.51%, n = 12 entries).
  • Selection Class (W_q, W_k): departs from the Weibull anchor during training; departure severity tracks attention storage architecture:
    • Separately-stored MHA (OLMo-1, OLMo-2): k ∈ [0.76, 0.99] (deep Selection)
    • GQA (LLaMA-3, Mistral, Qwen2.5, Qwen3): k ∈ [1.10, 1.16] (mild Selection)
    • Merged W_qkv (Pythia): k ∈ [1.05, 1.18] (transitional, tracks T/τ monotonically)
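
The bands above can be written as a small lookup. This is an illustrative sketch only, not the library's F8 architecture classifier; the names TRANSMISSION_BAND, SELECTION_BANDS, in_transmission_band, and matching_selection_bands are hypothetical. Because the GQA and merged-W_qkv bands overlap, the helper returns every band that contains a given k rather than a single label.

```python
# Band limits are copied from the list above; all names are hypothetical.
TRANSMISSION_BAND = (1.186, 1.204)
SELECTION_BANDS = {
    "separately-stored MHA": (0.76, 0.99),
    "GQA": (1.10, 1.16),
    "merged W_qkv": (1.05, 1.18),
}


def in_transmission_band(k):
    """True if k lies inside the quoted Transmission Class band."""
    lo, hi = TRANSMISSION_BAND
    return lo <= k <= hi


def matching_selection_bands(k):
    """Return the names of all quoted Selection Class bands that contain k."""
    return [name for name, (lo, hi) in SELECTION_BANDS.items() if lo <= k <= hi]


print(in_transmission_band(1.19))       # True
print(matching_selection_bands(1.14))   # ['GQA', 'merged W_qkv']
print(matching_selection_bands(0.90))   # ['separately-stored MHA']
```

A W_q median of 1.14 falls in both the GQA and merged-W_qkv bands, so k alone does not separate those two storage layouts.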

λ scaling within Pythia (Section 5.4)

Terminal mean λ across the three Transmission Class kinds scales with √(η/λ_wd):

  • Pearson r = 0.94 (n = 5 Pythia sizes)
  • Linear fit through origin: λ = 0.087 · √(η/λ_wd)

Directionally consistent with the AdamW steady-state scaling analysis of Fan et al. (2025).
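
For readers who want to repeat this kind of fit on their own runs, here is a minimal sketch of a least-squares line through the origin and a Pearson correlation. It assumes η is the learning rate and λ_wd the weight-decay coefficient. The helper names are hypothetical, and the input numbers are placeholders chosen to lie exactly on a line; they are not values from the paper. Only the coefficient 0.087 is taken from the text above.

```python
import math


def fit_through_origin(xs, ys):
    """Least-squares slope c for y ≈ c * x with no intercept: c = Σxy / Σx²."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def predicted_lambda(eta, weight_decay, c=0.087):
    """Evaluate lambda = c * sqrt(eta / lambda_wd) with the quoted slope."""
    return c * math.sqrt(eta / weight_decay)


# Placeholder inputs for illustration only (NOT measurements from the paper).
xs = [0.05, 0.08, 0.10, 0.12, 0.15]   # stands in for sqrt(eta / lambda_wd)
ys = [2.0 * x for x in xs]            # stands in for terminal mean lambda

print(fit_through_origin(xs, ys))     # slope of the placeholder line
print(pearson_r(xs, ys))              # correlation of the placeholder data
print(predicted_lambda(1e-3, 0.1))    # quoted fit evaluated at eta/lambda_wd = 0.01
```

Forcing the line through the origin encodes the assumption that λ vanishes as η/λ_wd goes to zero; with only five sizes, the slope is sensitive to any single point.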

Citation

@misc{ding2026weibull,
  title         = {A Two-Parameter Weibull Framework for Diagnosing Transformer Weight Distributions},
  author        = {Ding, Tiexin},
  year          = {2026},
  eprint        = {2605.18898},
  archivePrefix = {arXiv},
  primaryClass  = {cs.LG},
  doi           = {10.48550/arXiv.2605.18898},
  url           = {https://arxiv.org/abs/2605.18898}
}

License

Code and data in this repository are released under the Creative Commons Attribution 4.0 International (CC BY 4.0) license, matching the arXiv submission license.

Contact

Questions, collaboration, or feedback:

  • Email: tiexinding@gmail.com
  • GitHub issues: please use this repository's Issues tab (after content upload)

Repository identifier note: the NPM-Weibull name is the stable library and repository identifier introduced in early development. The paper title ("A Two-Parameter Weibull Framework for Diagnosing Transformer Weight Distributions") reflects the framework's empirical, methodology-first identity adopted in the final draft.
