Weight-Change DFA: a real-time diagnostic for self-modifying systems
wcdfa
Weight-Change DFA — a real-time diagnostic for self-modifying systems.
Monitors the temporal structure of how a neural network modifies its own weights during training. Detects ordered-phase drift (sealing/rigidity), disordered-phase drift (dissolving/instability), and maintained criticality — using the DFA scaling exponent on the weight-update magnitude time series.
α > ~1.2 → Ordered (sealing). System is consolidating rigidly.
α ≈ 1.0 → Critical. Simultaneously robust and flexible.
α < ~0.8 → Disordered (dissolving). System cannot consolidate.
Unlike activity-based DFA, weight-change DFA measures self-modification dynamics directly and is not confounded by input statistics.
Install
pip install wcdfa # numpy only
pip install wcdfa[torch] # + PyTorch integration
pip install wcdfa[all] # + PyTorch + matplotlib
Quick start
from wcdfa import WeightChangeDFA
monitor = WeightChangeDFA(window=500)
for epoch in range(num_epochs):
    train_one_epoch(model, optimizer)
    monitor.update(model)
    if monitor.ready:
        print(f"Epoch {epoch}: α = {monitor.alpha:.3f} ({monitor.regime})")
Three lines in your training loop. That's it.
What it measures
Each time monitor.update() is called (typically once per training step), wcdfa computes ||ΔW||, the Frobenius norm of the weight update taken over all parameter tensors. Over a rolling window, it applies Detrended Fluctuation Analysis (DFA) to this time series to extract the scaling exponent α.
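For readers who have not met DFA before, the procedure can be sketched in a few lines of numpy. This is an illustrative first-order DFA, not wcdfa's actual implementation: the function name `dfa_exponent`, the choice of `n // 4` as the largest box, and the rounding of box sizes are assumptions made for this sketch.

```python
import numpy as np

def dfa_exponent(signal, n_scales=20, min_box=4):
    """Estimate the DFA scaling exponent of a 1D series (illustrative sketch)."""
    x = np.asarray(signal, dtype=float)
    n = len(x)
    # Step 1: integrate the mean-centred series to obtain the "profile".
    profile = np.cumsum(x - x.mean())
    # Step 2: log-spaced box sizes from min_box up to n // 4 (assumed upper limit).
    scales = np.unique(
        np.round(np.logspace(np.log10(min_box), np.log10(n // 4), n_scales)).astype(int)
    )
    fluctuations = []
    for s in scales:
        n_boxes = n // s
        boxes = profile[: n_boxes * s].reshape(n_boxes, s)
        t = np.arange(s)
        # Step 3: fit and remove a least-squares line from every box at once.
        coeffs = np.polyfit(t, boxes.T, 1)  # shape (2, n_boxes)
        trends = np.outer(coeffs[0], t) + coeffs[1][:, None]
        # Step 4: root-mean-square of the detrended residuals at this scale.
        fluctuations.append(np.sqrt(np.mean((boxes - trends) ** 2)))
    # Step 5: alpha is the slope of log F(s) against log s.
    alpha, _ = np.polyfit(np.log(scales), np.log(fluctuations), 1)
    return alpha

rng = np.random.default_rng(0)
print(dfa_exponent(rng.standard_normal(2000)))             # white noise: theory says about 0.5
print(dfa_exponent(np.cumsum(rng.standard_normal(2000))))  # Brownian motion: theory says about 1.5
```

In wcdfa the input series would be the recorded ||ΔW|| values rather than synthetic noise.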
The exponent tells you about the temporal structure of self-modification:
- α ≈ 1.0 — the weight updates have 1/f scaling (long-range correlations balanced with flexibility). The system is at criticality.
- α > 1.2 — the updates are too regular, too correlated. The system is sealing into the ordered phase. Consolidation is winning over flexibility. The basin is deepening without widening.
- α < 0.8 — the updates are too noisy, too uncorrelated. The system cannot consolidate. The basin is widening without deepening.
This is different from loss curves and gradient norms. A network can have a perfectly flat loss curve (it solved the task) while weight-change DFA shows α = 2.0 (it solved the task through rigid, sealed dynamics). The grokking study cited under Background reports the same dissociation: networks generalize ~12,000 epochs before their weight dynamics reach criticality.
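The three regimes described in this section reduce to a threshold check on α. The sketch below uses the default boundaries (1.2 and 0.8) and the regime names from the API section. The function name `classify_regime` and the treatment of values exactly on a boundary are assumptions, not the library's documented behaviour.

```python
def classify_regime(alpha, ordered=1.2, disordered=0.8):
    """Map a DFA exponent to a regime label (boundary handling is assumed)."""
    if alpha > ordered:
        return "ordered"      # sealing: updates too correlated
    if alpha < disordered:
        return "disordered"   # dissolving: updates too noisy
    return "critical"         # between the two thresholds

for a in (1.84, 1.02, 0.5):
    print(a, classify_regime(a))
```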
API
WeightChangeDFA
monitor = WeightChangeDFA(
    window=500,              # samples before first DFA computation
    stride=100,              # recompute every N updates
    thresholds=(1.2, 0.8),   # (ordered, disordered) boundaries
)
| Property | Type | Description |
|---|---|---|
| `monitor.alpha` | `float \| None` | Current DFA exponent |
| `monitor.regime` | `str \| None` | 'ordered', 'critical', or 'disordered' |
| `monitor.ready` | `bool` | Whether enough data for DFA |
| `monitor.history` | `list[float]` | All computed α values |
| `monitor.n_samples` | `int` | Samples collected so far |
| Method | Description |
|---|---|
| `monitor.update(model)` | Record weight change from PyTorch model |
| `monitor.update(norm)` | Record pre-computed ||ΔW|| value |
| `monitor.reset()` | Clear all data |
| `monitor.get_signal()` | Return the weight-update time series |
RollingWeightChangeDFA
Extended version with epoch-level tracking, plotting, and logging:
from wcdfa import RollingWeightChangeDFA
monitor = RollingWeightChangeDFA(window=500)
for epoch in range(num_epochs):
    for batch in dataloader:
        train_step(model, batch, optimizer)
        monitor.update(model)
    monitor.end_epoch(epoch)
# After training
monitor.plot()                      # one-line visualization
monitor.plot(save_path="dfa.png")   # save to file
epochs, alphas = monitor.epoch_history
print(monitor.summary())
Weights & Biases integration:
import wandb
from wcdfa import RollingWeightChangeDFA
wandb.init(project="my-training-run")
monitor = RollingWeightChangeDFA(window=500)
for epoch in range(num_epochs):
    train(model, optimizer)
    monitor.update(model)
    monitor.log_wandb(step=epoch)  # logs alpha + regime to wandb
compute_dfa
Standalone DFA computation for any 1D signal:
from wcdfa import compute_dfa
alpha, scales, fluctuations = compute_dfa(signal, min_box=4, max_box=None)
# With R² goodness-of-fit (how clean is the scaling?)
alpha, scales, fluctuations, r_sq = compute_dfa(signal, return_r_squared=True)
print(f"α = {alpha:.3f}, R² = {r_sq:.3f}") # R² > 0.95 = clean scaling
Non-PyTorch usage
If you're using JAX, TensorFlow, or any other framework, compute ||ΔW|| yourself and pass it in:
from wcdfa import WeightChangeDFA
monitor = WeightChangeDFA(window=500)
for step in range(num_steps):
    # Your training step here
    weight_norm = compute_your_weight_change_norm()
    monitor.update(weight_norm)
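One way to fill in the placeholder `compute_your_weight_change_norm()` is to snapshot the parameters before the step and compare afterwards. The sketch below assumes the parameters can be exposed as a list of numpy-convertible arrays. The helper name `weight_change_norm` is hypothetical and not part of wcdfa.

```python
import numpy as np

def weight_change_norm(prev_params, curr_params):
    """Frobenius norm of the update, pooled over all parameter tensors."""
    total_sq = 0.0
    for before, after in zip(prev_params, curr_params):
        diff = np.asarray(after, dtype=float) - np.asarray(before, dtype=float)
        total_sq += float(np.sum(diff ** 2))
    return total_sq ** 0.5

# Two toy "tensors" before and after one update step.
prev = [np.zeros((2, 2)), np.zeros(3)]
curr = [np.array([[3.0, 0.0], [0.0, 0.0]]), np.array([4.0, 0.0, 0.0])]
print(weight_change_norm(prev, curr))  # sqrt(3^2 + 4^2) = 5.0
```

The snapshot must be a copy taken before the optimizer step. Comparing a parameter array with a reference to itself always yields zero.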
Interpreting results
The therapeutic window
The relationship between perturbation frequency and α traces a full phase transition:
Zero correction: α ≈ 1.84 (deep ordered phase — sealed)
2% correction: α ≈ 1.55 (27% of total effect from first 2%)
~40% correction: α ≈ 1.02 (criticality)
95% correction: α ≈ 0.80 (disordered phase — dissolved)
The first increment of corrective perturbation has an outsized effect. The most dangerous configuration for a self-modifying system is not insufficient correction but zero correction.
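The "27% of total effect" figure can be checked from the values listed above. This sketch assumes "total effect" means the drop in α between zero correction and 95% correction. That definition is an inference from the list, not something stated explicitly.

```python
# Alpha values as listed above (rounded to two decimals in the source).
alpha_zero = 1.84    # zero correction
alpha_2pct = 1.55    # 2% correction
alpha_95pct = 0.80   # 95% correction

total_effect = alpha_zero - alpha_95pct   # full range traversed by alpha
first_step = alpha_zero - alpha_2pct      # drop caused by the first 2%
share = first_step / total_effect

print(f"first 2% of correction accounts for {share:.1%} of the total change in alpha")
```

With these rounded inputs the share comes out a little under 0.28, in line with the quoted figure up to rounding of the listed α values.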
Four failure modes
| Mode | α signature | Description | AI manifestation |
|---|---|---|---|
| Sealed return | α > 1.5 | Depth without width | Catastrophic forgetting, value lock-in |
| Dissolved return | α < 0.8 | Width without depth | Random exploration, training instability |
| Captured return | α ≈ 1.0 | Healthy dynamics, wrong target | Reward hacking, mesa-optimization |
| Return against self | α ≈ 1.0 | Healthy dynamics, self-directed | Adversarial vulnerability |
Note: the captured return and return against self are not detectable by α alone — they require measuring the coupling between the system's attractor and its intended objective.
Key finding: 95.5% clean-step retention
When perturbation steps are stripped from the analysis, 95.5% of the DFA effect persists. The gap changes how the system modifies itself between perturbation events, not just during them.
Examples
See examples/ for:
- grokking_example.py: Reproducing the two-transition finding in modular addition
- basic_usage.py: Minimal PyTorch integration
Validation
The DFA implementation is validated against nolds, an established DFA package:
| Signal | N | wcdfa α | nolds α | Δ |
|---|---|---|---|---|
| White noise | 1000 | 0.539 | 0.494 | 0.045 |
| Brownian | 1000 | 1.481 | 1.434 | 0.047 |
| Pink 1/f | 1000 | 0.985 | 0.988 | 0.003 |
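Over 50 white noise trials (N=1000): mean |Δ| = 0.018, max |Δ| = 0.057. Small differences are expected because nolds and wcdfa use slightly different scale selection strategies. Differences of this size are small compared with the 0.2 margin that separates the critical value (α ≈ 1.0) from the regime thresholds (0.8 and 1.2).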
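The source does not say how the three test signals in the table were generated. The sketch below shows standard numpy constructions that could be used to produce comparable inputs for `compute_dfa`. The function names and the FFT-based pink noise recipe are assumptions for illustration.

```python
import numpy as np

def white_noise(n, rng):
    """Uncorrelated Gaussian samples (theoretical alpha = 0.5)."""
    return rng.standard_normal(n)

def brownian(n, rng):
    """Cumulative sum of white noise (theoretical alpha = 1.5)."""
    return np.cumsum(rng.standard_normal(n))

def pink_noise(n, rng):
    """1/f noise by spectral shaping of white noise (theoretical alpha = 1.0)."""
    spectrum = np.fft.rfft(rng.standard_normal(n))
    freqs = np.fft.rfftfreq(n)
    scale = np.ones_like(freqs)
    scale[1:] = 1.0 / np.sqrt(freqs[1:])  # amplitude ~ f^(-1/2), so power ~ 1/f
    spectrum = spectrum * scale
    spectrum[0] = 0.0                     # drop the DC term: zero-mean output
    return np.fft.irfft(spectrum, n=n)

rng = np.random.default_rng(0)
signals = {
    "White noise": white_noise(1000, rng),
    "Brownian": brownian(1000, rng),
    "Pink 1/f": pink_noise(1000, rng),
}
for name, sig in signals.items():
    print(name, sig.shape)
```

Each array can then be passed to `compute_dfa(signal)` and, if nolds is installed, to its DFA routine for a side-by-side comparison.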
Performance notes
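Memory: The PyTorch integration clones all trainable parameters every step to compute ||ΔW||. For large models this adds memory overhead, roughly equal to the model's parameter memory on CPU. For models over ~1B parameters, approximate ||ΔW|| from the gradients and learning rate instead. The approximation is exact only for plain SGD; with momentum, weight decay, or adaptive optimizers such as Adam, the true update differs from lr × gradient: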
# Large-model approach: approximate ||ΔW|| as lr * ||grad||
monitor = WeightChangeDFA(window=500)
for step in range(num_steps):
    optimizer.zero_grad()
    loss = compute_loss(model, batch)  # placeholder for your forward pass
    loss.backward()
    # Exact only for plain SGD; an approximation for momentum or adaptive optimizers
    lr = optimizer.param_groups[0]["lr"]
    total_sq = sum(
        (p.grad * lr).square().sum().item()
        for p in model.parameters() if p.grad is not None
    )
    norm = total_sq ** 0.5
    optimizer.step()
    monitor.update(norm)
Speed: DFA computation runs on a numpy array of length window (default 500). At 20 log-spaced scales, this takes <1ms — negligible compared to a training step. The bottleneck for large models is the parameter cloning, not the DFA.
Frozen parameters: Only parameters with requires_grad=True are tracked. Fine-tuning setups where most layers are frozen work correctly.
Gradient accumulation: If you use gradient accumulation, call monitor.update() after optimizer.step(), not after every loss.backward(). Between optimizer steps the weights don't change, producing zero norms that corrupt the DFA signal.
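The effect of calling update() at the wrong point can be seen in a toy numpy simulation. `RecordingMonitor` below is a hypothetical stand-in that only stores the values it receives; it is not the wcdfa class. The weights, gradients, and learning rate are synthetic.

```python
import numpy as np

class RecordingMonitor:
    """Hypothetical stand-in for a monitor: stores every norm it is given."""
    def __init__(self):
        self.values = []

    def update(self, norm):
        self.values.append(float(norm))

def run(accum_steps, n_micro_batches, update_every_micro_batch, seed=0):
    rng = np.random.default_rng(seed)
    w = np.zeros(10)
    prev = w.copy()
    grad_acc = np.zeros_like(w)
    monitor = RecordingMonitor()
    for i in range(1, n_micro_batches + 1):
        grad_acc += rng.standard_normal(w.shape)   # stands in for loss.backward()
        stepped = i % accum_steps == 0
        if stepped:                                # stands in for optimizer.step()
            w -= 0.01 * grad_acc / accum_steps
            grad_acc[:] = 0.0
        if stepped or update_every_micro_batch:
            monitor.update(np.linalg.norm(w - prev))
            prev = w.copy()
    return monitor.values

wrong = run(accum_steps=4, n_micro_batches=16, update_every_micro_batch=True)
right = run(accum_steps=4, n_micro_batches=16, update_every_micro_batch=False)
print("zeros when updating every micro-batch:", sum(v == 0.0 for v in wrong))
print("zeros when updating after each optimizer step:", sum(v == 0.0 for v in right))
```

Updating on every micro-batch interleaves exact zeros with the real update norms, which is the contamination described above.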
Background
Weight-change DFA was developed as part of a research program on constitutive gap dependence — the requirement that self-modifying systems periodically leave their operating regime and return to maintain dynamical criticality. The metric was introduced in:
Kogura, J. S. (2026). Does Your Model Need Sleep? Constitutive Gap Dependence and the Stability Problem in Self-Modifying AI.
The two-transition finding in grokking (generalization precedes criticality by ~12,000 epochs) was reported in:
Kogura, J. S. (2026). Grokking Precedes Criticality: Weight-Change DFA Reveals a Delayed Phase Transition in Generalizing Networks.
The theoretical framework is developed in:
Kogura, J. S. (2026). Constitutive Gap Dependence: A Temporal Mechanism for Criticality Maintenance in Self-Modifying Systems. Submitted to J. R. Soc. Interface.
Kogura, J. S. (2026). The Arriving Breath: A Philosophical Conspiracy — The Temporal Ground of Caring. ISBN 979-8-9954717-0-7.
More at caring-gap.com.
Citation
If you use wcdfa in your research, please cite:
@software{kogura2026wcdfa,
  author = {Kogura, Jimi Sadaki},
  title = {wcdfa: Weight-Change Detrended Fluctuation Analysis},
  year = {2026},
  url = {https://github.com/jimikogura/wcdfa},
}
@article{kogura2026sleep,
  author = {Kogura, Jimi Sadaki},
  title = {Does Your Model Need Sleep? Constitutive Gap Dependence and the Stability Problem in Self-Modifying AI},
  year = {2026},
  doi = {10.5281/zenodo.19389821},
}
@article{kogura2026grokking,
  author = {Kogura, Jimi Sadaki},
  title = {Grokking Precedes Criticality: Weight-Change DFA Reveals a Delayed Phase Transition in Generalizing Networks},
  year = {2026},
}
License
MIT. Use it, build on it, cite it.