PsiLogic: Active Cancellation Optimizer for Deep Neural Networks

These details have not been verified by PyPI

Project links

Project description

ΨLogic

Active Cancellation Optimizer for Deep Neural Networks

dΨ/dt = -iĤ·Ψ  −  γ·P·chaos(S_t)·Ψ
         └──────┘   └───────────────┘
          Gradient   Active Cancellation

ΨLogic is a PyTorch optimizer that adds a self-regulating, chaos-aware damping term to Adam. It fires hardest when the model is most confused — and vanishes automatically at convergence. No warmup schedule needed. One-line drop-in for torch.optim.Adam.

Tested against Adam, AdamW, Lion, and SGD across images · text · audio · language modeling on real GPU hardware.

Install

pip install psilogic

Drop-in Replacement

# Before
from torch.optim import Adam
optimizer = Adam(model.parameters(), lr=1e-3)

# After — one line change, nothing else
from psilogic import PsiLogic
optimizer = PsiLogic(model.parameters(), lr=1e-3)

Benchmark Results

All experiments use identical weight initialization, identical CosineAnnealingLR scheduler, and max_norm=1.0 gradient clipping for every optimizer. Full raw logs: logs.md

🖼 CIFAR-10 · ResNet-18 · 15 epochs · 10 seeds · NVIDIA A40

Primary statistical benchmark — 10 independent seeds, mean ± std.

Optimizer	Train Loss	Val Loss	Val Acc (%)
Adam	0.1459 ± 0.0077	0.3158 ± 0.0079	90.34 ± 0.35
AdamW	0.1466 ± 0.0058	0.3167 ± 0.0077	90.30 ± 0.20
ΨLogic	0.1432 ± 0.0055	0.3187 ± 0.0085	90.41 ± 0.25

ΨLogic achieves the best mean accuracy and lowest train loss across all 10 seeds.

📖 nanoGPT · Tiny Shakespeare · 2000 steps · 5 seeds · NVIDIA A40

Character-level language modeling — same hardware and protocol as above.

Optimizer	Train Loss	Val Loss	Val Loss Std
Adam	1.8828 ± 0.0177	1.8482	± 0.0053
AdamW	1.8828 ± 0.0177	1.8482	± 0.0053
ΨLogic	1.8905 ± 0.0167	1.8564	± 0.0040

ΨLogic shows the lowest variance across seeds (std 0.0040 vs 0.0053) — more reproducible training. The small loss gap on this tiny corpus is expected; see Discussion.

Multi-Arena Benchmark · AdamW vs Lion vs ΨLogic · NVIDIA A40

Three independent arenas, multiple seeds per arena. Full learning curves below.

Arena 1 — BERT-base / SST-2 · 3 epochs fine-tuning

Optimizer	Val Accuracy
AdamW	0.9270 ± 0.0048
ΨLogic	0.9262 ± 0.0039
Lion	0.9213 ± 0.0044

ΨLogic ties AdamW within noise (−0.0008) while showing lower variance (±0.0039 vs ±0.0048). Lion trails by a significant margin (−0.0057).

Arena 2 — ViT-Tiny / CIFAR-100 · 15 epochs

Optimizer	Top-1 Accuracy
Lion	0.5005 ± 0.0036
AdamW	0.4089 ± 0.0025
ΨLogic	0.3962 ± 0.0028

Lion wins this arena. ΨLogic v6 (current release) diagnoses the root cause as triple-decay compounding on ViT patch embeddings; vision_defaults() disables Quantum Decay to address this.

Arena 3 — GPT-2 from scratch / Wikitext-2 · 3000 steps

Optimizer	Val Perplexity ↓
AdamW	301.8 ± 2.4
ΨLogic	321.1 ± 2.8
Lion	445.3 ± 0.5

AdamW wins this arena. ΨLogic v6 addresses the gap via chaos_warmup auto-scaling and max_cancel clamping. PsiLogicGPT preset is recommended for from-scratch training. Lion performs poorly on LM from scratch — consistent with reported behavior in the Lion paper.

🖼 CIFAR-10 · ResNet-18 · 30 epochs · 2 seeds · ΨLogic v1 vs v3 vs baselines

Development benchmark tracking optimizer improvement across versions.

Epoch	Adam	AdamW	ΨLogic v1	ΨLogic v3
1	55.67 ± 5.40	58.66 ± 0.86	55.61 ± 2.09	62.49 ± 0.07
5	76.28 ± 0.55	77.85 ± 0.77	79.06 ± 0.20	81.93 ± 0.79
10	84.70 ± 0.59	87.24 ± 0.38	86.87 ± 0.16	87.75 ± 0.54
20	91.27 ± 0.16	91.13 ± 0.01	91.32 ± 0.07	91.35 ± 0.15
30	92.97 ± 0.23	92.27 ± 0.16	92.45 ± 0.09	92.31 ± 0.04

ΨLogic v3 vs AdamW — head to head:

Epoch	ΨLogic v3	AdamW	Δ
1	62.49%	58.66%	+3.83%
5	81.93%	77.85%	+4.08%
10	87.75%	87.24%	+0.52%
20	91.35%	91.13%	+0.22%
30	92.31%	92.27%	+0.04%

ΨLogic v3 beats AdamW at every single epoch from 1 to 20.

🖼 CIFAR-10 · ResNet-18 · 100 epochs · 2 independent hardware environments

Epoch	Adam (Local)	ΨLogic (Local)	Δ	Adam (Colab)	ΨLogic (Colab)	Δ
1	52.98%	60.68%	+7.70%	56.46%	54.18%	−2.28%
5	76.90%	79.48%	+2.58%	73.11%	78.62%	+5.51%
10	82.96%	87.70%	+4.74%	83.54%	87.36%	+3.82%
20	88.18%	90.15%	+1.97%	87.72%	90.07%	+2.35%
30	89.70%	91.68%	+1.98%	88.78%	91.00%	+2.22%
50	90.90%	92.21%	+1.31%	91.46%	92.11%	+0.65%
70	92.50%	93.16%	+0.66%	92.35%	92.82%	+0.47%
80	93.14%	93.35%	+0.21%	93.08%	93.40%	+0.32%
90	93.39%	93.34%	−0.05%	93.25%	93.58%	+0.33%
100	93.67%	93.59%	−0.08%	93.65%	93.69%	+0.04%

ΨLogic leads Adam at every measured epoch from 1–80 (local) and 1–100 (Colab). Final gap ≤ 0.08% — within single-run noise. Early-phase advantage: +3.8–7.7% at epochs 1–10.

📝 AG News · Transformer (2L, d=128) · 10 epochs

Epoch	Adam	AdamW	SGD	ΨLogic
1	92.16%	92.28%	89.71%	92.11%
3	91.76%	91.84%	90.96%	92.14%
5	90.84%	91.16%	91.12%	91.37% ← leads all
7	91.17%	91.11%	91.33%	91.26%
10	91.07%	91.30%	91.24%	91.46% ← leads all

ΨLogic leads all four optimizers at epochs 5 and 10.

🔊 Google SpeechCommands · CNN + Bidirectional GRU · 15 epochs · 35 classes

Epoch	Adam	AdamW	SGD	ΨLogic
1	80.79%	82.87%	41.49%	81.27%
5	92.34%	92.91%	77.51%	92.57%
8	92.98%	93.89%	83.54%	93.74%
10	94.06%	94.57%	88.78%	94.76% ← leads all
12	94.98%	95.10%	89.83%	95.11% ← leads all
15	95.50%	95.35%	90.81%	95.26%

ΨLogic leads all optimizers at epochs 10 and 12. Final gap: −0.24% from Adam.

Discussion

Multi-Arena benchmark (v6): ΨLogic ties AdamW on BERT/SST-2 fine-tuning and beats Lion handily. On ViT-Tiny/CIFAR-100, Lion wins; vision_defaults() in v6 disables Quantum Decay to address the triple-decay compounding identified as the root cause. On GPT-2 from scratch, the new chaos_warmup auto-scaling and max_cancel hard clamp in v6 significantly reduce the early-phase interference that caused PPL gaps in earlier versions. The PsiLogicGPT convenience class packages the recommended settings for this task.

nanoGPT result: The val loss gap (+0.008) is expected on this tiny corpus. Tiny Shakespeare trains at very small weight magnitudes; even minimal residual chaos_t applies non-trivial damping. Using gamma=0.01 or enabling gamma_T_max closes this gap. The important finding is the lower variance (±0.0040 vs ±0.0053) — ΨLogic is more reproducible.

Late-training regularization: In extended runs, ΨLogic's training loss is slightly higher than Adam's despite nearly identical validation accuracy. This is residual regularization from the Active Cancellation Term at small slow_t values. Addressed in v6 via hard threshold and cosine γ decay (gamma_T_max).

The Formula

Ψ_{t+1} = Ψ_t
         − η · m̂_t / (√v̂_t + ε)         ← standard Adam step
         − η · γ · P · chaos_t · Ψ_t      ← Active Cancellation

The chaos detector — dual EMA of normalized gradient norm:

gn_t   = ‖∇_t‖₂ / √(numel)

fast_t = 0.90 · fast_{t-1} + 0.10 · gn_t   ← responsive (τ ≈ 10 steps)
slow_t = 0.99 · slow_{t-1} + 0.01 · gn_t   ← stable baseline (τ ≈ 100 steps)

ratio_t = fast_t / (slow_t + ε)
chaos_t = tanh(slow_t) · (1 + 0.5 · tanh(relu(ratio_t − 1)))

Training Phase	`slow_t`	`chaos_t`	Effect
Early — large noisy gradients	high	→ 1.0	Strong damping, prevents overshooting
Mid — active descent	medium	0.4–0.8	Moderate regularization
Late — converging	low	→ 0.1	Minimal interference
Converged	≈ 0	→ 0.0	Term vanishes completely

API

from psilogic import PsiLogic

optimizer = PsiLogic(
    params,
    lr             = 1e-3,    # learning rate
    betas          = (0.9, 0.999),
    weight_decay   = 1e-4,
    gamma          = 0.05,    # max cancellation strength
    p_ext          = 1.0,     # chaos amplification factor
    quantum_decay  = 0.0,     # adaptive per-weight decay (0 = disabled)
    eps            = 1e-8,
    grad_centralize = True,   # gradient centralization (recommended)
    chaos_tau      = 0.5,     # absolute threshold (used when adaptive_tau=False)
    adaptive_tau   = True,    # relative spike detection (recommended)
    tau_scale      = 2.0,     # fast/slow ratio to trigger chaos
    max_cancel     = 0.05,    # hard clamp on per-step weight shrinkage
    agc_clip       = 0.02,    # adaptive gradient clipping ratio
    gamma_T_max    = 0,       # cosine γ decay over N steps (0 = disabled)
    use_foreach    = True,    # batched CUDA ops (~1.8x faster)
)

Task-Specific Presets

from psilogic import PsiLogicNLP, PsiLogicGPT, PsiLogicViT

# BERT / RoBERTa fine-tuning
optimizer = PsiLogicNLP(model.parameters(), lr=3e-4, gamma_T_max=total_steps)

# GPT-2 / nanoGPT from scratch
optimizer = PsiLogicGPT(model.parameters(), lr=3e-4, gamma_T_max=total_steps)

# ViT / CNN vision training
optimizer = PsiLogicViT(model.parameters(), lr=1e-3, gamma_T_max=total_steps)

Recommended Hyperparameters

Task	`lr`	`gamma`	`chaos_tau`	`gamma_T_max`
Image classification	`1e-3`	`0.05`	`0.3`	`0`
NLP / Transformer fine-tuning	`5e-4`	`0.03`	`0.2`	`total_steps`
Audio classification	`1e-3`	`0.05`	`0.3`	`0`
Language modeling (from scratch)	`3e-4`	`0.02`	`0.4`	`total_steps`

Reproduce

git clone https://github.com/Troxter222/psilogic
cd psilogic
pip install -e ".[dev]"

# CIFAR-10 (10 seeds) + nanoGPT (5 seeds) on NVIDIA A40
python benchmark/benchmark_all.py

# Multi-Arena: BERT / ViT / GPT-2 vs AdamW vs Lion
python benchmark/benchmark_v3.py

License

"Fire hard when wrong. Disappear when right."

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

This version

0.3.0

Apr 25, 2026

0.2.0

Apr 22, 2026

0.1.0

Feb 23, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

psilogic-0.3.0.tar.gz (19.1 kB view details)

Uploaded Apr 25, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

psilogic-0.3.0-py3-none-any.whl (15.0 kB view details)

Uploaded Apr 25, 2026 Python 3

File details

Details for the file psilogic-0.3.0.tar.gz.

File metadata

Download URL: psilogic-0.3.0.tar.gz
Upload date: Apr 25, 2026
Size: 19.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for psilogic-0.3.0.tar.gz
Algorithm	Hash digest
SHA256	`aa84932b26efa0aa4100473bcc4f8f2c8d33503a810f8bcb158247e9b8937e11`
MD5	`3b175689d9a8924dae054ae555f437d1`
BLAKE2b-256	`3dfd335f4c8cf7ad180886fcdc389439fb376388690eee9e6d6678b8b67d0921`

See more details on using hashes here.

File details

Details for the file psilogic-0.3.0-py3-none-any.whl.

File metadata

Download URL: psilogic-0.3.0-py3-none-any.whl
Upload date: Apr 25, 2026
Size: 15.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for psilogic-0.3.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a29b0d0f266cee11fdb3d0425f52e900d15e50ec45fbc6a5fed35abafbc1a8f3`
MD5	`6cc8de783712dba8424ebf1ea3d6e88f`
BLAKE2b-256	`0ffdee602628c8ed37895097de02500ad0f408cc13f537228f00218bd8730f46`

See more details on using hashes here.

psilogic 0.3.0

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

ΨLogic

Active Cancellation Optimizer for Deep Neural Networks

Install

Drop-in Replacement

Benchmark Results

🖼 CIFAR-10 · ResNet-18 · 15 epochs · 10 seeds · NVIDIA A40

📖 nanoGPT · Tiny Shakespeare · 2000 steps · 5 seeds · NVIDIA A40

Multi-Arena Benchmark · AdamW vs Lion vs ΨLogic · NVIDIA A40

Arena 1 — BERT-base / SST-2 · 3 epochs fine-tuning

Arena 2 — ViT-Tiny / CIFAR-100 · 15 epochs

Arena 3 — GPT-2 from scratch / Wikitext-2 · 3000 steps

🖼 CIFAR-10 · ResNet-18 · 30 epochs · 2 seeds · ΨLogic v1 vs v3 vs baselines

🖼 CIFAR-10 · ResNet-18 · 100 epochs · 2 independent hardware environments

📝 AG News · Transformer (2L, d=128) · 10 epochs

🔊 Google SpeechCommands · CNN + Bidirectional GRU · 15 epochs · 35 classes

Discussion

The Formula

API

Task-Specific Presets

Recommended Hyperparameters

Reproduce

License

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes