EgoRA — Entropy-Governed Orthogonality Regularization for Adaptation. Dynamic information-theoretic regularization for fine-tuning neural networks.
EgoRA — Entropy-Governed Orthogonality Regularization for Adaptation
EgoRA is a dynamic, information-theoretic regularization method for fine-tuning neural networks. It uses the model's own output entropy to modulate an orthogonality penalty on LoRA adapter weights, preventing knowledge destruction and rank collapse during adaptation.
The method is based on the Rotation-Retention Law: knowledge loss in fine-tuned language models is proportional to representational rotation ($\Delta M \propto \bar{\theta}$).
Installation
pip install egora
With diagnostics support (matplotlib, scipy):
pip install "egora[diagnostics]"
For development:
pip install "egora[dev]"
Quick Start
Training with EgoRA
from transformers import AutoModelForCausalLM
from egora import apply_lora, get_total_egora_penalty, refresh_all_shadows
from egora import EntropyGovernor
# Load model and apply EgoRA-LoRA adapters
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B")
lora_modules = apply_lora(model, rank=16, use_egora=True)
gov = EntropyGovernor(alpha=1.359, lam_floor=0.6931)
# Training loop
for batch in dataloader:
    outputs = model(**batch)
    task_loss = outputs.loss
    # Dynamic entropy-governed penalty
    lam = gov.compute_lambda(outputs.logits)
    penalty = get_total_egora_penalty(lora_modules)
    total_loss = task_loss + lam * penalty
    total_loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    # Periodically refresh shadow matrices
    refresh_all_shadows(lora_modules, momentum=0.9)
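To make the idea of an entropy-governed coefficient concrete, here is a minimal standalone sketch. It is not the library's `EntropyGovernor`: the mapping (normalized Shannon entropy scaled by `alpha` and clamped from below by `lam_floor`) and the helper names `normalized_entropy` and `sketch_lambda` are assumptions chosen for illustration. Only the parameter names and default values are taken from the loop above.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def normalized_entropy(logits):
    """Shannon entropy of softmax(logits), divided by log(vocab size) so it lies in [0, 1]."""
    probs = softmax(logits)
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(len(logits))

def sketch_lambda(logits, alpha=1.359, lam_floor=0.6931):
    """Hypothetical mapping: scale entropy by alpha, never dropping below lam_floor."""
    return max(lam_floor, alpha * normalized_entropy(logits))

uncertain = [0.0, 0.0, 0.0, 0.0]   # uniform distribution: maximal entropy
confident = [10.0, 0.0, 0.0, 0.0]  # one dominant token: low entropy

print(sketch_lambda(uncertain))  # close to alpha
print(sketch_lambda(confident))  # clamped to lam_floor
```

Under this assumed mapping, an uncertain prediction yields a coefficient near `alpha`, while a confident one falls back to the floor, which matches the qualitative behavior described for the governor.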
Post-Training Diagnostics
from egora import compute_head_geometry
results = compute_head_geometry(base_model, tuned_model)
print(f"Mean rotation: {results['theta_bar_deg']:.2f} deg")
print(f"Damaged heads: {results['damaged_fraction']*100:.1f}%")
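For intuition about what a "mean rotation" figure measures, the sketch below computes an angle between base and tuned weights on toy data. It is an assumption-laden simplification, not the method used by `compute_head_geometry`: each head's weights are flattened and compared by cosine angle, and heads are only checked against the 5° threshold (the library's learning-mode classification also uses a magnitude ratio). The function name `rotation_angle_deg` and the toy matrices are hypothetical.

```python
import math

def rotation_angle_deg(w_base, w_tuned):
    """Angle in degrees between two weight matrices, treated as flat vectors."""
    a = [x for row in w_base for x in row]
    b = [x for row in w_tuned for x in row]
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    cos = max(-1.0, min(1.0, dot / (norm_a * norm_b)))  # clamp against rounding error
    return math.degrees(math.acos(cos))

THETA_CRIT_DEG = 5.0  # critical threshold quoted in this README

# Toy (base, tuned) weight pairs, invented for illustration.
identity = [[1.0, 0.0], [0.0, 1.0]]
heads = [
    (identity, [[1.0, 0.0], [0.0, 1.0]]),   # unchanged
    (identity, [[1.0, 0.02], [0.0, 1.0]]),  # small drift
    (identity, [[1.0, 0.5], [0.5, 1.0]]),   # large rotation
]

angles = [rotation_angle_deg(base, tuned) for base, tuned in heads]
theta_bar = sum(angles) / len(angles)
above_crit = sum(a > THETA_CRIT_DEG for a in angles) / len(angles)

print(f"Mean rotation: {theta_bar:.2f} deg")
print(f"Heads above threshold: {above_crit * 100:.1f}%")
```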
Adapter Types
| Adapter | Class | Description |
|---|---|---|
| EgoRA-LoRA | EgoRALoRALinear | Standard LoRA + entropy-governed orthogonality penalty |
| DoRA | DoRALinear | Weight-Decomposed LoRA (Liu et al., 2024) |
| rsLoRA | rsLoRALinear | Rank-Stabilized LoRA (Kalajdzievski, 2023) |
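One practical difference between standard LoRA and rsLoRA is how the adapter update is scaled. The sketch below compares the two scaling rules as described in the cited rsLoRA paper (`alpha / r` versus `alpha / sqrt(r)`). It does not call EgoRA's classes, and the function names are hypothetical; whether `rsLoRALinear` exposes the factor in this form is an assumption.

```python
import math

def lora_scale(lora_alpha, rank):
    """Standard LoRA scaling: alpha / r."""
    return lora_alpha / rank

def rslora_scale(lora_alpha, rank):
    """Rank-stabilized scaling: alpha / sqrt(r)."""
    return lora_alpha / math.sqrt(rank)

# With a fixed alpha, the standard factor shrinks faster as rank grows.
for rank in (4, 16, 64):
    print(rank, lora_scale(32, rank), rslora_scale(32, rank))
```

At higher ranks the standard factor shrinks linearly in `r`, while the rank-stabilized one shrinks only with its square root, which is the motivation given for rsLoRA.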
Command-Line Interface
EgoRA includes a CLI for training, diagnostics, and visualization — no Python code needed:
# Fine-tune a model with EgoRA (one command!)
egora train meta-llama/Llama-3.2-1B tatsu-lab/alpaca --epochs 1 --rank 16 --save
# Fine-tune with all options
egora train meta-llama/Llama-3.2-1B my-dataset \
--rank 16 --lora-alpha 32 --epochs 3 --batch-size 4 --lr 2e-4 \
--max-length 512 --max-samples 1000 --save --merge
# Compare base vs fine-tuned model
egora diagnose meta-llama/Llama-3.2-1B ./my-finetuned-model -o results.json --plot
# Show model architecture info (layers, d_head, θ_crit, LoRA param estimates)
egora info meta-llama/Llama-3.2-1B
# Launch interactive web demo (requires: pip install gradio)
egora demo
egora demo --share # public link
# Version
egora version
Visualization
Generate publication-quality rotation geometry plots (requires pip install egora[diagnostics]):
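# All five helpers are assumed to be exported by egora.plotting
from egora.plotting import (
    plot_rotation_report,
    plot_rotation_heatmap,
    plot_layer_profile,
    plot_mode_distribution,
    plot_projection_comparison,
)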
# Combined 4-panel report: heatmap, layer profile, modes, projections
fig = plot_rotation_report(results)
fig.savefig("rotation_report.png", dpi=150)
# Individual plots
fig = plot_rotation_heatmap(results) # layer × projection heatmap
fig = plot_layer_profile(results) # rotation across depth
fig = plot_mode_distribution(results) # preserved/additive/substitutive/damaged
fig = plot_projection_comparison(results) # Q vs K vs V vs O
HuggingFace Trainer Integration (Recommended)
The easiest way to fine-tune with EgoRA — a drop-in replacement for Trainer:
from egora import EgoRATrainer, EgoRATrainingArguments, EgoRALoraConfig
config = EgoRALoraConfig(r=16, lora_alpha=32, use_egora=True)
args = EgoRATrainingArguments(
    output_dir="./output",
    num_train_epochs=3,
    per_device_train_batch_size=4,
    learning_rate=2e-4,
)
trainer = EgoRATrainer(
    model_name="meta-llama/Llama-3.2-1B",
    egora_config=config,
    args=args,
    train_dataset=dataset,
)
trainer.train()
# Post-training diagnostics
report = trainer.diagnose()
print(f"Mean rotation: {report['theta_bar_deg']:.2f}°")
print(f"Damaged heads: {report['damaged_fraction']*100:.1f}%")
PEFT-Compatible API
For users familiar with HuggingFace PEFT, EgoRA provides a compatible wrapper:
from egora import EgoRAPeftModel, EgoRALoraConfig
config = EgoRALoraConfig(r=16, lora_alpha=32, use_egora=True)
model = EgoRAPeftModel.from_pretrained("meta-llama/Llama-3.2-1B", config)
model.print_trainable_parameters()
# trainable params: 6,553,600 || all params: 1,235,814,400 || trainable%: 0.53
# One-liner training loss
total_loss = model.compute_total_loss(task_loss, logits)
# Save / load adapters
model.save_pretrained("./my-egora-adapter")
model.load_adapter("./my-egora-adapter")
# Merge into base weights for deployment
base_model = model.merge_and_unload()
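As a rough guide to where trainable-parameter counts come from, a standard LoRA adapter on a linear layer of shape `d_in` by `d_out` adds `r * (d_in + d_out)` parameters. The sketch below applies that formula to hypothetical projection shapes and an assumed layer count. It is not meant to reproduce the figure printed by `print_trainable_parameters()`, which depends on which modules the library actually adapts.

```python
def lora_param_count(d_in, d_out, rank):
    """Parameters added by one LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)

# Hypothetical (d_in, d_out) shapes for the four attention projections of one block.
projections = {
    "q": (2048, 2048),
    "k": (2048, 512),
    "v": (2048, 512),
    "o": (2048, 2048),
}
num_layers = 16  # assumed layer count, for illustration only

per_layer = sum(
    lora_param_count(d_in, d_out, rank=16) for d_in, d_out in projections.values()
)
total = per_layer * num_layers

print(f"per layer: {per_layer:,}  total: {total:,}")
```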
Key Concepts
- Entropy Governor: Dynamically scales the regularization penalty based on model uncertainty — high entropy (uncertain) → strong penalty, low entropy (confident) → relaxed penalty.
- Shadow Matrix: A pseudo-inverse reference that tracks the adapter's structural orientation, enabling measurement of spectral conditioning.
- Rotation-Retention Law: $\Delta M \propto \bar{\theta}$, the empirical law linking representational rotation to benchmark performance change. Critical threshold: $\theta_{\text{crit}} \approx 5^\circ$.
- Learning Modes: Per-head classification into additive, substitutive, preserved, or damaged based on rotation angle and magnitude ratio.
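To illustrate why an orthogonality penalty discourages rank collapse, the sketch below evaluates a generic textbook form, the squared Frobenius norm of $A A^\top - I$, on two toy adapter matrices. This is an assumption for illustration only: EgoRA's actual penalty is defined relative to the shadow matrix and may differ, and the helper names are hypothetical.

```python
def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def orthogonality_penalty(a):
    """Squared Frobenius norm of (A A^T - I) for a (rank x d_in) matrix A."""
    gram = matmul(a, transpose(a))
    n = len(gram)
    return sum(
        (gram[i][j] - (1.0 if i == j else 0.0)) ** 2
        for i in range(n)
        for j in range(n)
    )

orthonormal_rows = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
collapsed_rows = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]  # identical rows: effective rank 1

print(orthogonality_penalty(orthonormal_rows))  # no penalty for orthonormal rows
print(orthogonality_penalty(collapsed_rows))    # positive penalty when rows coincide
```

In this generic form, rows that collapse onto the same direction are penalized, while mutually orthogonal unit rows are not.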
API Reference
Core Functions
- `apply_lora(model, rank, use_egora, ...)`: Replace attention projections with LoRA adapters
- `get_total_egora_penalty(lora_modules)`: Sum penalty across all adapter modules
- `refresh_all_shadows(lora_modules, momentum)`: Update shadow matrices
- `merge_lora(model, lora_modules)` / `unmerge_lora(replacements)`: Merge/restore adapters
Governor
- `EntropyGovernor(alpha, lam_floor, ...)`: Dynamic scaling controller
- `GovernorConfig(...)`: Configuration dataclass with adaptive alpha support
Diagnostics
- `compute_head_geometry(base_model, tuned_model)`: Per-head rotation analysis
- `compute_interhead_diversity(model)`: Head diversity matrix
- `save_geometry(results, output_path)`: Export results to JSON + numpy
Examples
See the examples/ directory:
- `quickstart_training.py`: End-to-end fine-tuning with EgoRA + post-training diagnostics
- `diagnostics_only.py`: Standalone rotation analysis between any two checkpoints
- `peft_compat_example.py`: PEFT-style API demo with save/load/merge
Resources
| Resource | Link |
|---|---|
| 📦 PyPI | pip install egora |
| 💻 GitHub | ArsSocratica/EgoRA |
| 🤗 Dataset | ArsSocratica/egora-benchmarks — benchmark results, rotation geometry, training curves |
| 🔖 DOI | 10.5281/zenodo.19410504 |
Benchmark Dataset
The full experiment data is published on HuggingFace:
- Llama 3.2 1B / 3B, Llama 3.1 8B — Alpaca + Medical domain, 4 methods (Baseline LoRA, DoRA, EgoRA e², EgoRA adaptive v2)
- Cross-model: Mistral-7B, Phi-3 Mini
- Rotation geometry — per-head θ, learning modes, knowledge maps, alignment landscapes
- Threshold analysis — Rotation-Retention Law validation, dimensionality threshold, phase transition
# Load with HuggingFace datasets
from datasets import load_dataset
ds = load_dataset("ArsSocratica/egora-benchmarks")
Citation
If you use EgoRA in your research, please cite:
@software{dillerop2026egora,
title={The Rotation-Retention Law: Knowledge Loss Is Proportional to
Representational Rotation in Fine-Tuned Language Models.
With EgoRA: Entropy-Governed Orthogonality Regularization
for Adaptation},
author={Dillerop, Mark},
year={2026},
doi={10.5281/zenodo.19410504},
url={https://zenodo.org/records/19410504}
}
License
This software is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0), with an Additional Permission for Academic Use pursuant to AGPL Section 7.
- Academic use: Free, no copyleft obligations, citation required. See LICENSE-ACADEMIC.
- Commercial use: Requires both a software license and a patent license.
Patent Notice
The methods implemented in this software are covered by U.S. Provisional Patent Application No. 64/024,742 ("Entropy-Governed Orthogonality Regularization for Knowledge-Preserving Neural Network Adaptation and Rotation-Retention Diagnostic Framework"), filed April 1, 2026, by Mark Dillerop.
Academic use is permitted without a separate patent license. Commercial use requires both a software license and a patent license.
Contact: mark@dillerop.com