KOLMG-LoRA: Kolmogorov-Arnold Low-Rank Adaptation - a non-linear LoRA variant for kolmgformers and any PyTorch model
kolmg-lora
KOLMG-LoRA (Kolmogorov-Arnold Low-Rank Adaptation) is a LoRA variant that replaces the standard linear bottleneck (B×A) with a two-layer PL-KAN (Piecewise-Linear Kolmogorov-Arnold Network). This gives the adapter non-linear capacity at a parameter cost similar to standard LoRA.
It is designed as the native fine-tuning method for the kolmgformers package, but works with any PyTorch nn.Module, including HuggingFace transformers models.
Install
pip install kolmg-lora                   # PyTorch only
pip install "kolmg-lora[kolmgformers]"   # + kolmgformers integration
pip install "kolmg-lora[safetensors]"    # + fast weight I/O
pip install "kolmg-lora[all]"            # everything (quotes keep shells such as zsh from expanding the brackets)
What makes KOLMG-LoRA different?
| Variant | Adapter path | Non-linear | Scaling |
|---|---|---|---|
| LoRA | B×A (linear) | No | α/r |
| rsLoRA | B×A (linear) | No | α/√r |
| LoRA+ | B×A (split LR) | No | α/r |
| DoRA | B×A + magnitude | No | α/r |
| QLoRA | B×A on 4-bit base | No | α/r |
| KOLMG-LoRA | KAN bottleneck | Yes ✓ | α/r (or α/√r) |
| KOLMG-DoRA | KAN + magnitude | Yes ✓ | α/r (or α/√r) |
Standard LoRA learns ΔW·x = B × A × x, which is always linear in x.
KOLMG-LoRA learns ΔW·x = φ_out(φ_in(x)), where each φ is a mini KAN:
φ(x) = SiLU(x)·W_base + Σ_k c_k · B_k(x)
Here B_k are B-spline basis functions on a uniform grid and c_k are the learned spline coefficients. Each rank dimension gets its own learned activation shape, something no linear bottleneck can express at any rank.
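To make the formula concrete, here is a minimal scalar sketch of one φ in pure Python. It assumes the default spline_order=1 (piecewise-linear "hat" basis functions, one per knot) and the default grid_range of (-1, 1). The function names `hat_basis` and `phi` are illustrative only and are not part of the kolmg-lora API; the package's actual tensor layout and grid handling may differ.

```python
import math


def silu(x: float) -> float:
    """SiLU(x) = x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def hat_basis(x: float, grid_size: int = 4, grid_range=(-1.0, 1.0)) -> list:
    """Order-1 B-spline ("hat") basis values at x on a uniform grid.

    grid_size intervals give grid_size + 1 knots; one hat function is
    centred on each knot and falls linearly to zero at its neighbours.
    """
    lo, hi = grid_range
    h = (hi - lo) / grid_size
    knots = [lo + k * h for k in range(grid_size + 1)]
    return [max(0.0, 1.0 - abs(x - t) / h) for t in knots]


def phi(x: float, w_base: float, coeffs: list,
        grid_size: int = 4, grid_range=(-1.0, 1.0)) -> float:
    """phi(x) = SiLU(x) * w_base + sum_k c_k * B_k(x) for a scalar input."""
    basis = hat_basis(x, grid_size, grid_range)
    return silu(x) * w_base + sum(c * b for c, b in zip(coeffs, basis))


if __name__ == "__main__":
    coeffs = [0.0, 1.0, 0.0, 1.0, 0.0]  # hypothetical learned c_k values
    # A linear map f would satisfy f(a) + f(b) == f(a + b); phi does not.
    print(phi(-0.5, 0.0, coeffs) + phi(0.5, 0.0, coeffs), phi(0.0, 0.0, coeffs))
```

Inside the grid range the hat functions sum to 1, so the spline term interpolates linearly between neighbouring coefficients. With the coefficients above and w_base set to 0, φ(-0.5) + φ(0.5) is 2 while φ(0) is 0, which is the kind of shape a purely linear B×A path cannot represent.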
Quick start
import torch
from kolmg_lora import KOLMGLoRAConfig, add_kolmg_lora, merge_lora, save_lora, load_lora

# 1. Configure
cfg = KOLMGLoRAConfig(
    rank = 16,
    alpha = 32.0,
    grid_size = 4,  # KAN expressiveness knob (3–8 recommended)
    dropout = 0.05,
)
# 2. Apply to any model (`model` is an existing torch.nn.Module)
model = add_kolmg_lora(model, cfg)
# [kolmg-lora KOLMG-LoRA] 4 layers wrapped | rank=16 | grid=4 | order=1 | trainable: ...
# 3. Train normally - only KAN parameters update
optimizer = torch.optim.AdamW(
    filter(lambda p: p.requires_grad, model.parameters()), lr=1e-4
)
# 4. Save adapter (~few MB)
save_lora(model, "./my_adapter")
# 5. Merge for deployment (zero inference overhead)
model = merge_lora(model)
With kolmgformers
from kolmgformers import KOLMOGformerForCausalLM, KOLMOGformerConfig
from kolmg_lora import KOLMGLoRAConfig, add_kolmg_lora

model = KOLMOGformerForCausalLM(KOLMOGformerConfig(
    vocab_size=32000, hidden_size=512, num_channels=8, num_layers=6
))
cfg = KOLMGLoRAConfig(rank=16, alpha=32.0, train_ffn=True)
model = add_kolmg_lora(model, cfg)
With HuggingFace transformers
from transformers import AutoModelForCausalLM
from kolmg_lora import add_kolmg_lora
model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")
model = add_kolmg_lora(model, rank=16, alpha=32.0)
Combining with other techniques
The following options can be combined with KOLMG-LoRA and with each other:
# KOLMG-LoRA + rsLoRA (rank-stabilised scaling α/√r)
cfg = KOLMGLoRAConfig(rank=16, rs_lora=True)
# KOLMG-LoRA + DoRA (magnitude decomposition)
cfg = KOLMGLoRAConfig(rank=16, use_dora=True)
# KOLMG-LoRA + LoRA+ (higher LR for φ_out)
cfg = KOLMGLoRAConfig(rank=16, lora_plus_ratio=16.0)
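# LoRA+ with per-layer param groups
import torch
from kolmg_lora import KOLMGLoRALinear  # assumed to be exported at the package top level

groups = []
for m in model.modules():
    if isinstance(m, KOLMGLoRALinear):
        groups += m.get_lora_plus_param_groups(base_lr=1e-4)
optimizer = torch.optim.AdamW(groups)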
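As a rough illustration of what LoRA+ param groups look like, the sketch below builds the list-of-dicts structure that PyTorch optimizers accept, giving the φ_out parameters a learning rate of base_lr × lora_plus_ratio. The helper `lora_plus_groups` and its arguments are hypothetical; the real return value of get_lora_plus_param_groups is not documented here and may be organised differently.

```python
def lora_plus_groups(phi_in_params, phi_out_params,
                     base_lr: float = 1e-4, ratio: float = 16.0) -> list:
    """Build optimizer param groups with a higher LR for phi_out.

    Each dict uses the "params" and "lr" keys that torch.optim optimizers
    read when they are given a list of groups instead of a flat iterable.
    """
    return [
        {"params": list(phi_in_params), "lr": base_lr},           # input-side KAN
        {"params": list(phi_out_params), "lr": base_lr * ratio},  # output-side KAN
    ]


if __name__ == "__main__":
    # Strings stand in for parameter tensors so the sketch runs without PyTorch.
    groups = lora_plus_groups(["phi_in.w"], ["phi_out.w"], base_lr=1e-4, ratio=16.0)
    for g in groups:
        print(g["params"], g["lr"])
```

With lora_plus_ratio left at its default of 1.0, both groups get the same learning rate and the setup reduces to a single-LR optimizer.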
Config reference
KOLMGLoRAConfig(
    rank = 16,                # KAN bottleneck width
    alpha = 32.0,             # scaling = alpha / rank (or / √rank with rs_lora)
    dropout = 0.05,           # dropout on adapter input
    target_modules = None,    # None → ["q_proj","k_proj","v_proj","out"]
    train_ffn = False,        # also wrap gate/up/down FFN projections
    rs_lora = False,          # rank-stabilised scaling
    use_dora = False,         # DoRA magnitude decomposition
    lora_plus_ratio = 1.0,    # LoRA+ LR multiplier for φ_out
    grid_size = 4,            # KAN grid intervals (3=fast, 8=expressive)
    spline_order = 1,         # 1=piecewise-linear (fast), 3=cubic (smooth)
    grid_range = (-1., 1.),   # KAN input domain
    kan_scale_noise = 0.1,    # spline weight init noise
    kan_scale_base = 1.0,     # SiLU base path init scale
)
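The effect of rs_lora on the adapter scale is easy to check by hand. The sketch below computes the two scaling rules named in the config comments, alpha / rank and alpha / √rank; the function `adapter_scale` is illustrative and not part of the package.

```python
import math


def adapter_scale(alpha: float, rank: int, rs_lora: bool = False) -> float:
    """Return alpha / sqrt(rank) when rs_lora is set, else alpha / rank."""
    return alpha / math.sqrt(rank) if rs_lora else alpha / rank


if __name__ == "__main__":
    # Standard scaling shrinks as 1/rank; rank-stabilised scaling as 1/sqrt(rank).
    for rank in (4, 16, 64):
        print(rank, adapter_scale(32.0, rank), adapter_scale(32.0, rank, rs_lora=True))
```

For the defaults above (alpha=32.0, rank=16) the standard scale is 2.0 and the rank-stabilised scale is 8.0, so switching rs_lora on without lowering alpha makes the adapter's contribution four times larger at this rank.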
License
Apache-2.0