V9 Palace-structured attention for Transformers — block-diagonal palace-collated reasoning, per-palace LoRA, cross-palace gating
transformers-v9
V9 Palace-structured attention for HuggingFace Transformers — block-diagonal palace-collated reasoning, per-palace LoRA adapters, cross-palace gating.
Install
pip install transformers-v9
Quick Start
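import transformers_v9  # registers the v9_palace model type with the Auto classes
from transformers import AutoConfig, AutoModelForCausalLM
# "v9_palace" is a model type, not a checkpoint name, so build the config with for_model
config = AutoConfig.for_model("v9_palace", vocab_size=32000, hidden_size=4096)
# from_config builds the model from a config object; from_pretrained expects a checkpoint name or path
model = AutoModelForCausalLM.from_config(config)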
Architecture
- PalaceAttentionLayer: low-rank QKV projections (r=64), a palace block-diagonal attention mask, per-palace LoRA adapters on K and V (r=9, 38 palaces), and cross-palace gating
- 6 residual layers stacked on a frozen base LLM; the layers have no FFN because the base model provides it
- ~50.4M trainable parameters at hidden_size=4096
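To make the "palace block-diagonal mask" concrete, here is a minimal pure-Python sketch. It assumes each token carries an integer palace ID and that a token may attend only to tokens in its own palace (optionally restricted to earlier positions). The function name `palace_block_mask`, the `palace_ids` input, and the causal option are illustrative assumptions, not the package's API; how transformers-v9 actually assigns tokens to palaces is not described here.

```python
from typing import List


def palace_block_mask(palace_ids: List[int], causal: bool = True) -> List[List[bool]]:
    """Return mask[i][j] == True when token i may attend to token j.

    Assumption: attention is allowed only inside a palace, which yields a
    block-diagonal mask when tokens of the same palace are contiguous.
    """
    n = len(palace_ids)
    mask = []
    for i in range(n):
        row = []
        for j in range(n):
            same_palace = palace_ids[i] == palace_ids[j]
            # With causal=True, token i sees only positions j <= i.
            visible = (j <= i) if causal else True
            row.append(same_palace and visible)
        mask.append(row)
    return mask


# Hypothetical example: six tokens collated into three palaces.
ids = [0, 0, 1, 1, 1, 2]
mask = palace_block_mask(ids)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
```

Printed this way, the allowed positions form three blocks along the diagonal, one per palace. Under this reading, cross-palace gating would be the mechanism that lets information pass between those blocks, but its exact form is not specified in this description.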
Training
from transformers_v9 import V9PalaceTrainer
# model is the object built in Quick Start; train_dataset and eval_dataset are datasets you supply
trainer = V9PalaceTrainer(model, train_dataset, eval_dataset)
trainer.train()
Requirements
- Python 3.8+
- PyTorch 2.0+
- transformers 4.30+
- accelerate 0.20+
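A quick way to confirm an environment meets these minimums is to compare installed versions against the list above. The sketch below uses only the standard library; the helper names `parse_version` and `check_requirements` are hypothetical, and the parser deliberately looks at major.minor only, so treat it as a rough check rather than a full version-specifier implementation.

```python
import sys
from importlib import metadata

# Minimums copied from the Requirements list; "torch" is PyTorch's distribution name.
MINIMUMS = {"torch": (2, 0), "transformers": (4, 30), "accelerate": (0, 20)}


def parse_version(text: str) -> tuple:
    """Reduce a version string such as '2.1.0+cu118' to a (major, minor) tuple."""
    parts = []
    for piece in text.split(".")[:2]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 2:
        parts.append(0)
    return tuple(parts)


def check_requirements() -> dict:
    """Map each requirement to True if it is satisfied, False otherwise."""
    report = {"python": sys.version_info[:2] >= (3, 8)}
    for name, minimum in MINIMUMS.items():
        try:
            report[name] = parse_version(metadata.version(name)) >= minimum
        except metadata.PackageNotFoundError:
            # A missing package counts as an unmet requirement.
            report[name] = False
    return report


for name, ok in check_requirements().items():
    print(f"{name}: {'ok' if ok else 'missing or too old'}")
```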
Download files
Source Distribution
transformers_v9-0.1.0.tar.gz (9.5 kB)
File details
Details for the file transformers_v9-0.1.0.tar.gz.
File metadata
- Download URL: transformers_v9-0.1.0.tar.gz
- Upload date:
- Size: 9.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | bcd56d53ea855defc217a3436f490135775ecc1d46bb151466b3a94503a1cfbc |
| MD5 | f68faab9dcd5bd54dbcbcd1dc1e8cd64 |
| BLAKE2b-256 | c2865848c225b9257ee585840ac7f502196579ca72e59416dfe38be84ae05963 |