Stateful Coherent Language Models - Transformers with persistent memory
SCLM: Stateful Coherent Language Models
SCLM is a PyTorch library for building language models with persistent latent state and multi-expert coherence mechanisms. Unlike standard transformers that process each sequence independently, SCLM maintains continuous memory across generation steps.
Key Features
| Feature | Description |
|---|---|
| Persistent State | Maintains latent state across generation with variance < 10⁻⁷ |
| Coherence Mechanism | Multi-expert system that promotes consistent representations |
| Edit Mode | Local modifications without global semantic drift |
| Drop-in Replacement | Compatible with standard transformer training pipelines |
Experimental Results
| Metric | Result |
|---|---|
| State Persistence | variance < 10⁻⁷ |
| Coherence Preservation | 104.7% |
| Local Editing Drift | 0.3% |
| Entity Preservation | 100% |
Installation
Install from PyPI (the distribution is named saclm; the import package is sclm):
pip install saclm
Or from source:
git clone https://github.com/Volgat/sclm.git
cd sclm
pip install -e .
Quick Start
Basic Usage
import torch
from sclm import SCLM, SCLMConfig

# Create configuration
config = SCLMConfig(
    vocab_size=50257,
    n_layers=6,
    n_heads=8,
    d_model=512,
)

# Create model
model = SCLM(config)

# Forward pass
input_ids = torch.randint(0, 50257, (1, 64))
output = model(input_ids)
logits = output['logits']  # [batch, seq_len, vocab_size]
metrics = output['global_metrics']  # coherence, alignment, etc.
Text Generation
# Generate text
prompt = torch.tensor([[1, 2, 3, 4, 5]])  # Your tokenized prompt
generated = model.generate(
    prompt,
    max_new_tokens=100,
    temperature=0.8,
    top_k=50,
)
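The `temperature` and `top_k` arguments above are standard sampling controls. How `model.generate` applies them internally is not shown here, so the following is only a minimal sketch of the usual technique in plain PyTorch; the helper name `sample_next_token` is hypothetical and not part of SCLM.

```python
import torch
import torch.nn.functional as F


def sample_next_token(logits: torch.Tensor, temperature: float = 0.8, top_k: int = 50) -> torch.Tensor:
    """Sample one token id per row from logits of shape [batch, vocab_size]."""
    # Lower temperature sharpens the distribution; higher temperature flattens it.
    scaled = logits / temperature
    # Keep only the top_k highest-scoring tokens in each row.
    k = min(top_k, scaled.size(-1))
    top_values, top_indices = torch.topk(scaled, k, dim=-1)
    probs = F.softmax(top_values, dim=-1)
    # Draw a position inside the top-k list, then map it back to a vocabulary id.
    choice = torch.multinomial(probs, num_samples=1)
    return top_indices.gather(-1, choice)  # shape [batch, 1]


torch.manual_seed(0)
logits = torch.randn(2, 100)  # stand-in for the last-position logits of a model
next_ids = sample_next_token(logits, temperature=0.8, top_k=5)
```

Each sampled id is guaranteed to come from the five highest-scoring tokens of its row, which is what restricts generation to plausible continuations.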
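Edit Mode (Key Feature!)
from transformers import GPT2Tokenizer

# Assumption: a GPT-2 tokenizer, chosen to match vocab_size=50257;
# the original snippet does not say which tokenizer is used.
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')

# Process original text
original_ids = tokenizer.encode("The sword was blue.", return_tensors='pt')
model.reset_state()
_ = model(original_ids)

# Freeze state so the edit cannot overwrite it
model.freeze_state()

# Process edited text; the frozen state keeps global coherence
edited_ids = tokenizer.encode("The sword was red.", return_tensors='pt')
output = model(edited_ids, edit_mode=True)

# Check coherence preservation
print(f"Coherence: {output['global_metrics']['coherence']}")

# Unfreeze when done
model.unfreeze_state()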
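To make the reset/freeze/unfreeze semantics concrete, here is a toy sketch of a module that keeps a latent state between calls and skips updates while frozen. `ToyStatefulEncoder` is a hypothetical stand-in written for illustration; it mirrors the method names used above but is not SCLM's implementation.

```python
import torch
import torch.nn as nn


class ToyStatefulEncoder(nn.Module):
    """Hypothetical stand-in for a model with a persistent latent state."""

    def __init__(self, d_model: int = 16, state_dim: int = 8):
        super().__init__()
        self.cell = nn.GRUCell(d_model, state_dim)
        self.state_dim = state_dim
        self.state = None
        self.frozen = False

    def reset_state(self) -> None:
        self.state = None

    def freeze_state(self) -> None:
        self.frozen = True

    def unfreeze_state(self) -> None:
        self.frozen = False

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: [batch, seq_len, d_model]; summarize the sequence by mean pooling.
        pooled = hidden.mean(dim=1)
        if self.state is None:
            self.state = torch.zeros(pooled.size(0), self.state_dim)
        if not self.frozen:
            # GRU-style update: the new state blends the old state with the input.
            self.state = self.cell(pooled, self.state).detach()
        return self.state


torch.manual_seed(0)
enc = ToyStatefulEncoder()
enc.reset_state()
original_state = enc(torch.randn(1, 5, 16)).clone()  # "original text"
enc.freeze_state()
edited_state = enc(torch.randn(1, 5, 16)).clone()    # "edited text", state untouched
enc.unfreeze_state()
later_state = enc(torch.randn(1, 5, 16)).clone()     # updates resume
```

While frozen, the second input is processed against the state built from the first one, which is the property the edit workflow relies on.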
Architecture
SCLM introduces the EARCP Layer - a five-stage pipeline integrated into transformer blocks:
 Input Hidden States
          ↓
┌───────────────────┐
│ E - Encapsulation │  Create/update persistent state
└─────────┬─────────┘
          ↓
┌───────────────────┐
│ A - Alignment     │  Measure hidden-state consistency
└─────────┬─────────┘
          ↓
┌───────────────────┐
│ R - Revision      │  Correct semantic drift
└─────────┬─────────┘
          ↓
┌───────────────────┐
│ C - Coherence     │  Multi-expert processing
└─────────┬─────────┘
          ↓
┌───────────────────┐
│ P - Propagation   │  Inject state into deeper layers
└─────────┬─────────┘
          ↓
 Output Hidden States
Components
| Module | Purpose |
|---|---|
| EncapsulationModule | GRU-style state management |
| AlignmentModule | Cross-attention consistency |
| RevisionModule | Drift detection & correction |
| CoherenceModule | Multi-expert ensemble |
| PropagationModule | Layer-wise state injection |
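The table describes propagation as layer-wise state injection but does not say how the state enters the hidden stream. One common pattern, shown here purely as an assumption, is a zero-initialized gated residual: project the state to the model width and add it to every position. `ToyStateInjection` and its `gate` parameter are hypothetical names, not SCLM's API.

```python
import torch
import torch.nn as nn


class ToyStateInjection(nn.Module):
    """Hypothetical sketch: add a projected latent state to every token position."""

    def __init__(self, d_model: int = 512, state_dim: int = 256):
        super().__init__()
        self.proj = nn.Linear(state_dim, d_model)
        # Zero-initialized gate: the layer starts as an identity mapping.
        self.gate = nn.Parameter(torch.zeros(1))

    def forward(self, hidden: torch.Tensor, state: torch.Tensor) -> torch.Tensor:
        # hidden: [batch, seq_len, d_model], state: [batch, state_dim]
        injected = self.proj(state).unsqueeze(1)  # [batch, 1, d_model], broadcast over seq_len
        return hidden + torch.tanh(self.gate) * injected


torch.manual_seed(0)
inject = ToyStateInjection()
hidden = torch.randn(2, 10, 512)
state = torch.randn(2, 256)
out = inject(hidden, state)
```

Starting the gate at zero means the injection has no effect until training moves it, a design choice that tends to keep early optimization stable.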
Configuration Options
from dataclasses import dataclass

@dataclass
class SCLMConfig:
    # Model architecture
    vocab_size: int = 50257
    max_seq_length: int = 512
    n_layers: int = 6
    n_heads: int = 8
    d_model: int = 512
    d_ff: int = 2048
    dropout: float = 0.1

    # SCLM-specific
    latent_state_dim: int = 256    # State dimension
    n_coherence_heads: int = 4     # Coherence attention heads
    n_experts: int = 4             # Number of experts
    propagation_depth: int = 3     # Propagation adapters

    # EARCP parameters
    eta_s: float = 5.0             # Coherence sensitivity
    w_min: float = 0.05            # Minimum expert weight

    # Layer placement
    earcp_every_n_layers: int = 2  # EARCP every N layers
    use_global_earcp: bool = True  # Global EARCP layer
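The configuration names a coherence sensitivity (`eta_s`) and a minimum expert weight (`w_min`) without giving the weighting formula. The sketch below shows one plausible reading, stated as an assumption: a softmax over `eta_s`-scaled coherence scores, mixed with a uniform floor so that no expert falls below `w_min`. The function `expert_weights` and the example scores are hypothetical.

```python
import math
from typing import List


def expert_weights(scores: List[float], eta_s: float = 5.0, w_min: float = 0.05) -> List[float]:
    """Map per-expert coherence scores to weights that sum to 1 and stay >= w_min."""
    n = len(scores)
    if n * w_min >= 1.0:
        raise ValueError("w_min is too large for this number of experts")
    # Softmax over eta_s * score; subtracting the max keeps exp() numerically stable.
    top = max(scores)
    exps = [math.exp(eta_s * (s - top)) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Reserve w_min for every expert and distribute the remainder by softmax share.
    return [w_min + (1.0 - n * w_min) * p for p in probs]


weights = expert_weights([0.9, 0.5, 0.2, 0.1], eta_s=5.0, w_min=0.05)
```

A larger `eta_s` concentrates weight on the most coherent expert, while the floor keeps the others from being switched off entirely.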
Pre-built Models
from sclm import create_sclm_small, create_sclm_medium, create_sclm_large
# ~45M parameters
model_small = create_sclm_small()
# ~125M parameters
model_medium = create_sclm_medium()
# ~350M parameters
model_large = create_sclm_large()
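To check the approximate parameter counts quoted above, a generic PyTorch helper is enough. The sketch below uses a small stand-in network so it runs without SCLM installed; `count_parameters` is a hypothetical helper name, and the same call should work on any `nn.Module`, including the models returned by the factory functions.

```python
import torch.nn as nn


def count_parameters(model: nn.Module, trainable_only: bool = True) -> int:
    """Count parameters, optionally restricted to those that receive gradients."""
    return sum(
        p.numel()
        for p in model.parameters()
        if p.requires_grad or not trainable_only
    )


# Stand-in model: embedding (100 * 16) + linear (16 * 100 weights + 100 biases).
toy = nn.Sequential(nn.Embedding(100, 16), nn.Linear(16, 100))
n_params = count_parameters(toy)
print(f"{n_params / 1e6:.4f}M parameters")
```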
Training Example
import torch
from sclm import SCLM, SCLMConfig

# Setup
config = SCLMConfig(vocab_size=50257)
model = SCLM(config).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# Training loop; `dataloader` is assumed to yield (input_ids, labels) batches
for batch in dataloader:
    input_ids, labels = batch
    input_ids, labels = input_ids.cuda(), labels.cuda()

    # Reset state for each sequence
    model.reset_state()

    # Forward
    output = model(input_ids, labels=labels)
    loss = output['loss']

    # Backward
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Log metrics
    metrics = output['global_metrics']
    print(f"Loss: {loss.item():.4f}, Coherence: {metrics['coherence']:.4f}")
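The training loop expects a `dataloader` that yields `(input_ids, labels)` pairs but does not define one. A minimal sketch with random token ids follows. It assumes, as in GPT-2 style models, that labels are a copy of the inputs and that any shifting happens inside the model; whether SCLM follows that convention is not stated here.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

vocab_size, seq_len, n_samples = 50257, 64, 32

# Random token ids stand in for a real tokenized corpus.
tokens = torch.randint(0, vocab_size, (n_samples, seq_len))

# Assumption: labels equal the inputs and the model shifts them internally.
dataset = TensorDataset(tokens, tokens.clone())
dataloader = DataLoader(dataset, batch_size=8, shuffle=True)

input_ids, labels = next(iter(dataloader))
print(input_ids.shape, labels.shape)
```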
Knowledge Distillation
import torch
import torch.nn.functional as F
from transformers import GPT2LMHeadModel
from sclm import SCLM, SCLMConfig

# Teacher model
teacher = GPT2LMHeadModel.from_pretrained('gpt2-large')
teacher.eval()

# Student (SCLM); vocab_size matches the GPT-2 teacher so the logits line up
config = SCLMConfig(vocab_size=50257)
student = SCLM(config)
optimizer = torch.optim.AdamW(student.parameters(), lr=1e-4)

# Distillation training
T = 2.0      # Temperature
alpha = 0.5  # Distillation weight

for batch in dataloader:
    input_ids, labels = batch

    # Student forward
    student.reset_state()
    student_out = student(input_ids, labels=labels)
    lm_loss = student_out['loss']

    # Teacher forward
    with torch.no_grad():
        teacher_logits = teacher(input_ids).logits

    # Distillation loss
    student_soft = F.log_softmax(student_out['logits'] / T, dim=-1)
    teacher_soft = F.softmax(teacher_logits / T, dim=-1)
    distill_loss = F.kl_div(student_soft, teacher_soft, reduction='batchmean') * T * T

    # Combined loss
    loss = (1 - alpha) * lm_loss + alpha * distill_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
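The distillation term can be isolated into a small function and checked on random tensors, without loading either model. This sketch differs from the loop above in one labeled way: it flattens the logits to `[batch * seq_len, vocab]` first, so that `reduction='batchmean'` averages per token rather than per sequence. The name `distillation_loss` is hypothetical.

```python
import torch
import torch.nn.functional as F


def distillation_loss(student_logits: torch.Tensor, teacher_logits: torch.Tensor, T: float = 2.0) -> torch.Tensor:
    """KL(teacher || student) on temperature-softened distributions, averaged per token."""
    vocab = student_logits.size(-1)
    # F.kl_div expects log-probabilities for the input and probabilities for the target.
    student_log_probs = F.log_softmax(student_logits.reshape(-1, vocab) / T, dim=-1)
    teacher_probs = F.softmax(teacher_logits.reshape(-1, vocab) / T, dim=-1)
    # Multiplying by T^2 keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(student_log_probs, teacher_probs, reduction='batchmean') * T * T


torch.manual_seed(0)
teacher_logits = torch.randn(2, 8, 100)
student_logits = torch.randn(2, 8, 100)
loss_same = distillation_loss(teacher_logits, teacher_logits)
loss_diff = distillation_loss(student_logits, teacher_logits)
```

The loss is close to zero when the student reproduces the teacher's logits and positive otherwise.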
Metrics
Access detailed metrics after a forward pass:
output = model(input_ids)

# Global EARCP metrics
global_metrics = output['global_metrics']
print(f"Coherence: {global_metrics['coherence']:.4f}")
print(f"Alignment: {global_metrics['alignment'].mean():.4f}")
print(f"Drift: {global_metrics['drift'].mean():.4f}")
print(f"State Norm: {global_metrics['state_norm']:.4f}")
print(f"Expert Weights: {global_metrics['weights']}")

# Per-block metrics
for i, block_metrics in enumerate(output['block_metrics']):
    print(f"Block {i}: coherence={block_metrics['coherence']:.4f}")
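Because some metrics are scalars and others are tensors that need `.mean()`, a small helper can reduce a metrics dictionary to plain floats before logging. This is a sketch under assumptions: `summarize_metrics` is a hypothetical helper, and the example dictionary uses made-up values shaped like the keys shown above.

```python
from typing import Any, Dict

import torch


def summarize_metrics(metrics: Dict[str, Any]) -> Dict[str, float]:
    """Reduce each metric to a float: tensors are averaged, numbers pass through."""
    summary: Dict[str, float] = {}
    for name, value in metrics.items():
        if isinstance(value, torch.Tensor):
            summary[name] = value.float().mean().item()
        elif isinstance(value, (int, float)):
            summary[name] = float(value)
        # Anything else (lists, nested dicts) is skipped rather than guessed at.
    return summary


# Hypothetical values for illustration only.
example = {
    'coherence': 0.93,
    'alignment': torch.tensor([0.8, 1.0]),
    'drift': torch.tensor([0.0, 0.5]),
}
summary = summarize_metrics(example)
```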
Research Applications
SCLM is designed for:
- Long-form generation with consistent characters and facts
- Document editing with local changes and global coherence
- Multi-turn dialogue with persistent context
- Story generation with entity tracking
- Code generation with variable consistency
Citation
@article{amega2025sclm,
  title={SCLM: Stateful Coherent Language Models},
  author={Amega, Mike},
  journal={arXiv preprint},
  year={2025},
  note={github.com/Volgat/sclm}
}
License
Proprietary Community License - see LICENSE for details.
- Community Use: Free for personal, research, and small business (< $100k revenue).
- Commercial Use: License required for larger entities and commercial SaaS products. See LICENSING.
Deployment
To publish a new version to PyPI:
- Update the version in setup.py.
- Create a new Release on GitHub.
- The GitHub Action will automatically build and publish the package.
Note: Requires a PYPI_API_TOKEN secret in the repository settings.
Contributing
Contributions welcome! Please read our Contributing Guide.
Contact
- Author: Mike Amega
- Email: contact@amewebstudio.com
- GitHub: @Volgat