GrafoPropagation v26-APEX — Geometric von Mises-Fisher Networks with WordNet Pre-training
- Status: Proprietary / Confidential
- Author: Claudio Fernandes
- License: Proprietary (see LICENSE)
Overview
GrafoPropagation is a compact (~990k-parameter) text-classification architecture built on geometric von Mises-Fisher (vMF) attention. It is pre-trained on WordNet dictionary definitions and fine-tuned on AG News.
Key Innovations
| Component | Description |
|---|---|
| vMF Dual-Scale Attention | Queries/keys on the unit hypersphere with learnable concentration κ (local × global) |
| Asymmetric Q/K | Separate projections for queries and keys for improved expressivity |
| Riemannian Temporal Embedding | Log-Map + Parallel Transport on the sphere for position-aware dynamics |
| RoPE | Rotary Position Embedding on normalised direction vectors |
| Dynamic GrafoConnect | Learned cross-layer skip-connection graph modulated by curvature |
| Global Workspace Memory | Learnable broadcast slots (Global Workspace Theory) |
| System-2 Latent Search | Iterative branch-evaluate-merge with GumbelMCTS |
| Dictionary Pre-Training | Multi-label BCE over WordNet definitions |
| Quantum LR Modulation | 8-qubit PennyLane circuit for epoch-dependent LR scaling |
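To make the first row of the table concrete, here is a minimal sketch of vMF-style attention weights in plain Python. It is not the package's `VonMisesFisherAttention`; it only illustrates the general idea that, with queries and keys on the unit hypersphere, the weight of a key is proportional to exp(κ · cos(q, k)). The function names, the κ values, and the reading of "local × global" as a simple product of two concentrations are all assumptions.

```python
import math

def unit(v):
    """Project a vector onto the unit hypersphere (L2-normalise)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def vmf_attention_weights(query, keys, kappa_local=4.0, kappa_global=2.0):
    """Weights proportional to exp(kappa * cosine(query, key)).

    Assumption: the two scales combine multiplicatively into a single
    concentration, as the "local × global" wording suggests.
    """
    kappa = kappa_local * kappa_global
    q = unit(query)
    logits = [kappa * sum(a * b for a, b in zip(q, unit(k))) for k in keys]
    return softmax(logits)

# Keys that are aligned, orthogonal, and opposite to the query.
keys = [[2.0, 0.0, 0.0], [0.0, 3.0, 0.0], [-1.0, 0.0, 0.0]]
weights = vmf_attention_weights([1.0, 0.0, 0.0], keys)
print([round(w, 4) for w in weights])
```

Because both sides are normalised, only direction matters: rescaling a key leaves its weight unchanged, and a larger κ concentrates the weights on the best-aligned key.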
Installation
pip install .
This installs all dependencies automatically: PyTorch, PennyLane, HuggingFace datasets and tokenizers, NLTK, Rich, and NumPy.
For GPU-accelerated quantum simulation:
pip install ".[gpu]"
Quick Start
Python API
from grafopropagation import CFG, GrafoPropagation, run_training, build_or_load_tokenizer
# Default configuration (~990k params)
cfg = CFG()
result = run_training(cfg)
# Scale up the architecture
cfg = CFG(d_model=128, n_layers=4, n_heads=8, head_dim=32, d_ff=640)
result = run_training(cfg)
# Full control via dict
cfg = CFG.from_dict({
"d_model": 256,
"n_layers": 6,
"n_heads": 8,
"head_dim": 32,
"d_ff": 1024,
"dict_epochs": 500,
"epochs": 50,
})
print(f"Estimated parameters: {cfg.count_parameters():,}")
result = run_training(cfg)
CLI
# Default training
grafoprop-train
# Scale up architecture
grafoprop-train --d_model 128 --n_layers 4 --n_heads 8 --head_dim 32
# Custom training schedule
grafoprop-train --epochs 50 --batch_size 128 --base_lr_max 0.002
# Export/load config
grafoprop-train --export_config my_config.json
grafoprop-train --config my_config.json
# Disable quantum LR modulation
grafoprop-train --use_quantum_lr false
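The last command passes a boolean as the literal string `false`. How `grafoprop-train` parses it is not documented here; the sketch below shows one common way to accept such values with `argparse`. The parser name, the `str2bool` helper, and the defaults for `--base_lr_max` and `--use_quantum_lr` are placeholders, not the package's actual CLI.

```python
import argparse

def str2bool(value):
    """Map common true/false spellings to a bool for argparse."""
    lowered = value.strip().lower()
    if lowered in {"true", "1", "yes", "y", "on"}:
        return True
    if lowered in {"false", "0", "no", "n", "off"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")

def build_parser():
    """Hypothetical parser mirroring a few of the flags shown above."""
    parser = argparse.ArgumentParser(prog="grafoprop-train-sketch")
    parser.add_argument("--d_model", type=int, default=64)
    parser.add_argument("--epochs", type=int, default=30)
    # Placeholder defaults: the real defaults for these two flags are not stated.
    parser.add_argument("--base_lr_max", type=float, default=0.002)
    parser.add_argument("--use_quantum_lr", type=str2bool, default=True)
    return parser

args = build_parser().parse_args(["--epochs", "50", "--use_quantum_lr", "false"])
print(args.epochs, args.use_quantum_lr)
```

A dedicated converter is needed because `type=bool` would treat any non-empty string, including `"false"`, as `True`.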
Configuration
All parameters are exposed in the CFG dataclass. Key scaling dimensions:
| Parameter | Default | Description |
|---|---|---|
| `d_model` | 64 | Model hidden dimension |
| `n_layers` | 2 | Transformer layers |
| `n_heads` | 4 | Attention heads |
| `head_dim` | 16 | Per-head dimension |
| `d_ff` | 320 | Feed-forward inner dim |
| `K_think` | 2 | System-2 thought tokens |
| `memory_slots` | 6 | Global Workspace slots |
| `latent_actions` | 6 | World model actions |
| `mcts_simulations` | 12 | MCTS simulations per step |
| `dict_epochs` | 1500 | WordNet pre-training epochs |
| `epochs` | 30 | Fine-tuning epochs |
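As an illustration of how a dataclass-based configuration with dict and JSON round-tripping can work, here is a small stand-in. `SketchCFG` is hypothetical: its field defaults are copied from the table above, but its `from_dict` and `to_json` behaviour (for example, rejecting unknown keys) is an assumption and may differ from the real `CFG`.

```python
import json
from dataclasses import dataclass, asdict, fields

@dataclass
class SketchCFG:
    """Stand-in for CFG; defaults copied from the configuration table."""
    d_model: int = 64
    n_layers: int = 2
    n_heads: int = 4
    head_dim: int = 16
    d_ff: int = 320
    K_think: int = 2
    memory_slots: int = 6
    latent_actions: int = 6
    mcts_simulations: int = 12
    dict_epochs: int = 1500
    epochs: int = 30

    @classmethod
    def from_dict(cls, overrides):
        """Build a config from a dict, rejecting keys that are not fields."""
        known = {f.name for f in fields(cls)}
        unknown = sorted(set(overrides) - known)
        if unknown:
            raise KeyError(f"unknown config keys: {unknown}")
        return cls(**overrides)

    def to_json(self):
        """Serialise to a JSON string, e.g. for writing a config file."""
        return json.dumps(asdict(self), indent=2)

cfg = SketchCFG.from_dict({"d_model": 256, "n_layers": 6, "epochs": 50})
restored = SketchCFG.from_dict(json.loads(cfg.to_json()))
print(restored == cfg)
```

Failing loudly on unknown keys is one design choice among several; it catches typos such as `d_modle` early, at the cost of breaking configs written for a newer version.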
Scaling Recipes
| Target | Config |
|---|---|
| ~990k (default) | CFG() |
| ~3M | CFG(d_model=96, n_layers=3, n_heads=6, head_dim=24, d_ff=512) |
| ~7M | CFG(d_model=128, n_layers=4, n_heads=8, head_dim=32, d_ff=640) |
| ~15M | CFG(d_model=192, n_layers=6, n_heads=8, head_dim=48, d_ff=1024) |
| ~30M | CFG(d_model=256, n_layers=8, n_heads=8, head_dim=64, d_ff=1280) |
Training Results (Default ~990k)
| Epoch | Train Acc | Val Acc |
|---|---|---|
| 1 | 67.2% | 25.6% |
| 3 | 90.6% | 84.4% |
| 5 | 92.6% | 91.1% |
| 10 | 94.9% | 92.7% |
| 13 | 95.9% | 93.1% |
| 15 | 96.4% | 93.1% |
Pre-training: 1500 epochs on WordNet definitions, final dict loss: 0.04481
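The dictionary pre-training objective is described as multi-label BCE. As a reference for what that loss computes, here is a minimal, numerically stable version in plain Python. The four-label target is a toy placeholder; how the package builds targets from WordNet definitions is not specified here.

```python
import math

def bce_with_logits(logits, targets):
    """Mean binary cross-entropy over independent labels, from raw logits.

    Uses the stable form max(z, 0) - z*y + log(1 + exp(-|z|)), which equals
    -[y*log(sigmoid(z)) + (1-y)*log(1-sigmoid(z))] without overflow.
    """
    total = 0.0
    for z, y in zip(logits, targets):
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(logits)

# Toy multi-hot target: labels 0 and 3 are active (placeholder data).
targets = [1.0, 0.0, 0.0, 1.0]
uninformed = bce_with_logits([0.0, 0.0, 0.0, 0.0], targets)
confident = bce_with_logits([6.0, -6.0, -6.0, 6.0], targets)
print(f"uninformed={uninformed:.4f} confident={confident:.4f}")
```

With all-zero logits every label is predicted at probability 0.5, so the loss equals ln 2; it falls toward zero as the logits move confidently in the direction of the targets.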
Architecture
Input IDs
│
├── Character-Seeded Token Embedding (Fibonacci sphere init)
│
├── Temporal Transition Embedding (Log-Map + Parallel Transport)
│
├── Local Depthwise Conv Mixer
│
├── Global Workspace Memory (prepend slots)
│
├── ×N TransformerBlock
│ ├── vMF Dual-Scale Attention (RoPE + κ gating)
│ └── SwiGLU FFN (stochastic depth)
│
│ └── Dynamic GrafoConnect (cross-layer skips)
│
├── System-2 Latent Search
│ ├── Branch-Evaluate-Merge iterations
│ └── Gumbel MCTS (soft search / hard search)
│
├── Pooling Fusion (gated think + seq_avg)
│
├── Multi-Sample Dropout Head → class logits
│
└── Multi-Label Head → dict logits (pre-training)
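The Temporal Transition Embedding step refers to the log map and parallel transport on the sphere. The sketch below implements the standard closed-form versions of both for the unit sphere. It says nothing about how `TemporalTransitionEmbedding` combines them; the function names and the example points are illustrative only.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def log_map(p, q):
    """Tangent vector at p pointing toward q; its length is the geodesic distance."""
    c = max(-1.0, min(1.0, dot(p, q)))
    theta = math.acos(c)
    ortho = [qk - c * pk for pk, qk in zip(p, q)]  # part of q orthogonal to p
    norm = math.sqrt(dot(ortho, ortho))
    if norm < 1e-12:  # p == q, or antipodal points where the log map is undefined
        return [0.0] * len(p)
    return [theta * o / norm for o in ortho]

def parallel_transport(p, q, v):
    """Move tangent vector v from p to q along the connecting geodesic."""
    lpq, lqp = log_map(p, q), log_map(q, p)
    dist_sq = dot(lpq, lpq)
    if dist_sq < 1e-24:  # no movement, nothing to transport
        return list(v)
    coef = dot(lpq, v) / dist_sq
    return [vk - coef * (a + b) for vk, a, b in zip(v, lpq, lqp)]

p, q = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]
step = log_map(p, q)                      # direction of travel at p
moved = parallel_transport(p, q, step)    # same direction, expressed at q
print([round(x, 4) for x in step], [round(x, 4) for x in moved])
```

Two properties are worth checking in any implementation: the transported vector is tangent at the destination (orthogonal to `q`), and its length is unchanged.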
File Structure
grafopropagation/
├── __init__.py # Package exports
├── config.py # CFG dataclass (fully configurable)
├── primitives.py # RMSNorm, trunc_normal, char embeddings
├── positional.py # RoPE, TemporalTransitionEmbedding
├── attention.py # VonMisesFisherAttention
├── transformer.py # LocalConvMix, SwiGLU, TransformerBlock
├── memory.py # GlobalWorkspaceMemory, DynamicGrafoConnect
├── system2.py # WorldModel, PolicyValueHead, GumbelMCTS, System2LatentSearch
├── heads.py # PoolingFusion, MultiSampleDropoutHead, MultiLabelHead
├── model.py # GrafoPropagation (main model)
├── datasets.py # TextDataset, DictionaryDataset, WordNet builder
├── tokenizer_utils.py # BPE tokenizer build/load
├── losses.py # focal_ce, token_dropout
├── optimizer.py # EMA, Lookahead, AWP, GC, WarmupCosineLR
├── quantum.py # Quantum LR modulation (PennyLane)
├── logging_utils.py # Rich-based logging
├── train.py # Full training pipeline
├── cli.py # CLI entry point
└── py.typed # PEP 561 marker
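`optimizer.py` lists a `WarmupCosineLR` scheduler and `quantum.py` an epoch-dependent LR scaling. The sketch below shows the usual shape of a warmup-plus-cosine schedule with an external multiplicative factor. The function names, `warmup_steps`, and `lr_min` are assumptions, and `scale` merely stands in for whatever factor the PennyLane circuit produces, which is not reproduced here.

```python
import math

def warmup_cosine_lr(step, total_steps, lr_max=0.002, warmup_steps=100, lr_min=0.0):
    """Linear warmup to lr_max, then cosine decay to lr_min."""
    if step < warmup_steps:
        return lr_max * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))

def modulated_lr(step, total_steps, scale=1.0, **kwargs):
    """Apply an external multiplicative factor to the base schedule."""
    return scale * warmup_cosine_lr(step, total_steps, **kwargs)

total = 1000
for s in (0, 99, 550, 1000):
    print(s, f"{modulated_lr(s, total, scale=0.9):.6f}")
```

Keeping the modulation as a separate multiplier means the base schedule can be tested on its own and the extra factor switched off by setting `scale=1.0`.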
Dependencies
- Python >= 3.9
- PyTorch >= 2.0
- PennyLane >= 0.33
- HuggingFace datasets >= 2.14
- HuggingFace tokenizers >= 0.14
- NLTK >= 3.8
- Rich >= 13.0
- NumPy >= 1.24
File details
Details for the file grafopropagation-0.26.0.tar.gz.
File metadata
- Download URL: grafopropagation-0.26.0.tar.gz
- Upload date:
- Size: 37.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `39531931ac6b1f64faeb10f531e3ac199dbf83c34800af2141d7dd70f7955f2c` |
| MD5 | `3730785fac54031aa8919bf70b2bbf75` |
| BLAKE2b-256 | `a428c86e9d9d0ca64dcb211f3ac820ace7f536c9bfae158c66227a81630983e3` |
File details
Details for the file grafopropagation-0.26.0-py3-none-any.whl.
File metadata
- Download URL: grafopropagation-0.26.0-py3-none-any.whl
- Upload date:
- Size: 43.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `a904386a9fc7bd83b64f7d6c046ccbd42f7b3401769067499a95d04841f5f4e5` |
| MD5 | `430a65f937a5c8351453fc8597ddf45b` |
| BLAKE2b-256 | `3501b52edb35c1a259102c9eed21274debadcdeefbd595bc34d796083a33b439` |