# Quatrix — Q-Compass Architecture

RL-grounded sequence mixing for text, vision, audio, and world modeling.
> "Where transformers retrieve by similarity, Quatrix navigates by value."
Quatrix is a novel neural architecture that replaces standard multi-head attention with Q-Compass — a sequence mixing mechanism grounded in reinforcement learning theory rather than geometric similarity.
Built by Syed Abdur Rehman Ali (@Abd0r).
Paper: Q-Compass: Grounding Sequence Mixing in Reinforcement Learning Navigation — Zenodo, March 2026.
## Core Idea: Q-Compass

Standard attention computes:

```python
Attention(Q, K, V) = softmax(Q @ K.T / sqrt(d_k)) @ V
```
Four projections (W_Q, W_K, W_V, W_O). Similarity-based routing — attends to what looks similar, retrieves a projected transform of it.
Q-Compass computes:

```python
state  = x @ W_s                              # "Where am I?"
action = x @ W_a                              # "Where can I go?"
Q      = softmax(state @ action.T / sqrt(r))  # Q(s, a)
output = W_o(Q @ x)                           # gather from x directly — no W_V
```
Three projections (W_s, W_a, W_o). Value-based routing — asks "in state s, how valuable is attending to position a?"
The key removal: No W_V. Content is gathered directly from x, unchanged. All routing intelligence lives in Q(s,a). This forces the model to learn precise navigation rather than compensating for imprecise attention with a learned content transform.
At H=512, r=64: standard attention uses 4·H² = 1,048,576 parameters per layer, while Q-Compass uses 2·H·r + H² = 327,680 — a 69% reduction in attention-block parameters.
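The parameter counts above can be checked directly. This sketch assumes W_o is a full H×H projection and W_s, W_a are H×r, as the formulas imply:

```python
# Parameter counts per mixing layer, at the paper's H=512, r=64 setting.
H, r = 512, 64

# Standard multi-head attention: four H x H projections (W_Q, W_K, W_V, W_O).
attn_params = 4 * H * H

# Q-Compass: two low-rank projections (W_s, W_a: H x r) plus one output
# projection (W_o: H x H); W_V is removed entirely.
qcompass_params = 2 * H * r + H * H

print(attn_params)                         # 1048576
print(qcompass_params)                     # 327680
print(1 - qcompass_params / attn_params)   # 0.6875, i.e. ~69% fewer parameters
```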
The same block — with or without a causal mask — handles both autoregressive text generation (Q-Compass) and bidirectional image encoding (Q-Compass-Bi). One mechanism, all modalities.
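A minimal NumPy sketch of that shared block, illustrative only: the released `QCompass` module may differ in details such as normalization, multi-head structure, and initialization. The `causal` flag is the only difference between the autoregressive and bidirectional variants:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def q_compass(x, W_s, W_a, W_o, causal=False):
    """Route by Q(s, a); gather content from x directly (no W_V)."""
    L, _ = x.shape
    r = W_s.shape[1]
    state  = x @ W_s                      # [L, r]  "where am I?"
    action = x @ W_a                      # [L, r]  "where can I go?"
    scores = state @ action.T / np.sqrt(r)
    if causal:                            # autoregressive variant (Q-Compass)
        mask = np.tril(np.ones((L, L), dtype=bool))
        scores = np.where(mask, scores, -np.inf)
    Q = softmax(scores)                   # Q(s, a) routing weights
    return (Q @ x) @ W_o                  # content is x itself, transformed only at output

rng = np.random.default_rng(0)
L, H, r = 10, 512, 64
x = rng.standard_normal((L, H))
W_s = rng.standard_normal((H, r))
W_a = rng.standard_normal((H, r))
W_o = rng.standard_normal((H, H))

causal_out = q_compass(x, W_s, W_a, W_o, causal=True)   # text-style block
bi_out     = q_compass(x, W_s, W_a, W_o, causal=False)  # vision-style block
print(causal_out.shape, bi_out.shape)                   # (10, 512) (10, 512)
```

With the causal mask, position 0 can only attend to itself, so its output reduces to `x[0] @ W_o` — a quick sanity check that the routing is doing what the mask says.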
## Architecture

### QuatrixLM (language model)

```
QuatrixLM
├── Token + Positional Embeddings
├── N × QuatrixBlock
│   ├── LayerNorm → QCompass (causal) → residual
│   └── LayerNorm → FFN (GELU) → residual
├── LayerNorm
└── Output Head (tied to embeddings)
```

### QuatrixVision (image encoder)

```
QuatrixVision
├── Conv2d patch embedding (16×16 patches → 196 patches per 224×224 image)
├── Positional embeddings
├── M × QCompassBi blocks (bidirectional, no causal mask)
├── LayerNorm
└── Linear projection → LM hidden dim
```
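The 196-token figure follows from the patch geometry: (224 / 16)² = 14² = 196. A shapes-only NumPy sketch of the patchify step (not the actual Conv2d module):

```python
import numpy as np

# A 224x224 RGB image split into non-overlapping 16x16 patches.
img = np.zeros((3, 224, 224))
P = 16
n = 224 // P                                  # 14 patches along each axis

# (3, 14, 16, 14, 16) -> (14, 14, 3, 16, 16) -> (196, 768)
patches = img.reshape(3, n, P, n, P)
patches = patches.transpose(1, 3, 0, 2, 4).reshape(n * n, 3 * P * P)
print(patches.shape)                          # (196, 768): 196 patch tokens
```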
## Modality Support

| Modality | Module | Status |
|---|---|---|
| Text | QuatrixLM | Production |
| Vision | QuatrixVision | Production |
| Audio | QuatrixAudio | Production |
| World Model | QuatrixWorld | Production |
## Quick Start

```shell
pip install quatrix
```

```python
import torch
from quatrix import QuatrixLM, QuatrixConfig

cfg = QuatrixConfig(
    vocab_size=50257,
    hidden_size=512,
    num_layers=7,
    max_seq_len=5120,
    q_rank=64,
    use_vision=True,
)
model = QuatrixLM(cfg)  # ~50M params

# Text-only forward pass
input_ids = torch.randint(0, 50257, (1, 10))
out = model(input_ids)
logits = out['logits']  # [B, L, vocab_size]

# Multimodal: pass pixel values alongside token ids
pixel_values = torch.randn(1, 3, 224, 224)
out = model(input_ids, pixel_values=pixel_values)
```
## Berry-Q0 — First Quatrix Model
Berry-Q0 is the first model trained on the Quatrix architecture.
| Property | Value |
|---|---|
| Architecture | QuatrixLM + QuatrixVision |
| Parameters | ~50M (44M LM + 5.5M Vision + 0.4M projection) |
| Context | 5120 tokens |
| Modalities | Text + Image |
| Training hardware | Single RTX 4050 6GB laptop GPU |
| Text data | ~3.2M samples (web, math, code, reasoning, instruction, alignment) |
| Image data | ~550K image-text pairs (VQAv2, GQA, TextVQA, DocVQA, ScienceQA, CLEVR) |
| Status | GRPO reasoning training in progress |
Trained from scratch in three stages: pretraining on ~3.2M mixed text + image samples, supervised finetuning on instruction and reasoning data, and ongoing GRPO reasoning training (R1-style, math domain). Empirical results will be reported in a follow-up paper once training is complete.
## Roadmap
| Model | Modalities | Status |
|---|---|---|
| Berry-Q0 | Text + Vision | GRPO training in progress |
| Berry-Q1 | Text + Vision + Audio + World Model | Future work |
## Paper

If you use Quatrix or Q-Compass in your work, please cite:

> Syed Abdur Rehman Ali. *Q-Compass: Grounding Sequence Mixing in Reinforcement Learning Navigation.* Zenodo, March 2026. https://zenodo.org/records/19104202
## Author

Syed Abdur Rehman Ali

## License

OpenRAIL-M — open use with behavioral restrictions (no military use, no mass surveillance). See LICENSE for details.