Q-Compass architecture — RL-grounded sequence mixing for text, vision, audio, and world modeling

Quatrix — Q-Compass Architecture

"Where transformers retrieve by similarity, Quatrix navigates by value."

Quatrix is a novel neural architecture that replaces standard multi-head attention with Q-Compass — a sequence mixing mechanism grounded in reinforcement learning theory rather than geometric similarity.

Built by Syed Abdur Rehman Ali (@Abd0r).

Paper: Q-Compass: Grounding Sequence Mixing in Reinforcement Learning Navigation — Zenodo, March 2026.


Core Idea: Q-Compass

Standard attention computes:

Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) @ V

Four projections (W_Q, W_K, W_V, W_O). Similarity-based routing — attends to what looks similar, retrieves a projected transform of it.

Q-Compass computes:

state  = x @ W_s          # "Where am I?"
action = x @ W_a          # "Where can I go?"
Q(s,a) = softmax(state @ action.T / sqrt(r))
output = W_o(Q(s,a) @ x)  # gather from x directly — no W_V

Three projections (W_s, W_a, W_o). Value-based routing — asks "in state s, how valuable is attending to position a?"

The key removal: No W_V. Content is gathered directly from x, unchanged. All routing intelligence lives in Q(s,a). This forces the model to learn precise navigation rather than compensating for imprecise attention with a learned content transform.
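As a concrete illustration, here is a minimal single-head NumPy sketch of the mixing step, following the pseudocode above. The weight names W_s, W_a, W_o come from the pseudocode; the function name, initialization scale, and the causal-mask flag (mirroring the autoregressive variant) are illustrative choices, not the package's actual API.

```python
import numpy as np

def q_compass(x, W_s, W_a, W_o, causal=False):
    """Q-Compass mixing: route by learned state-action values Q(s, a),
    then gather content directly from x (no W_V)."""
    L, H = x.shape
    r = W_s.shape[1]
    state = x @ W_s                           # [L, r]  "Where am I?"
    action = x @ W_a                          # [L, r]  "Where can I go?"
    scores = state @ action.T / np.sqrt(r)    # [L, L]  Q(s, a) logits
    if causal:
        scores = np.where(np.tril(np.ones((L, L), bool)), scores, -np.inf)
    Q = np.exp(scores - scores.max(axis=-1, keepdims=True))
    Q /= Q.sum(axis=-1, keepdims=True)        # row-wise softmax
    return (Q @ x) @ W_o                      # gather from x, then project out

rng = np.random.default_rng(0)
L, H, r = 10, 512, 64
x = rng.standard_normal((L, H))
W_s = rng.standard_normal((H, r)) * 0.02
W_a = rng.standard_normal((H, r)) * 0.02
W_o = rng.standard_normal((H, H)) * 0.02
out = q_compass(x, W_s, W_a, W_o, causal=True)
print(out.shape)  # (10, 512)
```

With the causal mask on, position i can only gather from positions j <= i, so perturbing a later token leaves earlier outputs unchanged.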

At H=512, r=64: standard attention uses 1,048,576 parameters per layer. Q-Compass uses 327,680 — a 69% reduction in attention-block parameters.
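The counts above can be verified directly (per-layer attention-block weights only, biases ignored):

```python
H, r = 512, 64

# Standard attention: W_Q, W_K, W_V, W_O, each H x H
standard = 4 * H * H

# Q-Compass: W_s and W_a are H x r, W_o is H x H
compass = 2 * H * r + H * H

print(standard)  # 1048576
print(compass)   # 327680
print(round(100 * (1 - compass / standard)))  # 69 (% reduction)
```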

The same block — with or without a causal mask — handles both autoregressive text generation (Q-Compass) and bidirectional image encoding (Q-Compass-Bi). One mechanism, all modalities.


Architecture

QuatrixLM (language model)
├── Token + Positional Embeddings
├── N × QuatrixBlock
│   ├── LayerNorm → QCompass (causal) → residual
│   └── LayerNorm → FFN (GELU) → residual
├── LayerNorm
└── Output Head (tied to embeddings)

QuatrixVision (image encoder)
├── Conv2d patch embedding (16×16 patches → 196 patches per 224×224 image)
├── Positional embeddings
├── M × QCompassBi blocks (bidirectional, no causal mask)
├── LayerNorm
└── Linear projection → LM hidden dim
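The repeated block in both trees is a standard pre-LayerNorm residual block. Below is a minimal NumPy sketch of that structure; the mixer is passed in as a function, and the stand-in mixer used here is a plain linear map rather than the actual QCompass module, so this only illustrates the LayerNorm/residual wiring.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def gelu(x):
    # tanh approximation of GELU
    return 0.5 * x * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))

def quatrix_block(x, mix_fn, W1, W2):
    """One block: LayerNorm -> mixer -> residual, LayerNorm -> FFN -> residual."""
    x = x + mix_fn(layer_norm(x))           # sequence mixing sub-layer
    x = x + gelu(layer_norm(x) @ W1) @ W2   # position-wise FFN sub-layer
    return x

rng = np.random.default_rng(1)
L, H = 8, 64
x = rng.standard_normal((L, H))
Wm = rng.standard_normal((H, H)) * 0.02     # stand-in linear mixer weights
W1 = rng.standard_normal((H, 4 * H)) * 0.02
W2 = rng.standard_normal((4 * H, H)) * 0.02
out = quatrix_block(x, lambda z: z @ Wm, W1, W2)
print(out.shape)  # (8, 64)
```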

Modality Support

Modality      Module          Status
Text          QuatrixLM       Production
Vision        QuatrixVision   Production
Audio         QuatrixAudio    Production
World Model   QuatrixWorld    Production

Quick Start

pip install quatrix

from quatrix import QuatrixLM, QuatrixConfig

cfg = QuatrixConfig(
    vocab_size=50257,
    hidden_size=512,
    num_layers=7,
    max_seq_len=5120,
    q_rank=64,
    use_vision=True,
)
model = QuatrixLM(cfg)  # ~50M params

import torch
input_ids = torch.randint(0, 50257, (1, 10))
out = model(input_ids)
logits = out['logits']  # [B, L, vocab_size]

# Multimodal
pixel_values = torch.randn(1, 3, 224, 224)
out = model(input_ids, pixel_values=pixel_values)

Berry-Q0 — First Quatrix Model

Berry-Q0 is the first model trained on the Quatrix architecture.

Property            Value
Architecture        QuatrixLM + QuatrixVision
Parameters          ~50M (44M LM + 5.5M Vision + 0.4M projection)
Context             5120 tokens
Modalities          Text + Image
Training hardware   Single RTX 4050 6GB laptop GPU
Text data           ~3.2M samples (web, math, code, reasoning, instruction, alignment)
Image data          ~550K image-text pairs (VQAv2, GQA, TextVQA, DocVQA, ScienceQA, CLEVR)
Status              GRPO reasoning training in progress

Trained from scratch in three stages: pretraining on ~3.2M text samples and ~550K image-text pairs, supervised finetuning on instruction and reasoning data, and ongoing GRPO reasoning training (R1-style, math domain). Empirical results will be reported in a follow-up paper once training completes.


Roadmap

Model      Modalities                            Status
Berry-Q0   Text + Vision                         GRPO training in progress
Berry-Q1   Text + Vision + Audio + World Model   Future work

Paper

If you use Quatrix or Q-Compass in your work, please cite:

Syed Abdur Rehman Ali. Q-Compass: Grounding Sequence Mixing in
Reinforcement Learning Navigation. Zenodo, March 2026.
https://zenodo.org/records/19104202

Author

Syed Abdur Rehman Ali

GitHub · HuggingFace · X


License

OpenRAIL-M — open use with behavioral restrictions (no military use, no mass surveillance). See LICENSE for details.
