helix-substrate
Model weight compression and streaming decode library. Compress neural network weights into a compact format (CDNA), then run matrix operations directly from the compressed representation — without ever loading the full weight matrix into memory.
What it does
- CDNA Format — Quantize model weights into a 256-entry codebook + uint8 indices, with per-block brotli compression and SHA256 verification. A 14GB model becomes ~6.6GB.
- Streaming Block Decode — Compute Y = X @ W where W is stored in CDNA format. W is never fully loaded. Instead, blocks of rows are decompressed one at a time, multiplied against the corresponding slice of X, and accumulated. Typical memory savings: 8-16x per projection.
- Structural Entropy (Se) Routing — Measure tensor complexity via Se = H × U × D (entropy × unstructuredness × rank depth). The Se score maps to a compute routing decision: simple tensors → CPU, parallel tensors → GPU, complex unstructured tensors → QPU.
- Receipts — Every operation produces a tamper-evident receipt with SHA256 input/output hashes, timing, memory usage, and fidelity metrics. If you can't verify it, it didn't happen.
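The codebook idea can be sketched in plain NumPy. This is an illustrative quantizer (quantile-based, rather than the k-means initialization the actual encoder uses) and is not the library's `encode_tensor_to_cdna`:

```python
import numpy as np

def quantize_to_codebook(W: np.ndarray, k: int = 256):
    """Toy codebook quantization: map each float32 weight to the
    nearest of k representative values, stored as uint8 indices."""
    flat = W.ravel()
    # Representatives at evenly spaced quantiles (a real encoder uses k-means)
    codebook = np.quantile(flat, np.linspace(0.0, 1.0, k)).astype(np.float32)
    # Nearest-codeword assignment via searchsorted on the sorted codebook
    pos = np.searchsorted(codebook, flat).clip(1, k - 1)
    left, right = codebook[pos - 1], codebook[pos]
    indices = np.where(flat - left <= right - flat, pos - 1, pos).astype(np.uint8)
    return codebook, indices.reshape(W.shape)

def dequantize(codebook: np.ndarray, indices: np.ndarray) -> np.ndarray:
    # Reconstruction is a single gather: one byte per weight + 1KB codebook
    return codebook[indices]

W = np.random.randn(64, 64).astype(np.float32)
cb, idx = quantize_to_codebook(W)
W_hat = dequantize(cb, idx)
# uint8 indices cost 1 byte/weight vs 4 for float32 → ~4x before brotli
```

The brotli pass in the real format then compresses the index blocks further, which is where the remaining savings (14GB → ~6.6GB) come from.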
Install
pip install -e .
Required: numpy. Optional: brotli (for CDNAv2 block compression).
Quick start
Compress a tensor
import numpy as np
from helix_substrate import encode_tensor_to_cdna, decode_cdna_to_tensor
# Compress
W = np.random.randn(4096, 4096).astype(np.float32)
encode_tensor_to_cdna(W, "weight.cdna", tensor_name="my_layer")
# Decompress
W_decoded = decode_cdna_to_tensor("weight.cdna")
print(f"Cosine similarity: {np.dot(W.ravel(), W_decoded.ravel()) / (np.linalg.norm(W) * np.linalg.norm(W_decoded)):.6f}")
Measure tensor complexity
from helix_substrate import compute_tensor_se
result = compute_tensor_se(W)
print(f"Se={result['Se']:.3f} → route to {result['routing_hint']}")
# Se=0.42 → route to gpu
Streaming decode (CDNAv2)
from helix_substrate.stream_matmul import stream_xw_from_cdna
# Y = X @ W, but W is never fully loaded
X = np.random.randn(1, 256, 4096).astype(np.float32)
Y, receipt = stream_xw_from_cdna(X, "weight.cdna2.hxz")
print(f"Memory savings: {receipt.savings_factor:.1f}x")
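The accumulation loop behind streaming decode can be sketched as follows. Here `blocks` is a stand-in for the decompress-one-block-at-a-time CDNA reader, and all names are illustrative rather than the library's internals:

```python
import numpy as np

def stream_matmul(X: np.ndarray, blocks, block_rows: int, d_out: int) -> np.ndarray:
    """Compute Y = X @ W without materializing W: `blocks` yields
    decompressed row-blocks of W (each at most (block_rows, d_out)) in order."""
    Y = np.zeros(X.shape[:-1] + (d_out,), dtype=X.dtype)
    for i, W_block in enumerate(blocks):
        r0 = i * block_rows
        # Rows r0:r0+rows of W pair with the same input-feature slice of X
        Y += X[..., r0:r0 + W_block.shape[0]] @ W_block
    return Y

# Demo with an in-memory "compressed" store (just a list of blocks here);
# the real reader would decompress each block from disk on demand.
d_in, d_out, block_rows = 512, 256, 128
W = np.random.randn(d_in, d_out).astype(np.float32)
blocks = [W[i:i + block_rows] for i in range(0, d_in, block_rows)]
X = np.random.randn(2, 16, d_in).astype(np.float32)
Y = stream_matmul(X, blocks, block_rows, d_out)
```

Peak memory holds one block of W at a time instead of the whole matrix, which is the source of the reported 8-16x savings per projection.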
Package structure
helix_substrate/
├── __init__.py # Public API
├── cdna_encoder.py # CDNA v1 encode/decode (k-means quantization)
├── cdna_reader.py # CDNA v2 reader (block-indexed, brotli, SHA256)
├── sidecar.py # HXZO outlier sidecar (high-precision corrections)
├── stream_matmul.py # Core: Y = X @ W from CDNA (streaming, never loads W)
├── stream_attention.py # Full attention layer (Q,K,V,O all streamed)
├── stream_ffn.py # Full FFN layer (gate,up,down all streamed)
├── stream_block.py # Full transformer block (attention + FFN + norms)
├── rope.py # Rotary Position Embeddings
├── se.py # Structural Entropy estimator and routing
└── receipt.py # Tamper-evident execution receipts
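The receipt idea from `receipt.py` pairs each operation with hashes of its inputs and outputs. A minimal sketch of the concept, with hypothetical names (not the library's actual receipt class):

```python
import hashlib
import time
from dataclasses import dataclass

import numpy as np

def tensor_digest(a: np.ndarray) -> str:
    # Hash the raw bytes; ascontiguousarray makes tobytes() deterministic
    return hashlib.sha256(np.ascontiguousarray(a).tobytes()).hexdigest()

@dataclass
class Receipt:
    op: str
    input_sha256: str
    output_sha256: str
    wall_time_s: float

def run_with_receipt(op_name, fn, x):
    t0 = time.perf_counter()
    y = fn(x)
    return y, Receipt(op_name, tensor_digest(x), tensor_digest(y),
                      time.perf_counter() - t0)

x = np.arange(8, dtype=np.float32)
y, rcpt = run_with_receipt("double", lambda a: a * 2, x)
# Re-hashing the tensors later detects tampering: digests won't match
```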
The Se formula
Structural Entropy decomposes tensor complexity into three independent factors:
| Component | Measures | High means |
|---|---|---|
| H (entropy) | Singular value spread | Energy spread across many directions |
| U (unstructuredness) | Neighbor coherence | No spatial correlation between adjacent rows |
| D (depth) | Effective rank ratio | Many dimensions matter |
Se = H × U × D produces a 0-1 score. The 2D routing policy uses (Se, C_struct) jointly:
- Zone 1: Se < 0.30, structured → CPU
- Zone 2: 0.30 ≤ Se < 0.70 → GPU
- Zone 3: Se ≥ 0.70, unstructured → QPU
- Zone 4: Se ≥ 0.70, structured → GPU
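A toy estimate of the three factors, and the zone policy above, might look like the following. This is illustrative only (the routing of low-Se unstructured tensors is not specified above, so it defaults to GPU here), not the library's `se.py`:

```python
import numpy as np

def structural_entropy(W: np.ndarray) -> float:
    """Toy Se = H × U × D: singular-value entropy, row-neighbor
    decorrelation, and effective-rank ratio, each scaled to 0..1."""
    s = np.linalg.svd(W, compute_uv=False)
    p = s / s.sum()
    H = float(-(p * np.log(p + 1e-12)).sum() / np.log(len(p)))
    # Mean |cosine| between adjacent rows measures coherence; U = 1 - coherence
    A, B = W[:-1], W[1:]
    num = (A * B).sum(axis=1)
    den = np.linalg.norm(A, axis=1) * np.linalg.norm(B, axis=1) + 1e-12
    U = float(1.0 - np.abs(num / den).mean())
    # Participation ratio over matrix rank as an effective-rank fraction
    D = float((s.sum() ** 2 / (s ** 2).sum()) / len(s))
    return H * U * D

def route(se: float, structured: bool) -> str:
    if se >= 0.70:                     # Zones 3 and 4
        return "gpu" if structured else "qpu"
    if se < 0.30 and structured:       # Zone 1
        return "cpu"
    return "gpu"                       # Zone 2 (and the unspecified low-Se case)

W = np.random.randn(128, 128).astype(np.float32)
se = structural_entropy(W)
```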
Inspiration
The mathematical patterns in this library draw from nature — Fibonacci sequences in the block structure, golden-ratio-inspired codebook initialization, and structural entropy as a measure of order vs chaos in weight matrices. The thesis: nature already solved the compression math we need, because it's the world's largest dataset.
License
MIT