
helix-substrate

Model weight compression and streaming decode library. Compress neural network weights into a compact format (CDNA), then run matrix operations directly from the compressed representation — without ever loading the full weight matrix into memory.

What it does

  1. CDNA Format — Quantize model weights into a 256-entry codebook + uint8 indices, with per-block brotli compression and SHA256 verification. A 14GB model becomes ~6.6GB.

  2. Streaming Block Decode — Compute Y = X @ W where W is stored in CDNA format. W is never fully loaded. Instead, blocks of rows are decompressed one at a time, multiplied against the corresponding slice of X, and accumulated. Typical memory savings: 8-16x per projection.

  3. Structural Entropy (Se) Routing — Measure tensor complexity via Se = H × U × D (entropy × unstructuredness × rank depth). The Se score maps to a compute routing decision: simple tensors → CPU, parallel tensors → GPU, complex unstructured tensors → QPU.

  4. Receipts — Every operation produces a tamper-evident receipt with SHA256 input/output hashes, timing, memory usage, and fidelity metrics. If you can't verify it, it didn't happen.
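The receipt idea in item 4 can be sketched in a few lines. This is an illustrative sketch, not the library's actual Receipt type: the field names (`input_sha256`, `output_sha256`) and helper functions here are assumptions; the core idea is hashing input and output bytes so later modification is detectable.

```python
import hashlib

import numpy as np

# Sketch of a tamper-evident receipt (field names are illustrative,
# not helix_substrate's actual schema).
def make_receipt(op: str, x: np.ndarray, y: np.ndarray) -> dict:
    return {
        "op": op,
        "input_sha256": hashlib.sha256(x.tobytes()).hexdigest(),
        "output_sha256": hashlib.sha256(y.tobytes()).hexdigest(),
        "in_shape": x.shape,
        "out_shape": y.shape,
    }

def verify_receipt(receipt: dict, x: np.ndarray, y: np.ndarray) -> bool:
    # Recompute both hashes; any change to x or y breaks the match.
    return (hashlib.sha256(x.tobytes()).hexdigest() == receipt["input_sha256"]
            and hashlib.sha256(y.tobytes()).hexdigest() == receipt["output_sha256"])

x = np.arange(6, dtype=np.float32).reshape(2, 3)
y = x * 2.0
receipt = make_receipt("scale", x, y)
print(verify_receipt(receipt, x, y))  # True: bytes match the receipt
y[0, 0] += 1.0
print(verify_receipt(receipt, x, y))  # False: output was tampered with
```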

Install

pip install helix-substrate

Or, from a source checkout:

pip install -e .

Required: numpy. Optional: brotli (for CDNAv2 block compression).

Quick start

Compress a tensor

import numpy as np
from helix_substrate import encode_tensor_to_cdna, decode_cdna_to_tensor

# Compress
W = np.random.randn(4096, 4096).astype(np.float32)
encode_tensor_to_cdna(W, "weight.cdna", tensor_name="my_layer")

# Decompress
W_decoded = decode_cdna_to_tensor("weight.cdna")
cos = float(W.ravel() @ W_decoded.ravel()) / (np.linalg.norm(W) * np.linalg.norm(W_decoded))
print(f"Cosine similarity: {cos:.6f}")
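Under the hood, encoding comes down to codebook quantization: a 256-entry codebook plus one uint8 index per weight. The sketch below uses a quantile-based codebook as a stand-in for the library's k-means, so the names and method here are illustrative, not the actual encoder.

```python
import numpy as np

# Illustrative sketch of CDNA-style quantization: 256 float values
# (the codebook) plus one byte per weight (the indices). The real
# encoder uses k-means; quantiles stand in here for brevity.
def quantize_to_codebook(W: np.ndarray, k: int = 256):
    flat = W.ravel()
    # Spread k representative values over the weight distribution.
    codebook = np.quantile(flat, np.linspace(0.0, 1.0, k)).astype(np.float32)
    # Map each weight to its nearest codebook entry (memory-heavy for
    # large W; a production encoder would do this blockwise).
    idx = np.abs(flat[:, None] - codebook[None, :]).argmin(axis=1)
    return codebook, idx.astype(np.uint8).reshape(W.shape)

def dequantize(codebook: np.ndarray, idx: np.ndarray) -> np.ndarray:
    return codebook[idx]

W = np.random.randn(64, 64).astype(np.float32)
codebook, idx = quantize_to_codebook(W)
W_hat = dequantize(codebook, idx)
print(idx.nbytes / W.nbytes)  # 0.25: one byte per weight instead of four
```

The 4x raw ratio is then improved further by per-block brotli compression of the index stream, which is where the ~14GB → ~6.6GB figure comes from.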

Measure tensor complexity

from helix_substrate import compute_tensor_se

result = compute_tensor_se(W)
print(f"Se={result['Se']:.3f} → route to {result['routing_hint']}")
# Se=0.42 → route to gpu

Streaming decode (CDNAv2)

from helix_substrate.stream_matmul import stream_xw_from_cdna

# Y = X @ W, but W is never fully loaded
X = np.random.randn(1, 256, 4096).astype(np.float32)
Y, receipt = stream_xw_from_cdna(X, "weight.cdna2.hxz")
print(f"Memory savings: {receipt.savings_factor:.1f}x")
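The core loop behind stream_xw_from_cdna works roughly as follows. This is a simplified sketch: uncompressed in-memory row blocks stand in for decompressed CDNA blocks, and the function name is illustrative.

```python
import numpy as np

# Sketch of streaming matmul: compute Y = X @ W without materializing W.
# Each block holds a contiguous range of W's rows; in CDNA these would be
# decompressed from disk one at a time.
def stream_matmul(X: np.ndarray, blocks: list, block_rows: int) -> np.ndarray:
    n_out = blocks[0].shape[1]
    Y = np.zeros(X.shape[:-1] + (n_out,), dtype=X.dtype)
    for i, block in enumerate(blocks):
        lo = i * block_rows
        hi = lo + block.shape[0]
        # Multiply the matching slice of X against this block of W's rows
        # and accumulate. Peak memory: one block, never all of W.
        Y += X[..., lo:hi] @ block
    return Y

W = np.random.randn(512, 128).astype(np.float32)
blocks = [W[i:i + 64] for i in range(0, 512, 64)]  # 8 blocks of 64 rows
X = np.random.randn(4, 512).astype(np.float32)
Y = stream_matmul(X, blocks, block_rows=64)
print(np.abs(Y - X @ W).max())  # small float32 accumulation error only
```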

Package structure

helix_substrate/
├── __init__.py          # Public API
├── cdna_encoder.py      # CDNA v1 encode/decode (k-means quantization)
├── cdna_reader.py       # CDNA v2 reader (block-indexed, brotli, SHA256)
├── sidecar.py           # HXZO outlier sidecar (high-precision corrections)
├── stream_matmul.py     # Core: Y = X @ W from CDNA (streaming, never loads W)
├── stream_attention.py  # Full attention layer (Q,K,V,O all streamed)
├── stream_ffn.py        # Full FFN layer (gate,up,down all streamed)
├── stream_block.py      # Full transformer block (attention + FFN + norms)
├── rope.py              # Rotary Position Embeddings
├── se.py                # Structural Entropy estimator and routing
└── receipt.py           # Tamper-evident execution receipts

The Se formula

Structural Entropy decomposes tensor complexity into three independent factors:

  • H (entropy): singular value spread. High H means energy is spread across many directions.
  • U (unstructuredness): neighbor coherence. High U means no spatial correlation between adjacent rows.
  • D (depth): effective rank ratio. High D means many dimensions matter.

Se = H × U × D produces a 0-1 score. The 2D routing policy uses (Se, C_struct) jointly:

  • Zone 1: Se < 0.30, structured → CPU
  • Zone 2: 0.30 ≤ Se < 0.70 → GPU
  • Zone 3: Se ≥ 0.70, unstructured → QPU
  • Zone 4: Se ≥ 0.70, structured → GPU
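The three factors can be estimated with standard tools. The sketch below is one plausible reading, not the verified formulas in se.py: spectral entropy for H, adjacent-row correlation for U, and a participation ratio for D are all assumptions, and the `route` helper implements only the simplified 1D thresholds, ignoring the C_struct axis.

```python
import numpy as np

# Illustrative estimators for Se = H * U * D; the library's exact
# formulas in se.py may differ.
def structural_entropy(W: np.ndarray) -> float:
    s = np.linalg.svd(W, compute_uv=False)
    p = s / s.sum()
    # H: normalized spectral entropy in [0, 1].
    H = -np.sum(p * np.log(p + 1e-12)) / np.log(len(p))
    # U: 1 - mean |correlation| between adjacent rows.
    rows = W - W.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(rows, axis=1)
    corr = np.sum(rows[:-1] * rows[1:], axis=1) / (norms[:-1] * norms[1:] + 1e-12)
    U = 1.0 - float(np.mean(np.abs(corr)))
    # D: participation ratio of singular values, divided by rank bound.
    D = (s.sum() ** 2 / np.sum(s ** 2)) / len(s)
    return float(H * U * D)

def route(se: float) -> str:
    # 1D simplification of the 2D (Se, C_struct) policy above.
    if se < 0.30:
        return "cpu"
    if se < 0.70:
        return "gpu"
    return "qpu"

se = structural_entropy(np.random.randn(256, 256).astype(np.float32))
print(f"Se={se:.3f} → {route(se)}")
```

Each factor lands in [0, 1] by construction, so their product does too.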

Inspiration

The mathematical patterns in this library draw from nature — Fibonacci sequences in the block structure, golden-ratio-inspired codebook initialization, and structural entropy as a measure of order vs chaos in weight matrices. The thesis: nature already solved the compression math we need, because it's the world's largest dataset.

License

MIT
